Accessing Spark 2 Web UIs from Cloudera Data Science Workbench

Spark 2 provides a web interface for each running Spark application, which lets you monitor and track the progress of executions in real time. There is one web UI for each Spark application driver running in Cloudera Data Science Workbench. To access this page, point your browser to spark-<session_ID>.<domain>, where session_ID is the ID of the Cloudera Data Science Workbench session running the Spark application and domain is the domain of the Cloudera Data Science Workbench instance you are accessing.

The session_ID is the alphanumeric string at the end of a session URL. To find the web UI for a session's Spark driver, take that string from the URL of the running session and substitute it for session_ID in the address above.
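The substitution described above can be sketched in Python. This is a minimal illustration rather than part of the product: the helper name spark_ui_address, the example session URL, and the example domain are all hypothetical, and the sketch assumes only that the session ID is the last path segment of the session URL.

```python
from urllib.parse import urlparse


def spark_ui_address(session_url, domain):
    """Build the Spark web UI address for a session from its URL.

    Hypothetical helper: assumes the session ID is the last
    non-empty path segment of the session URL.
    """
    # Drop any trailing slash, then keep the final path segment.
    path = urlparse(session_url).path.rstrip("/")
    session_id = path.rsplit("/", 1)[-1]
    if not session_id:
        raise ValueError("no session ID found in %r" % session_url)
    return "http://spark-%s.%s" % (session_id, domain)


# Hypothetical values, for illustration only.
example = spark_ui_address(
    "http://cdsw.example.com/user/project/engines/abc123",
    "cdsw.example.com",
)
print(example)
```

Raising an error when no session ID can be found avoids silently producing an address such as spark-.<domain>, which would not resolve to any session.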

You can also use the CDSW_ENGINE_ID and CDSW_DOMAIN environment variables to generate the HTML link for the session's web UI.

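Python

import os, IPython
# Build the Spark UI host name from the session's environment variables.
url = "spark-%s.%s" % (os.environ["CDSW_ENGINE_ID"], os.environ["CDSW_DOMAIN"])
# Render a clickable link in the session output.
IPython.display.HTML('<a href="http://%s">Spark UI</a>' % url)

R

# Build the Spark UI host name from the session's environment variables.
url = paste("spark-", Sys.getenv("CDSW_ENGINE_ID"), ".", Sys.getenv("CDSW_DOMAIN"), sep="")
html(paste("<a href='http://", url, "'>Spark UI</a>", sep=""))

Scala

// Read the session's environment variables and build the Spark UI address.
val id = sys.env("CDSW_ENGINE_ID")
val domain = sys.env("CDSW_DOMAIN")
val url = s"http://spark-$id.$domain"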
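The snippets above read the environment variables directly, so they fail if either variable is unset (for example, when the same code runs outside a session). A more defensive variant might look like the following sketch. The function name spark_ui_link and its env parameter are hypothetical and introduced only for illustration; the environment variable names are the ones used above.

```python
import os
from html import escape


def spark_ui_link(env=os.environ):
    """Return an HTML anchor for the session's Spark UI.

    Hypothetical helper: returns None when CDSW_ENGINE_ID or
    CDSW_DOMAIN is missing instead of raising KeyError.
    """
    engine_id = env.get("CDSW_ENGINE_ID")
    domain = env.get("CDSW_DOMAIN")
    if not engine_id or not domain:
        return None
    url = "http://spark-%s.%s" % (engine_id, domain)
    # Escape the URL so it is safe inside a quoted HTML attribute.
    return '<a href="%s">Spark UI</a>' % escape(url, quote=True)


link = spark_ui_link()
print(link if link else "CDSW_ENGINE_ID and CDSW_DOMAIN are not both set")
```

In a Python session, the returned string can be passed to IPython.display.HTML to render the link, as in the Python example above.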

Spark History Server

Spark 2 also provides a UI that displays information and logs for completed Spark applications, which is useful for debugging and performance monitoring. This UI, called the History Server, runs on the CDH cluster, on a configurable node and port. You can learn more about using the Spark History Server in the Apache Spark 2 monitoring documentation.

You can open the Spark History Server from within the Cloudera Data Science Workbench web application: open the dropdown menu in the upper right-hand corner and select Spark History.