How to: View Spark History logs locally
Spark History logs are valuable when you need to analyze the stats of a specific job. If you are working on a large cluster where multiple users are executing jobs, or on an ephemeral cluster whose logs you want to retain for future analysis, here’s a way to view them locally.
How do you analyze the Spark History logs locally?
Download logs from Spark History Server
The first and most important step is to download the Spark History logs from the UI before your cluster goes down.
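If you would rather script the download than click through the UI, the History Server also exposes a REST endpoint that returns an application's event logs as a zip archive. The snippet below is a minimal sketch, not a definitive recipe: the server address and the application ID are placeholders, and the function names are made up for illustration.

```shell
#!/bin/sh
# Placeholder address: replace with your cluster's History Server URL.
HISTORY_SERVER="${HISTORY_SERVER:-http://localhost:18080}"

# Build the REST URL that serves one application's event logs as a zip.
event_log_url() {
  printf '%s/api/v1/applications/%s/logs\n' "$HISTORY_SERVER" "$1"
}

# Download the event logs for one application into <app-id>.zip.
download_event_logs() {
  app_id="$1"
  curl --fail --silent --show-error --location \
    --output "${app_id}.zip" "$(event_log_url "$app_id")"
}

# Example call (not run automatically; the ID is hypothetical):
# download_event_logs app-20240101000000-0001
```

Application IDs are listed on the History Server's landing page, so you can copy the one you need from there.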
Setup Spark History Server Locally
The steps below set up the History Server locally so you can analyze the logs.
- On macOS, install Spark:
brew install apache-spark
- Create a directory for the logs.
- Move the logs downloaded in the previous step to the logs directory and unpack them.
- Create a file named log.properties.
- Inside log.properties, add:
spark.history.fs.logDirectory=<path to the spark-logs directory>
- Navigate to
sh start-history-server.sh --properties-file <path to log.properties>
- Open http://localhost:18080 in your browser.
Now you can view and analyze the logs locally.
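The setup steps above can be sketched as a small shell script. This is one possible arrangement under several assumptions: the downloaded logs are zip archives, `SPARK_HOME` points at a Spark installation whose `sbin` directory contains `start-history-server.sh`, and the default locations (`~/spark-logs` and `~/log.properties`) as well as the function names are illustrative choices.

```shell
#!/bin/sh
# Illustrative defaults; override LOG_DIR and PROPS_FILE as needed.
LOG_DIR="${LOG_DIR:-$HOME/spark-logs}"
PROPS_FILE="${PROPS_FILE:-$HOME/log.properties}"

# Create the logs directory and unpack each archive passed as an argument.
# Assumes the downloaded logs are zip files.
prepare_logs() {
  mkdir -p "$LOG_DIR"
  for archive in "$@"; do
    unzip -o -q "$archive" -d "$LOG_DIR"
  done
}

# Write log.properties so the History Server reads from LOG_DIR.
write_properties() {
  printf 'spark.history.fs.logDirectory=%s\n' "$LOG_DIR" > "$PROPS_FILE"
}

# Start the History Server with the properties file.
# Assumes SPARK_HOME is set to your Spark installation.
start_history_server() {
  sh "$SPARK_HOME/sbin/start-history-server.sh" --properties-file "$PROPS_FILE"
}

# Typical usage (not run automatically; the archive name is hypothetical):
# prepare_logs ~/Downloads/eventLogs.zip
# write_properties
# start_history_server
```

Keeping each step in its own function lets you rerun only the part that changed, for example unpacking a new archive without rewriting the properties file.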