How to view Spark History logs locally

June 24, 2019

Spark History logs are invaluable when you need to analyze the statistics of a specific job. If you are working on a large cluster where multiple users run jobs, or on an ephemeral cluster whose logs you want to retain for future analysis, here is how to view them locally.

How do you analyze the spark history logs locally?

Download logs from Spark History Server

The most important first step is to download the Spark history logs from the History Server UI before your cluster goes down.
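Besides the download link in the UI, the History Server's REST API can serve an application's event logs as a zip archive. The host and application id below are placeholders; substitute your own:

```shell
# Hypothetical host and application id -- substitute your own values.
HISTORY_HOST="http://my-cluster:18080"
APP_ID="application_1561360000000_0001"

# The History Server REST API returns the event logs for an application
# as a zip file at /api/v1/applications/<app-id>/logs
curl -o "${APP_ID}.zip" "${HISTORY_HOST}/api/v1/applications/${APP_ID}/logs"

# Unpack for local replay later
unzip "${APP_ID}.zip" -d spark-logs/
```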

[Screenshot: Spark History Server UI]

Setup Spark History Server Locally

The steps below set up a History Server locally so you can analyze the logs.

  1. On macOS, install Spark: brew install apache-spark
  2. Create a directory for the logs.
  3. Move the logs downloaded in the previous step into that directory and unpack them.
  4. Create a file named log.properties.
  5. Inside log.properties, add spark.history.fs.logDirectory=<path to the spark-logs directory>
  6. Navigate to /usr/local/Cellar/apache-spark/<version>/libexec/sbin
  7. Run sh start-history-server.sh --properties-file <path to log.properties>
  8. Open http://localhost:18080 in a browser.
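Put together, steps 2 through 7 can be sketched as a short shell session. The paths here are assumptions for a typical Homebrew install; adjust them to your machine:

```shell
# Assumed locations -- adjust to your machine.
SPARK_LOG_DIR="$HOME/spark-logs"
PROPS_FILE="$HOME/log.properties"

# Steps 2-3: create the log directory, then move/unpack your
# downloaded event logs into it.
mkdir -p "$SPARK_LOG_DIR"

# Steps 4-5: point the History Server at the log directory.
echo "spark.history.fs.logDirectory=$SPARK_LOG_DIR" > "$PROPS_FILE"

# Steps 6-7: start the History Server. The Cellar path assumes a
# Homebrew install; the glob expands to the installed version.
cd /usr/local/Cellar/apache-spark/*/libexec/sbin
sh start-history-server.sh --properties-file "$PROPS_FILE"
```

Once the server is up, the UI is served on port 18080 by default.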

Now you can view and analyze the logs locally.
