
Monitoring Spark Applications
Tzach Zohar @ Kenshoo, March/2016

Who am I
System Architect @ Kenshoo

Java backend for 10 years

Working with Scala + Spark for 2 years

https://www.linkedin.com/in/tzachzohar

Who’s Kenshoo
10-year-old, Tel-Aviv-based startup

Industry Leader in Digital Marketing

500+ employees

Heavy data shop

http://kenshoo.com/


And who’re you?

Agenda
Why Monitor

Spark UI

Spark REST API

Spark Metric Sinks

Applicative Metrics

The Importance of Being Earnest

Why Monitor
Failures

Performance

Know your data

Correctness of output

Monitoring Distributed Systems
No single log file

No single User Interface

Often - no single framework (e.g. Spark + YARN + HDFS…)


Spark UI

Spark UI
See http://spark.apache.org/docs/latest/monitoring.html#web-interfaces

The first go-to tool for understanding what’s what

Created per SparkContext

Spark UI
Jobs -> Stages -> Tasks


Spark UI
Use the “DAG Visualization” in Job Details to:

Understand flow

Detect caching opportunities

Spark UI
Jobs -> Stages -> Tasks

Detect unbalanced stages

Detect GC issues

Spark UI
Jobs -> Stages -> Tasks -> “Event Timeline”

Detect stragglers

Detect repartitioning opportunities

Spark UI Disadvantages
“Ad-Hoc”, no history*

Human readable, but not machine readable

Data points, not data trends


Spark UI Disadvantages

UI can quickly become hard to use…


Spark REST API

Spark’s REST API
See http://spark.apache.org/docs/latest/monitoring.html#rest-api

Programmatic access to UI’s data (jobs, stages, tasks, executors, storage…)

Useful for aggregations over similar jobs
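The examples on the next slides hard-code the host, port and application in the URL; the list of applications the server knows about (and their ids) can itself be fetched from the /api/v1/applications endpoint. A minimal sketch of that first step (not from the original slides; SparkApp is just a hypothetical holder for the id and name fields returned by the endpoint), using the same json4s setup as the examples that follow:

import scala.io.Source.fromURL
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

case class SparkApp(id: String, name: String)
implicit val formats = DefaultFormats

val apps = parse(fromURL("http://<host>:4040/api/v1/applications").mkString).extract[List[SparkApp]]
apps.foreach(app => println(s"${app.id}\t${app.name}"))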

Spark’s REST API
Example: calculate total shuffle statistics:

// assumed imports (not shown on the slide): json4s for JSON parsing (jackson backend assumed,
// the native backend works the same way) and scala.io for fetching the URL
import scala.io.Source.fromURL
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

object SparkAppStats {
  case class SparkStage(name: String, shuffleWriteBytes: Long, memoryBytesSpilled: Long, diskBytesSpilled: Long)

  implicit val formats = DefaultFormats
  val url = "http://<host>:4040/api/v1/applications/<app-name>/stages"

  def main(args: Array[String]) {
    val json = fromURL(url).mkString
    val stages: List[SparkStage] = parse(json).extract[List[SparkStage]]
    println("stages count: " + stages.size)
    println("shuffleWriteBytes: " + stages.map(_.shuffleWriteBytes).sum)
    println("memoryBytesSpilled: " + stages.map(_.memoryBytesSpilled).sum)
    println("diskBytesSpilled: " + stages.map(_.diskBytesSpilled).sum)
  }
}

Spark’s REST API
Example: calculate total shuffle statistics:

Example output:
stages count: 1435
shuffleWriteBytes: 8488622429
memoryBytesSpilled: 120107947855
diskBytesSpilled: 1505616236

Spark’s REST API
Example: calculate total time per job name:

// imports and implicit formats as in the previous example, plus java.util.Date
val url = "http://<host>:4040/api/v1/applications/<app-name>/jobs"

case class SparkJob(jobId: Int, name: String, submissionTime: Date, completionTime: Option[Date], stageIds: List[Int]) {
  def getDurationMillis: Option[Long] = completionTime.map(_.getTime - submissionTime.getTime)
}

def main(args: Array[String]) {
  val json = fromURL(url).mkString
  parse(json)
    .extract[List[SparkJob]]
    .filter(j => j.getDurationMillis.isDefined) // only completed jobs
    .groupBy(_.name)
    .mapValues(list => (list.map(_.getDurationMillis.get).sum, list.size))
    .foreach { case (name, (time, count)) => println(s"TIME: $time\tAVG: ${time / count}\tNAME: $name") }
}

Spark’s REST API
Example: calculate total time per job name:

Example output:
TIME: 182570 AVG: 16597 NAME: count at MyAggregationService.scala:132
TIME: 230973 AVG: 1297 NAME: parquet at MyRepository.scala:99
TIME: 120393 AVG: 2188 NAME: collect at MyCollector.scala:30
TIME: 5645 AVG: 627 NAME: collect at MyCollector.scala:103


But that’s still ad-hoc, right?


Spark Metric Sinks

Metrics
See http://spark.apache.org/docs/latest/monitoring.html#metrics

Spark uses the popular dropwizard.metrics library (renamed from codahale.metrics and yammer.metrics)

Metrics: easy Java API for creating and updating metrics stored in memory, e.g.:

// Gauge for executor thread pool's actively executing task counts
metricRegistry.register(name("threadpool", "activeTasks"), new Gauge[Int] {
  override def getValue: Int = threadPool.getActiveCount()
})
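The same API is just as easy to use from application code. A rough, hypothetical sketch (not from the slides; the registry, queue and metric names are made up) of a standalone registry with a gauge and a counter:

import com.codahale.metrics.{Gauge, MetricRegistry}
import com.codahale.metrics.MetricRegistry.name

object MyMetricsExample {
  val registry = new MetricRegistry()
  val pending = new java.util.concurrent.ConcurrentLinkedQueue[String]()

  // a gauge is sampled whenever a reporter/sink polls the registry
  registry.register(name("parser", "queueSize"), new Gauge[Int] {
    override def getValue: Int = pending.size()
  })

  // a counter is updated explicitly by application code
  val parsedRecords = registry.counter(name("parser", "parsedRecords"))
  parsedRecords.inc()
}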

Metrics
What is metered? Couldn’t find any detailed documentation of this

This trick flushes most of them out: search sources for “metricRegistry.register”


Where do these metrics go?

Spark Metric Sinks
A “Sink” is an interface for viewing these metrics, at given intervals or ad-hoc

Available sinks: Console, CSV, SLF4J, Servlet, JMX, Graphite, Ganglia*

We use the Graphite Sink to send all metrics to Graphite

$SPARK_HOME/metrics.properties:

*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=<your graphite hostname>
*.sink.graphite.port=2003
*.sink.graphite.period=30
*.sink.graphite.unit=seconds
*.sink.graphite.prefix=<token>.<app-name>.<host-name>


… and it’s in Graphite (+ Grafana)

Graphite Sink
Very useful for trend analysis

WARNING: Not suitable for short-running applications (will pollute Graphite with new metrics for each application)

Requires some Graphite tricks to get clear readings (wildcards, sums, derivatives, etc.)


Applicative Metrics

The Missing Piece
Spark meters its internals pretty thoroughly, but what about your internals?

Applicative metrics are a great tool for knowing your data and verifying output correctness

We use Dropwizard Metrics + Graphite for this too (everywhere)
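A minimal sketch of what that wiring could look like (hypothetical, not the actual Kenshoo code): applicative counters in a Dropwizard registry, shipped to Graphite by a GraphiteReporter. The host, prefix and metric names are placeholders:

import java.net.InetSocketAddress
import java.util.concurrent.TimeUnit
import com.codahale.metrics.MetricRegistry
import com.codahale.metrics.graphite.{Graphite, GraphiteReporter}

object AppMetrics {
  val registry = new MetricRegistry()

  // applicative metrics, updated from the job's code
  val inputRecords    = registry.counter("input.records")
  val outputRecords   = registry.counter("output.records")
  val parsingFailures = registry.counter("parsing.failures")

  // push everything to Graphite every 30 seconds
  private val graphite = new Graphite(new InetSocketAddress("<your graphite hostname>", 2003))
  private val reporter = GraphiteReporter.forRegistry(registry)
    .prefixedWith("<token>.<app-name>.<host-name>")
    .build(graphite)
  reporter.start(30, TimeUnit.SECONDS)
}

Driver-side code can then simply call, e.g., AppMetrics.parsingFailures.inc() wherever a record fails to parse.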

Counting RDD Elements
rdd.count() might be costly (another action)

Spark Accumulators are a good alternative

Trick: send accumulator results to Graphite, using “Counter-backed Accumulators”

// assumed imports (not on the slide): RDD, Accumulator, ClassTag, and the yammer metrics Counter/Metrics/MetricName classes
/**
 * Call returned callback after acting on returned RDD to get counter updated
 */
def countSilently[V: ClassTag](rdd: RDD[V], metricName: String, clazz: Class[_]): (RDD[V], Unit => Unit) = {
  val counter: Counter = Metrics.newCounter(new MetricName(clazz, metricName))
  val accumulator: Accumulator[Long] = rdd.sparkContext.accumulator(0, metricName)
  val countedRdd = rdd.map(v => { accumulator += 1; v })
  val callback: Unit => Unit = u => counter.inc(accumulator.value)
  (countedRdd, callback)
}
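Usage would look roughly like this (a hypothetical sketch; rdd, outputPath and the metric name are made up). The counter is only updated when the returned callback is invoked, after an action has actually evaluated the RDD:

val (countedRdd, updateCounter) = countSilently(rdd, "outputRecords", getClass)
countedRdd.saveAsTextFile(outputPath) // the action that evaluates the RDD (and drives the accumulator)
updateCounter(())                     // now push the accumulator's final value into the metrics counter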


We Measure...
Input records

Output records

Parsing failures

Average job time

Data “freshness” histogram (see the sketch after this list)

Much much more...
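For example, the “freshness” histogram could be as simple as a Dropwizard Histogram updated with each record's age at processing time - a hypothetical sketch, not the actual Kenshoo code, with a made-up metric name:

import java.util.concurrent.TimeUnit
import com.codahale.metrics.MetricRegistry

val registry = new MetricRegistry()
val freshness = registry.histogram("input.freshness.minutes")

// for every input record: how old was it when we processed it?
def recordFreshness(eventTimeMillis: Long): Unit =
  freshness.update(TimeUnit.MILLISECONDS.toMinutes(System.currentTimeMillis() - eventTimeMillis))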


WARNING: it’s addictive...


Conclusions
Spark provides a wide variety of monitoring options

Each one should be used when appropriate; no single one is sufficient on its own

Metrics + Graphite + Grafana can give you visibility into any numeric time series


Questions?


Thank you