Advancing Spark - Understanding the Spark UI

When we first started with Spark, the Spark UI pages were something of a mystery, an arcane source of hidden knowledge. Looking back, they are extremely useful for understanding your Spark cluster, diagnosing user issues and deciding whether your cluster is correctly sized.

In this video, Simon gives a quick tour of the Spark UI, talking through the various tabs and the kind of troubleshooting/information they can provide.

Comments

One of the best explanations of the Spark UI, extremely helpful. Thank you, Simon.

sahilchitnis

Incredibly useful. Appreciate the way it is explained. I suggest you pick a use case and resolve a long-running problem by changing the cluster configuration.

yashodhannn

The way you explain complex stuff is incredible. I am a data engineer with more than 8 years of experience, and I totally loved your content.

joerokcz

Extremely helpful, and great to touch on different aspects of Databricks.

WKhan-fhpp

Very useful one! Thanks for making this. What would also be interesting to see is, once you find the cause of a performance issue via the Spark UI, how you go about fixing it. Like the skew issue you mentioned, how do we fix it? Maybe a video on Spark performance tuning? :)

akhilannan

Excellent video! I am in the middle of optimizing a script for a client, and I have seen a lot of videos that show the UI first, but nobody explains exactly how to take advantage of this resource. Thanks for sharing; I'm subscribing!

LuciaCasucci

Thank you - very useful as I prep for the ADV DBX DE cert!

jaimetirado

You are too good! Tons of important info. Thx for sharing!

raviv

Thank you for such a simple and powerful explanation.

sergeypryvala

This video is super helpful, thank you very much! :)
I would be very interested in the topic you mentioned briefly at the beginning about the JVM. Do you explain this somewhere in more detail? Also, how does PySpark, for example, interact with the JVM, and how does Scala come into play here?

auroraw

I love you Sir... Please keep on adding such videos

samirdesai

Thanks for introducing Ganglia. Could you also make a video on how to understand it and make more sense of the graphs and data it shows, please? That would be super useful.

PakIslam

I was waiting for this
Finally ! Thanks 😊

the.activist.nightingale

Really really appreciate this.
I was hoping you were going to end by showing how we might use Ganglia to assess the appropriate cluster size for a particular job.

carltonpatterson

Thanks for the great video! Please do make a Ganglia-focused one when you have time :)

MrDeedeeck

Thanks for this intro. Get ready Spark jobs, you're gonna be examined

GhernieM

Hi! Do you know why executors on the Executors tab sometimes turn blue?

eduardopalmiero

How should we decide, using the UI, whether increasing the number of nodes (cores) or increasing the SKU (memory) of the node would give more performance benefit? Thank you! :)

sid

Thank you very much for this uplifting video :). I was used to working with the Cloudera interface, so I'm wondering where the application name is. Have we lost it?

nastasiasaby

Really wonderful stuff, Simon!
I was wondering how Spark/Databricks handles keys. Does Databricks get data that already has keys from the upstream source, or do you know how a dimension is created with keys being generated, like a typical merge dimension procedure would do in SQL Server?

skms