Apache Spark Executor Tuning | Executor Cores & Memory

Welcome back to our comprehensive series on Apache Spark Performance Tuning & Optimisation! In this guide, we dive deep into the art of executor tuning in Apache Spark to ensure your data engineering tasks run efficiently.

🔹 What is inside:
Learn how to allocate CPU and memory to your Spark executors, and how many executors to create, to achieve optimal performance. Whether you're new to Apache Spark or an experienced data engineer looking to refine your Spark jobs, this video provides valuable insights into configuring the number of executors, their memory, and their cores for peak performance. I’ve covered everything from the basic structure of Spark executors within a cluster to advanced strategies for sizing executors optimally, including detailed examples and calculations.
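
As a quick taster, here is a minimal, hypothetical PySpark sketch of the knobs this video covers; the numbers are illustrative assumptions only, not recommendations from the video:

    from pyspark.sql import SparkSession

    # Illustrative values only; the right numbers depend on your cluster,
    # which is exactly what the sizing examples in the video work out.
    spark = (
        SparkSession.builder
        .appName("executor-tuning-demo")               # hypothetical app name
        .config("spark.executor.instances", "17")      # how many executors
        .config("spark.executor.cores", "5")           # cores per executor
        .config("spark.executor.memory", "19g")        # heap per executor
        .config("spark.executor.memoryOverhead", "2g") # off-heap overhead
        .getOrCreate()
    )

The same settings can also be passed to spark-submit as --num-executors, --executor-cores, and --executor-memory, or as generic --conf flags.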


Chapters:
0:00 - Introduction to Executor Tuning in Apache Spark
0:37 - Understanding Executors in a Spark Cluster
3:30 - Example: Sizing Executors in a Cluster
4:58 - Example: Sizing a Fat Executor
9:34 - Example: Sizing a Thin Executor
12:50 - Advantages and Disadvantages of Fat Executor
18:25 - Advantages and Disadvantages of Thin Executor
22:12 - Rules for sizing an Optimal Executor
26:30 - Example 1: Sizing an Optimal Executor
38:15 - Example 2: Sizing an Optimal Executor
43:50 - Key Takeaways
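
To accompany the "Rules for sizing an Optimal Executor" chapter, here is a hedged worked sketch of the widely used heuristic: reserve 1 core and 1 GB per node for the OS and daemons, target about 5 cores per executor, keep roughly 7-10% of executor memory aside for off-heap overhead, and leave one executor's worth of resources for the YARN ApplicationMaster. The cluster below (10 nodes with 16 cores and 64 GB each) is an assumed example, not necessarily the one used in the video.

    # Assumed example cluster: 10 nodes, 16 cores and 64 GB RAM per node.
    nodes, cores_per_node, mem_per_node_gb = 10, 16, 64

    # Heuristic: leave 1 core and 1 GB per node for OS/Hadoop daemons.
    usable_cores = cores_per_node - 1        # 15
    usable_mem_gb = mem_per_node_gb - 1      # 63

    # Heuristic: ~5 cores per executor for good task/HDFS throughput.
    cores_per_executor = 5
    executors_per_node = usable_cores // cores_per_executor    # 3

    # Memory per executor before overhead, then back out ~7% for overhead.
    mem_per_executor_gb = usable_mem_gb // executors_per_node  # 21
    heap_per_executor_gb = int(mem_per_executor_gb * 0.93)     # ~19 GB heap

    # Convention: one executor slot is left for the ApplicationMaster.
    total_executors = nodes * executors_per_node - 1           # 29

    print(total_executors, cores_per_executor, heap_per_executor_gb)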

#ApacheSparkTutorial #SparkPerformanceTuning #ApacheSparkPython #LearnApacheSpark #SparkInterviewQuestions #ApacheSparkCourse #PerformanceTuningInPySpark #ApacheSparkPerformanceOptimization #ApacheSpark #DataEngineering #SparkTuning #PythonSpark #ExecutorTuning #SparkOptimization #DataProcessing #pyspark #databricks
Comments

Every time I come here before attending an interview, I try to give this video a like, but end up realising that I already did it earlier. Best video on this topic on the whole internet.

dudechany

Bro, do videos on Spark regularly, it will be very helpful. Thank you.

bijjigirisupraja

All the concepts are clearly explained. Please do more videos.

deepikas

You are a lifesaver, bro. I have tried to learn this concept so many times but never got it very well; from your video I totally understand how to calculate the resources for a Spark job.

NaveenKumar-fmyg

Man, your tutorials are the best. I have been following you for Spark tuning-related videos. Thanks.

BabaiChakraborty-sspt

This is awesome stuff. The executor tuning concept is explained at a very granular level.

SandeepPatel-wtye

Really waiting to see if you can add some real-world use cases to your videos to strengthen our understanding. It will be appreciated a lot, man!

mohitupadhayay

Amazing is the word; you never disappoint us. Very grateful and indebted to you for this excellent content you are creating. God bless you!

sankarshkadambari

Wow!! Great content!! I am preparing for interviews and found this super helpful. Thanks a ton!!

mayapareek

Thank you for the useful content! IRL, an analyst or engineer would have access to a huge cluster shared between many people and teams. It would be very interesting to watch a video where you calculate the amount of resources that should be requested based on the task at hand (a particular dataset, task, and output). And again, thanks for helping us understand these somewhat hard-to-grasp concepts :-)

leilaturgarayeva

Thank you so much for the wonderful content. Please start PySpark sessions.

adtempgupta

Great work, going well. I hope you cover two more topics: driver OOM and executor OOM, why they happen and how we can tackle them.

AshishStudyDE

At 35:10 @afaqueahmad7117, I want to add one point. You said that execution happens in execution memory, which is 60 percent, and 40 percent is user memory. So 60 percent of 20 GB is 12 GB, out of which 50 percent is for execution and 50 percent for storage. Assuming 50 percent is given to execution (static allocation), only 6 GB of the 12 GB is for execution. As we have 5 cores per executor, 6/5 is approximately 1.2 GB of memory per core, so the maximum partition size that can be accommodated is about 1.2 GB. Is my thought process correct?

remedyiq
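
For anyone double-checking the arithmetic in the comment above: under Spark's unified memory model the estimate holds, with one refinement. spark.memory.fraction (default 0.6) applies to the heap minus a fixed ~300 MB reserve, so the pool is slightly under 12 GB. A quick sketch:

    # Spark unified memory model (1.6+), default settings:
    #   spark.memory.fraction        = 0.6  (execution + storage pool)
    #   spark.memory.storageFraction = 0.5  (storage's protected share)
    heap_gb = 20.0
    reserved_gb = 0.3                           # fixed ~300 MB reserve

    unified_gb = (heap_gb - reserved_gb) * 0.6  # ~11.82 GB, not quite 12
    execution_gb = unified_gb * (1 - 0.5)       # ~5.91 GB with a static 50/50 split

    cores = 5
    print(round(execution_gb / cores, 2))       # ~1.18 GB per task slot

Note that execution can borrow from the storage half at runtime, so a single large task may get more than this static per-core estimate.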

Hello Afaque, your tutorials are excellent and I learnt so much about optimization techniques. I am wondering if you can add some real-world use cases to your videos to strengthen our understanding. It will be appreciated a lot.

purnimasharma

Thanks Afaque for this great tutorial. This will really help while working on Spark optimization. It would be of great help if you could show how you deal with this type of question:
Spark cluster size: 200 cores and 100 GB RAM
Data to be processed: 100 GB
Give the calculation for driver memory, driver cores, executor memory, overhead memory, and number of executors.

yatinchadha
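
For what it's worth, here is one hedged way to approach the question above, assuming the 200 cores and 100 GB are evenly usable and YARN's default overhead (the larger of 384 MB or 10% of executor memory) applies:

    # Stated cluster: 200 cores, 100 GB RAM; data to process: 100 GB.
    total_cores, total_mem_gb = 200, 100

    cores_per_executor = 5                                  # ~5-core heuristic
    num_executors = total_cores // cores_per_executor - 1   # 39; one slot kept for the driver/AM

    container_gb = total_mem_gb / (num_executors + 1)       # ~2.5 GB per container
    heap_gb = container_gb - 0.384                          # ~2.1 GB heap; 384 MB overhead dominates here

    driver_cores, driver_mem_gb = 5, container_gb           # driver often sized like one executor

A ~2 GB heap per executor is tight for 100 GB of input, so in practice you might trade cores for memory (fewer, fatter executors), which is exactly the fat-versus-thin trade-off this video walks through.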

Thanks for this. I'm currently working on job optimization, so it is very useful to me.

iamexplorer

Hey man, learnt a lot from the video. Please help me out with this doubt:
For example 2, you said total executors = 44/4 = 11. But shouldn't we think machine by machine? Each machine can have 15/4 = 3 executors of 4 cores each, giving 3 executors * 3 nodes = 9 in total. In your working, it seems like there would be an executor that uses some cores from one node and some from another. Am I wrong in my thought process somewhere?

ashutoshpatkar
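
On the doubt above: the machine-by-machine intuition is the standard one, because an executor is a single JVM process whose cores and memory must come from one node; the resource manager cannot assemble an executor across machines. Without the video's exact figures at hand, here is a sketch using the numbers quoted in the comment:

    # Numbers as quoted in the comment (from the video's Example 2):
    pooled_executors = 44 // 4             # 11, treating all cores as one big pool
    executors_per_node = 15 // 4           # 3 executors of 4 cores on a 15-core node
    node_aware_executors = executors_per_node * 3   # 9 across 3 nodes

    # The pooled figure overcounts: 3 cores per node are left stranded,
    # so 9 executors is what the scheduler can actually place.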

Thank you very much for this amazing content with super easy explanation 👏👏

seenu

Thanks for these videos. I have been watching them for quite a while, and you explain things in a very easy and simple manner.

But I think in real life we would be processing a very large amount of data, so it would be great if you could make a video on processing large amounts of data with all the optimisation techniques we can use. Thanks in advance.

Amarjeet-fblk

Hi Afaque, it was a really nice video. I never got such a detailed understanding anywhere else. Do you also provide 1:1 sessions? If yes, I am highly interested.

yashwantdhole