Spark Executor Core & Memory Explained

#apachespark #bigdata



Video Playlist
-----------------------

YouTube channel link

Website
Technology in Tamil & English
Comments

Thanks so much, it was really a simple and excellent explanation of those configs.

rajasaroj

Thank you very much for the detailed explanation; it gave a very good understanding of how these properties help in running a Spark job. Really appreciate your help in educating the tech community 👏👏

RP-sxkf

Very useful video, Anna. Thanks much! Anna, please make a single video on a real-time project as it is done in industry. Similarly, as a continuation, make another video on what sort of questions are asked about that same real-time project in interviews. Please, please make a video on this, Anna. Thanks in advance.

akberj

I have a 250 GB file to process and I used dynamic allocation. When I try to run the job, it gives an error that the job was aborted due to a stage failure. How do I fix this issue?

mohans

Hi! Great content! I'm wondering how the YARN container vCPU and memory sizes work with executors.

CrashLaker

You have great teaching skills. Kudos!

RohitSaini

Hi, is it possible to create multiple executors on my personal laptop, which has 6 cores and 16 GB of RAM?

mahak
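A minimal sketch of the local-laptop case, assuming a plain PySpark install (the app name and numbers below are illustrative): in local mode Spark runs a single JVM where the driver also does the executing, so the 6 cores become 6 concurrent task slots rather than 6 separate executors; genuinely separate executor processes need a cluster manager (standalone, YARN, Kubernetes), even on one machine.

```python
from pyspark.sql import SparkSession

# Local mode: one JVM, the driver doubles as the only "executor".
# The 6 in local[6] is the number of concurrent task slots, not a count of executors.
spark = (
    SparkSession.builder
    .master("local[6]")                 # use all 6 laptop cores as task slots
    .appName("laptop-sizing-sketch")    # hypothetical app name
    .getOrCreate()
)

# With 16 GB of RAM, driver memory is usually set at launch time in client mode,
# e.g. spark-submit --driver-memory 8g, leaving headroom for the OS.
print(spark.sparkContext.defaultParallelism)   # -> 6 task slots
spark.stop()
```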

Do executors themselves run in parallel in Spark, or is it just the tasks within them?

souhailaakrikez

Can we use a SparkSession on a worker node? I'm facing an issue accessing the Spark session on worker nodes. Please help.

thenaughtyanil
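A small sketch of the usual cause of that error, assuming PySpark and the default driver/executor split (table names below are made up): the SparkSession lives only on the driver, so functions that Spark ships to executors (UDFs, foreach, mapPartitions) cannot call `spark` there; the common fix is to express the per-record work as DataFrame operations or joins that the driver plans.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[2]").appName("driver-only-session").getOrCreate()

events = spark.range(10).toDF("id")            # hypothetical input
lookup = spark.range(0, 10, 2).toDF("id")      # hypothetical lookup table

# Anti-pattern: this lambda runs inside executors, where no SparkSession exists,
# so referencing `spark` there fails.
# events.foreach(lambda row: spark.sql(f"SELECT {row.id}"))   # don't do this

# Usual fix: keep the logic as driver-side DataFrame operations; executors only
# run the generated tasks and never need a SparkSession of their own.
events.join(lookup, "id", "inner").show()

spark.stop()
```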

Can we say that cores are the actual available threads in Spark, since a core can run multiple tasks? So it's not always one core for one task; a core can multitask. Can you confirm this?

swapnilpatil
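A hedged sketch of how cores map to tasks, assuming default settings: each executor core is one task slot, and `spark.task.cpus` (default 1) is how many slots a single task occupies, so a 5-core executor runs up to 5 tasks concurrently, while one core runs one Spark task at a time rather than multitasking across several.

```python
from pyspark.sql import SparkSession

# Illustrative config only; in local mode these executor settings are not enforced,
# but on YARN or standalone they size the real executors.
spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("cores-vs-tasks-sketch")
    .config("spark.executor.cores", "5")   # task slots per executor
    .config("spark.task.cpus", "1")        # cores each task occupies (the default)
    .getOrCreate()
)

slots = int(spark.conf.get("spark.executor.cores")) // int(spark.conf.get("spark.task.cpus"))
print(f"concurrent tasks per executor: {slots}")   # -> 5
spark.stop()
```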

What configuration will be required for 250 GB of data?

AkshayKangude-nwxh

If the number of cores is 5 per executor, then at shuffle time Spark by default creates 200 partitions. How will those 200 partitions be created if the number of cores is smaller, given that 1 partition is held by 1 core?

Suppose my config is 2 executors, each with 5 cores. Now, how will it create 200 partitions if I do a groupBy operation? There are 10 cores, and 200 partitions are required to be stored on them, right? How is that possible?

Amarjeet-fblk
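A worked sketch of the question above, using its assumed numbers (2 executors × 5 cores): the 200 shuffle partitions are not pinned one-per-core; the 10 task slots simply work through them in waves, roughly 200 / 10 = 20 waves, and `spark.sql.shuffle.partitions` can be lowered when 200 is more than the data warrants.

```python
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .master("local[2]")
    .appName("shuffle-waves-sketch")
    .config("spark.sql.shuffle.partitions", "200")   # the default value, set explicitly
    .config("spark.sql.adaptive.enabled", "false")   # keep AQE from coalescing, so the raw 200 stays visible
    .getOrCreate()
)

# Illustrative numbers from the comment above, not measured values.
total_cores = 2 * 5                                   # 2 executors x 5 cores
shuffle_parts = int(spark.conf.get("spark.sql.shuffle.partitions"))
waves = -(-shuffle_parts // total_cores)              # ceiling division
print(f"{shuffle_parts} shuffle tasks / {total_cores} cores -> about {waves} waves")   # -> 20

df = spark.range(1_000).withColumn("bucket", F.col("id") % 10)
print(df.groupBy("bucket").count().rdd.getNumPartitions())   # -> 200 shuffle partitions
spark.stop()
```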

I have applied 4x memory for each core for a 5 GB file but no luck; can you please help me figure out how to resolve this issue?

Road map:
1) Find the number of partitions -> 5 GB (5120 MB) / 128 MB = 40
2) Find the CPU cores for maximum parallelism -> 40 cores, one per partition
3) Find the maximum allowed CPU cores per executor -> 5 cores per executor on YARN
4) Number of executors = total cores / executor cores -> 40 / 5 = 8 executors

How much memory is required?

Road map:
1) Find the partition size -> by default the size is 128 MB
2) Assign a minimum of 4x memory for each core -> what value should be applied?
3) Multiply it by the executor cores to get the executor memory -> ????

ultimo
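A worked sketch of the road map in the comment above, using only the rules it states (128 MB input partitions, at most 5 cores per executor on YARN, roughly 4x the partition size of memory per core); the 4x multiplier is the rule of thumb quoted in the comment, not a fixed Spark limit, and memory overhead is ignored here.

```python
# Worked sizing sketch for the road map above (illustrative rules of thumb, not hard limits).

def size_job(file_gb, partition_mb=128, cores_per_executor=5, mem_multiplier=4):
    file_mb = file_gb * 1024
    num_partitions = file_mb // partition_mb                 # step 1: input partitions
    total_cores = num_partitions                             # step 2: one core per partition for max parallelism
    num_executors = total_cores // cores_per_executor        # step 4: executors = total cores / cores per executor
    mem_per_core_mb = mem_multiplier * partition_mb          # memory step 2: ~4x the partition size per core
    executor_mem_mb = mem_per_core_mb * cores_per_executor   # memory step 3: per-executor memory
    return num_partitions, num_executors, executor_mem_mb

parts, execs, mem_mb = size_job(5)   # the 5 GB file from the comment
print(parts, execs, mem_mb)          # -> 40 partitions, 8 executors, 2560 MB (~2.5 GB) per executor
```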

Can you explain why Spark spills to disk and what causes this? I understand that in a wide transformation or a groupByKey statement where the data is too big to fit in memory, Spark has no choice but to spill it to disk; my question is whether we can minimize this with performance tuning like bucketing, map-side joins, etc.

ardavanmoinzadeh
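A hedged sketch of one mitigation the commenter mentions, assuming a large fact table joined to a small dimension table (all table and column names below are made up): a broadcast (map-side) join copies the small table to every executor, so the large table is never shuffled, which removes one common source of the spilling that shuffle-heavy joins or groupByKey can trigger.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.master("local[4]").appName("broadcast-join-sketch").getOrCreate()

# Hypothetical tables: a large fact table and a tiny dimension table.
facts = spark.range(1_000_000).withColumn("country_id", F.col("id") % 100)
countries = (
    spark.range(100).toDF("country_id")
    .withColumn("name", F.concat(F.lit("country_"), F.col("country_id").cast("string")))
)

# broadcast() asks Spark for a map-side join: the small table is shipped to every
# executor and joined locally, so the large side avoids a shuffle (and most spill risk).
joined = facts.join(F.broadcast(countries), "country_id")
joined.explain()   # the plan should show BroadcastHashJoin rather than SortMergeJoin

spark.stop()
```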

Can you please explain this video in Tamil? It would be very helpful for me. Thank you.

ThePrasanna

Bro, please, we want projects on big data.

NAJEEB