Spark Job, Stages, Tasks | Lec-11

In this video I have talked about how jobs, stages, and tasks are created in Spark. If you want to optimize your Spark workloads, you should have a solid understanding of these concepts.

For more queries, reach out to me on my social media handles below.

Comments

If you don't explicitly provide a schema, Spark reads a portion of the data to infer one, and that inference scan triggers a job. If you disable schema inference and provide your own schema, you can avoid that extra job.
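
To make this concrete, here is a minimal PySpark sketch; the file name employees.csv and its columns are hypothetical:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.appName("schema-demo").getOrCreate()

# With inferSchema=True, Spark scans the file up front, which shows up
# as an extra job in the Spark UI.
df_inferred = spark.read.csv("employees.csv", header=True, inferSchema=True)

# Supplying the schema yourself avoids that inference job.
schema = StructType([
    StructField("name", StringType(), True),
    StructField("age", IntegerType(), True),
])
df_explicit = spark.read.csv("employees.csv", header=True, schema=schema)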

roshankumargupta

In Apache Spark, spark.read.csv() is neither a transformation nor an action: it initiates the reading of CSV data into a DataFrame and belongs to the data-loading phase of Spark's processing model. The actual reading and processing of the data happen later, driven by Spark's lazy evaluation model.
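
A small sketch of that lazy behavior, again with a hypothetical employees.csv; with an explicit schema, nothing runs until the first action:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lazy-demo").getOrCreate()

# With an explicit schema, this only records the plan; no job runs yet.
df = spark.read.schema("name STRING, age INT").csv("employees.csv", header=True)
adults = df.filter(df.age >= 18)  # still lazy: just another node in the plan

# The first action materializes the plan and triggers a Spark job.
print(adults.count())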

fury

Bro, that was a next-level explanation... Thanks for sharing your great knowledge. Keep up the good work.

shorakhutte

Great work, Manish. I am grateful to you for making such rare content with so much depth. You are doing a tremendous job of contributing to the community. Please keep up the good work and stay motivated. We are always here to support you.

mrinalraj

Really awesome explanation! You won't find an explanation like this anywhere else. Thank you so much.

satyammeena-bukp

1 job for read,
1 job for print 1,
1 job for print 2,
1 job for count,
1 job for collect.
Total: 5 jobs according to me, but I haven't run the code, so I'm not sure.
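
One way to check a count like this empirically (the video's exact script, including the two print steps, isn't reproduced here, so this sketch only covers read/count/collect and assumes a hypothetical data.csv):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("job-count-check").getOrCreate()

df = spark.read.csv("data.csv", header=True, inferSchema=True)  # inference triggers a job
df.count()    # an action: one job
df.collect()  # an action: one job

# StatusTracker lists the job IDs run so far (jobs outside any named group);
# the Jobs tab of the Spark UI (localhost:4040 by default) shows the same thing.
tracker = spark.sparkContext.statusTracker()
print("jobs so far:", len(tracker.getJobIdsForGroup(None)))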

KhaderAliAfghan

Very good content. Please make detailed videos on Spark job optimization.

stuti

One of the best videos ever. Thank you for this. Really helpful.

nishasreedharan

One question:

After groupBy, 200 shuffle partitions are created by default, where each partition holds the data for individual keys.

What happens if there are fewer keys, say 100? Will that lead to only 100 partitions being formed instead of 200?

AND

What happens if there are more than 200 individual keys? Will it create more than 200 partitions?
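
A quick way to test both cases empirically; this sketch assumes a toy DataFrame and disables adaptive query execution, since AQE (on by default in Spark 3.x) would coalesce the empty shuffle partitions:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("shuffle-partitions-demo").getOrCreate()
spark.conf.set("spark.sql.adaptive.enabled", "false")  # keep the raw partition count visible

# 100 distinct keys, fewer than the default of 200 shuffle partitions.
few_keys = spark.createDataFrame([(i % 100, 1) for i in range(1000)], ["key", "value"])
print(few_keys.groupBy("key").count().rdd.getNumPartitions())   # 200; some are just empty

# 500 distinct keys, more than 200: keys are hashed into the 200 partitions.
many_keys = spark.createDataFrame([(i % 500, 1) for i in range(1000)], ["key", "value"])
print(many_keys.groupBy("key").count().rdd.getNumPartitions())  # still 200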

Food_panda-husj

Wow, what a clear explanation! First time I understood it in one go.

naturehealingandpeace

Explained so well, that too bit by bit 👏🏻

ShrinchaRani

I have one doubt: there are 3 actions there, such as read, collect, and count, so why is it creating only 2 jobs?

SuresgjmJ

Hi brother. Why haven't we considered collect() as a job creator here in the program you discussed?

kumarankit

You showed 4 stages in the video, but how come the Spark UI shows only 3 stages? One stage was created during the read itself, wasn't it?

rimilog

I had one question: what should the order of writing be? I mean, if we are doing filter/select/partition/groupBy/distinct/count or anything else, what should be written first…

ChetanSharma-oyge

Start a playlist with guided projects, so that we can apply these things in real life.

asif

Hi Manish,
count() is also an action, right? If not, can you please explain what count() is?

VenkataJaswanthParla

Nice explanation; you make each and every concept clear. Keep it up.

rohitbhawle

I really liked this video... nobody has explained it at this level.

Useracwqbrazy

I didn't find such a detailed explanation anywhere else. Kudos!

rahuljain