02 How Spark Works - Driver & Executors | How Spark Divides a Job into Stages | What is Shuffle in Spark

The video explains: How does Spark work? What are the Driver and Executors? How does Spark divide a job into stages and tasks? How does Spark process data in parallel? What are cores and tasks? What is a data shuffle?
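As a rough illustration of these ideas (a minimal sketch, not code from the video; the app name and numbers are made up), a single action below triggers one job that Spark splits into two stages around a shuffle:

    from pyspark.sql import SparkSession

    # Local session with 4 cores, so up to 4 tasks can run at the same time.
    spark = SparkSession.builder.master("local[4]").appName("how-spark-works").getOrCreate()

    df = spark.range(1_000_000)                                  # narrow work, no shuffle needed
    buckets = df.groupBy((df.id % 10).alias("bucket")).count()   # wide: rows must be shuffled by key

    # The action triggers one job: stage 1 computes partial counts per partition,
    # the shuffle moves rows with the same bucket together, stage 2 computes the final counts.
    buckets.show()

    spark.stop()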

Chapters
00:00 - Introduction
00:25 - How Does Spark Work?
01:47 - How Does Spark Divide a Job into Stages?
02:06 - What is a Shuffle?
03:01 - What is the Driver?
03:30 - What are Executors?
04:02 - Understanding the Complete Workflow

The series provides a step-by-step guide to learning PySpark, the Python API for Apache Spark, a popular open-source distributed computing framework used for big data processing.

New video every 3 days ❤️

#spark #pyspark #python #dataengineering
Comments

Concise, simple, and very easy to understand! Going through this playlist! Please keep the new videos coming.

varun_rag

What an awesome explanation 👍. You are a GENIUS 🙏
Most YouTubers take 45 minutes to an hour to explain this in a very complex way. You explained it in such a simple and clear way in just 4 minutes 😊
Subscribed to your channel.

RameshKumar-ngnf

A very easy way to teach PySpark. You are amazing.

Kind_king_

Great explanation. I liked the illustration that two counts happened, and the fact that some shuffling happened after the local count and before the global count.

vaibhavkumar
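To see the "local count, then shuffle, then global count" pattern described in the comment above, here is a small illustrative sketch (not the video's code) that prints the physical plan of a groupBy().count() query:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[2]").appName("two-counts").getOrCreate()

    df = spark.range(100)
    counts = df.groupBy((df.id % 5).alias("bucket")).count()

    # Simplified shape of the plan that explain() prints:
    #   HashAggregate(keys=[bucket], functions=[count(1)])               <- global count
    #   +- Exchange hashpartitioning(bucket, ...)                        <- shuffle
    #      +- HashAggregate(keys=[bucket], functions=[partial_count(1)]) <- local count
    counts.explain()

    spark.stop()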

Again, from the video itself: executors are JVM processes, 1 core can run 1 task at a time, and in the picture above we have 6 cores, so 6 tasks were possible at once.

vaibhavkumar
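To make the numbers in the comment above concrete, here is an illustrative configuration sketch (the figures are assumptions, not the video's actual cluster): each executor is one JVM process, and executors times cores per executor gives the number of tasks that can run at the same time.

    from pyspark.sql import SparkSession

    # On a cluster, 3 executors x 2 cores each = 6 task slots, so at most 6 tasks
    # run in parallel; a 7th task waits until a core frees up.
    # (Local mode ignores these two settings; local[6] would give 6 slots instead.)
    spark = (SparkSession.builder
             .appName("task-slots")
             .config("spark.executor.instances", "3")  # 3 executor JVM processes
             .config("spark.executor.cores", "2")      # 2 cores per executor, 1 running task per core
             .getOrCreate())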

Thank you for the wonderful video. I have a query at 1:22: when we perform collect or some other action, do the tasks send their status information directly to the driver, and does the driver then consolidate the final count and send it back to the user? Or does another task perform the consolidation and send it to the driver? Could you please clarify a bit?

connectwithanandsuresh

Can we say that if there are 6 tasks, then there will be 6 cores?

shravanisharma

You earned a sub, man, great work. Keep going, and thanks for the content. It is very useful.

sakthi

Great work, keep making more videos on data engineering.

Gorrepatisreya

Thank you for the crystal-clear explanation.

jay-jxh

Keep up the good work!! Subscribed. Also, could you create an interview series for data engineers?

harshajyotidas

The shuffle is the boundary that divides a job into stages.

vaibhavkumar
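A small sketch of the point in the comment above (illustrative, not from the video): narrow transformations stay in one stage, while the wide transformation that needs a shuffle is where Spark cuts the job into a new stage.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[2]").appName("stage-boundary").getOrCreate()
    sc = spark.sparkContext

    words = sc.parallelize(["a", "b", "a", "c", "b", "a"])
    counts = (words.map(lambda w: (w, 1))            # narrow: stays in the same stage
                   .reduceByKey(lambda x, y: x + y)  # wide: shuffle here -> new stage
                   .collect())                       # action: one job with two stages
    print(counts)

    spark.stop()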

@easewithdata Can we use the above example to explain to the interviewer when they ask us to explain the Spark architecture?

prasanthmaddiboina

Can you please elaborate on the relationship between executors and JVMs (1-to-1 or 1-to-many)? Are they the same thing or different?

alishmanvar

Hi, really helpful videos.
Can you please share a soft copy of the presentation?

abhishekadivarekar

Can you please provide the PPT used in the video as well?

zaidkhalid

Thanks for the fantastic explanation 👍🏻

cba

Between the executors and the driver there is a cluster manager.

AmineLambaouak
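To illustrate the comment above (a sketch only; the master URL below is a made-up placeholder): the master setting names the cluster manager that sits between the driver and the executors, for example a standalone master, YARN, or Kubernetes.

    from pyspark.sql import SparkSession

    # Hypothetical standalone master URL; "yarn" or "k8s://..." point at other cluster managers.
    spark = (SparkSession.builder
             .master("spark://cluster-manager-host:7077")
             .appName("via-cluster-manager")
             .getOrCreate())
    # The driver asks the cluster manager for executors; once they are launched,
    # the driver schedules tasks on those executors directly.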

What is a Group and what is a Job in your diagram?

sriyell