Apache Spark Architecture: Runtime Architecture of a Spark Application
In this video, you will get to know the nuts and bolts of the Apache Spark architecture. You will also understand how a Spark application runs when a Spark job is submitted.
Runtime Architecture of a Spark Application
-------------------------------------------
Apache Spark uses a master/slave architecture. The client submits the Spark user application code.
When the application code is submitted, the driver implicitly converts the user code, which contains transformations and actions, into a logical directed acyclic graph (DAG). At this stage, it also performs optimizations such as pipelining transformations. It then converts the logical DAG into a physical execution plan with a set of stages, and creates physical execution units called tasks under each stage. The tasks are then bundled to be sent to the cluster.
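As a minimal Scala sketch (assuming Spark is on the classpath, with a hypothetical input file input.txt), the example below shows that transformations only build the logical DAG, while the action at the end triggers planning into stages and tasks; the narrow transformations are pipelined into one stage, and the reduceByKey shuffle starts a new stage:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DagExample {
  def main(args: Array[String]): Unit = {
    // Local setup for illustration only.
    val sc = new SparkContext(
      new SparkConf().setAppName("dag-example").setMaster("local[*]"))

    // Transformations are lazy: they only build the logical DAG.
    val counts = sc.textFile("input.txt")  // hypothetical input path
      .flatMap(_.split("\\s+"))            // narrow: pipelined with the read
      .map(word => (word, 1))              // narrow: pipelined into the same stage
      .reduceByKey(_ + _)                  // wide: shuffle, so a new stage begins

    // The action triggers the scheduler: logical DAG -> stages -> tasks.
    counts.collect().foreach(println)

    sc.stop()
  }
}
```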
Now the driver talks to the cluster manager and negotiates for resources. The cluster manager launches executors on worker nodes on behalf of the driver.
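One way to see this negotiation is through the application's configuration. The sketch below uses a hypothetical standalone master URL, and the exact resource keys vary by cluster manager; it describes the executors the driver will request when the SparkContext is created:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The resource request that the driver forwards to the cluster manager.
// Exact keys vary by cluster manager; these are common illustrative ones.
val conf = new SparkConf()
  .setAppName("resource-negotiation-example")
  .setMaster("spark://master-host:7077")   // hypothetical standalone master URL
  .set("spark.executor.instances", "4")    // how many executors to launch
  .set("spark.executor.cores", "2")        // cores per executor
  .set("spark.executor.memory", "2g")      // heap size per executor

// Creating the SparkContext starts the driver program, which contacts the
// cluster manager; the manager then launches executors on worker nodes.
val sc = new SparkContext(conf)
```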
At this point, the driver sends tasks to the executors based on data placement. When executors start, they register themselves with the driver, so the driver has a complete view of all executors. The executors then start executing the tasks assigned by the driver program. While the application is running, the driver program monitors the set of executors that run.
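As a small illustration of that registration, the Scala SparkContext exposes the driver's view of the block managers that have registered with it (this includes the driver's own block manager alongside the executors); a quick sketch:

```scala
// The driver's view of registered block managers (executors plus the driver
// itself), mapping "host:port" to (maximum memory, remaining memory) in bytes.
sc.getExecutorMemoryStatus.foreach { case (endpoint, (max, free)) =>
  println(s"$endpoint -> max=$max bytes, free=$free bytes")
}
```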
The driver also schedules future tasks in appropriate locations based on data placement. The user program may cache data at certain locations (using the cache or persist method). The driver tracks the location of cached data and uses it to schedule future tasks that access that data.
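As a short sketch of caching (reusing the SparkContext sc from above, with a hypothetical events.log input), marking an RDD as persistent materializes it on the executors the first time an action computes it, and the driver then prefers those executors when scheduling later tasks over the same data:

```scala
import org.apache.spark.storage.StorageLevel

val errors = sc.textFile("events.log")       // hypothetical input path
  .filter(_.contains("ERROR"))

errors.persist(StorageLevel.MEMORY_AND_DISK) // or errors.cache() for memory-only
errors.count()   // first action: computes the RDD and caches its partitions
errors.take(10)  // later action: tasks are scheduled near the cached blocks
```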
In the next video, we will learn about the RDD, Spark's core abstraction.