Spark Interview Question | Handle Data Skewness in Apache Spark | LearntoSpark


In this video, we discuss the data skewness issue in Spark and ways to overcome it.

Blog link to learn more on Spark:

Linkedin profile:

FB page:

Github:


Comments

Thanks, Azarudeen. How can we apply the salting technique in PySpark to handle data skewness?

vijeandran
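As a rough illustration of the salting idea asked about above, here is a minimal sketch in plain Python rather than PySpark, so it runs without a cluster. The data, `NUM_PARTITIONS`, `NUM_SALTS`, and the `partition_of` helper are all hypothetical, and `zlib.crc32` only stands in for Spark's hash partitioner. In PySpark the same steps would typically be written with `rand`, `floor`, `concat`, and `explode` from `pyspark.sql.functions`: add a random salt to the key on the skewed side, replicate the small side once per salt value, then join on the salted key.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # assumption: number of shuffle partitions
NUM_SALTS = 8       # assumption: salt range, tuned to the degree of skew


def partition_of(key: str) -> int:
    """Deterministic stand-in for a hash partitioner (not Spark's actual hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def max_partition_load(keys) -> int:
    """Row count of the busiest partition for the given join keys."""
    return max(Counter(partition_of(k) for k in keys).values())


rng = random.Random(42)

# Hypothetical skewed fact table: key "A" holds 90% of the rows.
facts = [("A", i) for i in range(9000)]
facts += [(rng.choice("BCDEFGHIJK"), i) for i in range(9000, 10000)]

# Hypothetical small dimension table, one row per key.
dim = {k: f"name_{k}" for k in "ABCDEFGHIJK"}

# Plain join: every "A" row hashes to the same partition.
plain = [(k, i, dim[k]) for k, i in facts]
load_before = max_partition_load(k for k, _ in facts)

# Step 1: append a random salt to the key on the skewed (large) side.
salted_facts = [(f"{k}_{rng.randrange(NUM_SALTS)}", k, i) for k, i in facts]

# Step 2: replicate each row of the small side once per salt value.
salted_dim = {f"{k}_{s}": v for k, v in dim.items() for s in range(NUM_SALTS)}

# Step 3: join on the salted key, then drop the salt from the result.
salted = [(k, i, salted_dim[sk]) for sk, k, i in salted_facts]
load_after = max_partition_load(sk for sk, _, _ in salted_facts)

print("busiest partition before salting:", load_before)
print("busiest partition after salting: ", load_after)
```

The join result is the same with and without the salt. The cost is that the small side grows by a factor of `NUM_SALTS`, so the salt range is a trade-off rather than a free win.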

Thanks for your explanation. Can you please provide an example, with code and data?

krishnarupeshnagubandi

We can apply the salting technique to avoid skewness when we know about the data skew before running the job. But in production, if a job fails because of skew, how do we handle it? And how do we avoid such skew failures in production?

ramesh-bigdatalearner

Thanks for your efforts in making videos on Spark. It would be helpful if you showed sample code for these options, for better understanding.

srinuch

Hi bro, this is the ultimate explanation. Very well explained.
I am a beginner, so I couldn't understand a few points:
1. What does "single dominant partition ID" mean? Why can't partitions with a single dominant partition ID be repartitioned?
2. Also, what is a "multiple dominant partition ID"?
3. How do we change this at the source itself? Partitions are created by Spark based on partition size, right? How can we specify one or more columns for partitioning, and what is the command for that?
Please help me with these doubts.
Thank you, bro.

gurumoorthysivakolunthu
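One way to read the first two questions above is sketched below in plain Python. It assumes that "single dominant" means one key accounts for most of the rows, and "multiple dominant" means several heavy keys happen to share a partition; the video may define the terms differently. The keys and the `max_load` helper are hypothetical, and `zlib.crc32` is only a stand-in for Spark's hash partitioner. The point is that all rows with the same key always hash to the same partition, so adding partitions cannot split a single heavy key, while several heavy keys can be spread apart.

```python
import zlib
from collections import Counter


def max_load(keys, num_partitions: int) -> int:
    """Row count of the busiest partition under simple hash partitioning."""
    loads = Counter(zlib.crc32(k.encode("utf-8")) % num_partitions for k in keys)
    return max(loads.values())


# Hypothetical "single dominant key" data: "A" is 90% of the rows.
single_dominant = ["A"] * 9000 + ["B"] * 500 + ["C"] * 500

# Hypothetical "multiple dominant keys" data: four equally heavy keys.
multi_dominant = ["A"] * 2500 + ["B"] * 2500 + ["C"] * 2500 + ["D"] * 2500

for n in (1, 8, 64):
    print(
        f"partitions={n:>2}",
        "single-dominant max load:", max_load(single_dominant, n),
        "multi-dominant max load:", max_load(multi_dominant, n),
    )
```

With the single dominant key, the busiest partition never drops below 9000 rows however many partitions are used, which is the case salting is meant for. For the third question, PySpark's `DataFrame.repartition(numPartitions, *cols)` accepts one or more columns to partition by, for example `df.repartition(200, "col_a", "col_b")` with hypothetical column names.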

Please explain the salting technique in detail.

monangiavinash

Please explain the solution with a real-time code scenario.

ashwinc

Hi Shahul, I have sent the screenshots as per your description.

awanishkumar

Can you please add English subtitles to all your videos?

swagatikaskavita-ekkoshish

