Spark Interview Question | Handle Data Skewness in Apache Spark | LearntoSpark

In this video, we discuss the data skew problem in Spark and ways to overcome it.
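
One of the techniques viewers ask about in the comments is salting. As a rough illustration only, the sketch below uses plain Python rather than Spark, with made-up data and a CRC32-based stand-in for a hash partitioner (all names, such as `partition_of` and `NUM_SALTS`, are hypothetical and not taken from the video). It shows that routing rows by key keeps a single dominant key in one partition no matter how many partitions there are, and that appending a random salt to the key spreads those rows out. The counts are then combined in two stages: per salted key first, then per original key.

```python
import random
import zlib
from collections import Counter

NUM_SALTS = 8  # hypothetical value; in practice tuned to how skewed the key is


def partition_of(key, num_partitions):
    # Deterministic stand-in for a hash partitioner (not Spark's actual hash).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(keys, num_partitions):
    # Number of rows that land in each partition when rows are routed by key.
    counts = Counter(partition_of(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def add_salt(keys, num_salts, seed=0):
    # Append a random suffix so one hot key becomes num_salts distinct keys.
    rng = random.Random(seed)
    return [f"{k}_{rng.randrange(num_salts)}" for k in keys]


# Made-up data: one dominant key ("hot") plus 100 small keys of 10 rows each.
keys = ["hot"] * 9000 + [f"k{i}" for i in range(100) for _ in range(10)]

# Without salting, every "hot" row goes to the same partition,
# so adding more partitions does not shrink the largest one.
unsalted_8 = partition_sizes(keys, 8)
unsalted_64 = partition_sizes(keys, 64)

# With salting, the "hot" rows are split across several salted keys.
salted_keys = add_salt(keys, NUM_SALTS)
salted_8 = partition_sizes(salted_keys, 8)

# Two-stage aggregation: count per salted key, then strip the salt and merge.
partial = Counter(salted_keys)
final = Counter()
for salted_key, n in partial.items():
    original_key = salted_key.rsplit("_", 1)[0]
    final[original_key] += n

print("largest partition, unsalted, 8 partitions: ", max(unsalted_8))
print("largest partition, unsalted, 64 partitions:", max(unsalted_64))
print("largest partition, salted, 8 partitions:   ", max(salted_8))
print("count for 'hot' after merging salted counts:", final["hot"])
```

In a Spark job the same idea would usually be expressed by adding a salt column before the shuffle. For a join, the smaller side has to be replicated once per salt value so that every salted key still finds its match.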

Blog link to learn more on Spark:

LinkedIn profile:

FB page:

GitHub:
Comments
Author

Thanks, Azarudeen. How can we apply the salting technique in PySpark to handle data skew?

vijeandran
Author

Thanks for your explanation. Can you please provide an example, with code and data?

krishnarupeshnagubandi
Author

We can apply the salting technique to avoid skew when we know about the data skew before running the job. But in real time, if a job fails due to skew, how do we handle it? And how do we avoid such skew failures in production?

ramesh-bigdatalearner
Author

Thanks for your efforts in making videos on Spark. It would be helpful if you showed sample code for these options, for better understanding.

srinuch
Author

Hi bro... This is ultimate... Very well explained...
I am a beginner, so there are a few things I couldn't understand...
1. What does a single dominant partition ID mean? Why can't partitions with a single dominant partition ID be repartitioned?
2. Also, what is a multiple dominant partition ID?
3. How do we change this in the source itself? Partitions are made by Spark based on partition size, right? How can we specify one or more columns for partitioning? What is the command for that?
Please help me with these doubts...
Thank you, bro...

gurumoorthysivakolunthu
Author

Please explain the salting technique in detail.

monangiavinash
Author

Please explain the solution with a real-time code scenario.

ashwinc
Author

Hi Shahul, I have sent the screenshots as per your description.

awanishkumar
Author

Can you please add English subtitles to all your videos as well?

swagatikaskavita-ekkoshish