Apache Spark | Spark Scenario Based Question | Data Skewed or Not ? | Count of Each Partition in DF

In this video, we will see how to find the row count of each partition in a Spark DataFrame. This scenario-based question is useful for checking whether the data is skewed or not.
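For readers who want to try this, here is a minimal PySpark sketch of counting rows per partition. It is not taken from the video: the helper names `partition_counts` and `skew_summary` are hypothetical, and using the max/mean ratio as a skew indicator is an assumption rather than an official Spark metric. The only Spark call it relies on is the built-in `spark_partition_id()` function.

```python
from statistics import mean


def partition_counts(df):
    """Return {partition_id: row_count} for a PySpark DataFrame."""
    # Imported lazily so skew_summary() below can be used without Spark installed.
    from pyspark.sql.functions import spark_partition_id

    rows = (
        df.withColumn("partition_id", spark_partition_id())
        .groupBy("partition_id")
        .count()
        .collect()
    )
    # Note: empty partitions produce no rows, so they will not appear here.
    return {row["partition_id"]: row["count"] for row in rows}


def skew_summary(counts):
    """Summarise per-partition counts; a large max/mean ratio hints at skew."""
    if not counts:
        raise ValueError("no partition counts given")
    sizes = list(counts.values())
    avg = mean(sizes)
    return {
        "partitions": len(sizes),
        "min": min(sizes),
        "max": max(sizes),
        "mean": avg,
        "max_over_mean": max(sizes) / avg,
    }
```

A possible usage would be `skew_summary(partition_counts(df))` on any DataFrame `df`. If one partition holds far more rows than the average, the data is probably skewed.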

DataSet:

Blog link to learn more on Spark:

Linkedin profile:

FB page:

#apachespark #spark #bigdata
Comments
Author

Excellent series. Thank you very much.

sanyoge
Author

Very nice. A great programmatic way to find out whether the data is skewed or not.

dattaningole
Author

Very nice and concise video. Do you have a video showing how to resolve the skew, as you mentioned at the end of this one?

rishigc
Author

The answer is 1 on a Databricks cluster.

It should be 200, as given in this video, or 4.

trilokinathji
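A note on the differing numbers (1, 4, 200): the partition count depends on how the DataFrame was created and on the cluster settings. 200 is the default value of `spark.sql.shuffle.partitions`, which applies after a shuffle, and with adaptive query execution enabled Spark may coalesce shuffle partitions into fewer. Which of these applied in the video or on the Databricks cluster above is not known, so the following is only a sketch for inspecting the relevant values; `describe_partitioning` is a hypothetical helper name.

```python
def describe_partitioning(spark, df):
    """Collect the values that usually explain a DataFrame's partition count.

    `spark` is an active SparkSession and `df` a DataFrame created from it.
    """
    return {
        # Number of partitions the DataFrame currently has.
        "current_partitions": df.rdd.getNumPartitions(),
        # Partition count used after shuffles (joins, groupBy, ...); default "200".
        "shuffle_partitions": spark.conf.get("spark.sql.shuffle.partitions"),
        # Adaptive query execution can coalesce shuffle partitions at runtime.
        "adaptive_enabled": spark.conf.get("spark.sql.adaptive.enabled", "unknown"),
    }
```

Comparing `current_partitions` before and after a `groupBy` or `repartition` call is one way to see which setting is in effect.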
Author

Good content, explained well. Thanks very much.
Please post the continuation.

Gamer_Dooby
Author

Hi bro, can you explain dynamic memory allocation in the spark-submit command?

sasim
Author

Clearly demonstrates how to identify / detect column data skewness.

cswanghan
Author

Nice explanation of data skewness. Could you please explain how we can achieve the same using Scala?

priyankas
Author

When can I expect the video on resolving data skewness to be uploaded? Waiting.

ashwinc
Author

Great session. Could you please share the notebook or a link to learntospark?

nikhilmeghnani
Author

We can check this with a group by, so why are we using partitions? I am new to Spark, please explain.

rupeshdeoria
Author

I have been asked the same question 😊: how do you find out whether there is a data skew problem? I have asked this question to lots of people and nobody was able to answer it. I have one supplementary question: let's say I am not using partition by on that DF. Will a data skew problem still arise, and if yes, how will we find out?

soumyakantarath
Author

Where is the video explaining how to handle the skewness?

asyakatanani
Author

Do you have sample code for this? Kindly share.

subramanyamsibbala
Author

Hi sir, I do not understand. It shows 200 partitions, but you say there are 4 partitions, even though we did not call repartition(4). How does it end up with 4 partitions?

rupeshdeoria