Using groupBy with where conditions in PySpark | Realtime Scenario | Spark Interview Questions

Hi Friends,

Sample code is checked into GitHub:

In this video, I have explained sample PySpark code to perform a groupBy operation with where conditions.
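A minimal sketch of that pattern, using made-up column names and data rather than the actual GitHub sample:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("groupby-where-demo").getOrCreate()

# Hypothetical sample data; not the dataset used in the video.
orders = spark.createDataFrame(
    [("c1", "Electronics", 1200), ("c2", "Electronics", 300),
     ("c1", "Clothing", 80), ("c3", "Clothing", 40)],
    ["customer_id", "category", "amount"],
)

# where() before groupBy() filters input rows; where() after agg() filters
# on aggregated values (similar to HAVING in SQL). filter() is an alias.
result = (
    orders.where(F.col("amount") > 50)
          .groupBy("category")
          .agg(F.count("*").alias("order_count"),
               F.sum("amount").alias("total_amount"))
          .where(F.col("total_amount") > 100)
)
result.show()
```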

Please subscribe to my channel for more interesting learnings.
Comments

Hi,
I had an interview question like this:
I need to count the number of flights, where Chennai to Bangalore and Bangalore to Chennai should fall under the same single count.
How do I get that?

When I use partitionBy or groupBy, it gives a separate count for Chennai to Bangalore and a separate count for Bangalore to Chennai.

Can you give an idea?

nwugkkd
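
One way to handle this (not from the video; the column names origin and destination are assumptions) is to normalize each route into an order-independent city pair with least/greatest before grouping, so both directions land in the same group:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("flight-pair-count").getOrCreate()

flights = spark.createDataFrame(
    [("Chennai", "Bangalore"), ("Bangalore", "Chennai"), ("Chennai", "Bangalore")],
    ["origin", "destination"],
)

# least/greatest sort the two cities within each row, so both directions
# produce the same (city_a, city_b) pair.
pairs = (flights
         .withColumn("city_a", F.least("origin", "destination"))
         .withColumn("city_b", F.greatest("origin", "destination")))

# For the sample data this yields a single row (Bangalore, Chennai) with count 3.
pairs.groupBy("city_a", "city_b").count().show()
```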

I have one doubt: why is toDF used here with column names mentioned? I thought toDF is used when an RDD is converted to a DataFrame. Could you please clarify my doubt?

sravankumar
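
For reference, toDF works in both situations; a small sketch (the sample data here is made up for illustration):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("todf-demo").getOrCreate()

# 1. RDD -> DataFrame: toDF assigns the given column names to the new DataFrame.
rdd = spark.sparkContext.parallelize([("Chennai", 10), ("Bangalore", 20)])
df_from_rdd = rdd.toDF(["city", "flights"])

# 2. Existing DataFrame: toDF returns a new DataFrame with the columns renamed.
df_renamed = df_from_rdd.toDF("origin_city", "flight_count")
df_renamed.printSchema()
```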