Apache Spark | Distinct Vs Drop Duplicates | Basic of Spark SQL | LearntoSpark

In this video, we will learn the difference between distinct and dropDuplicates in Apache Spark. We will discuss the advantage of one over the other and walk through an example.
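
As a rough illustration of that difference, here is a plain-Python model of the two behaviours. It is not Spark code: in PySpark the corresponding calls are `df.distinct()` and `df.dropDuplicates([...])`. The helper names `distinct_rows` and `drop_duplicates` and the sample rows are made up for this sketch. The model keeps the first duplicate it sees, which is a simplification, since Spark does not promise which of the duplicate rows survives.

```python
# Plain-Python model of distinct vs. dropDuplicates semantics (not Spark code).
# Rows are dicts; helper names and sample data are hypothetical.

def distinct_rows(rows):
    """Model of df.distinct(): rows are compared on ALL columns."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


def drop_duplicates(rows, subset=None):
    """Model of df.dropDuplicates(subset): rows are compared on the subset only.
    Without a subset it behaves like distinct_rows."""
    if subset is None:
        return distinct_rows(rows)
    seen, out = set(), []
    for row in rows:
        key = tuple(row[c] for c in subset)
        if key not in seen:
            seen.add(key)
            out.append(row)  # assumption: keep the first row seen for a key
    return out


rows = [
    {"id": 1, "name": "Ann", "city": "Pune"},
    {"id": 1, "name": "Ann", "city": "Pune"},   # exact duplicate
    {"id": 1, "name": "Ann", "city": "Delhi"},  # same id and name, other city
    {"id": 2, "name": "Bob", "city": "Pune"},
]

all_cols = distinct_rows(rows)                         # only the exact duplicate goes
by_key = drop_duplicates(rows, subset=["id", "name"])  # one row per (id, name)
print(len(all_cols), len(by_key))
```

The point of the sketch is that distinct can only compare whole rows, while dropDuplicates lets you choose which columns define a duplicate.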

Blog link to learn more on Spark:

Linkedin profile:

FB page:
Comments

Very well explained. It's not just about the syntax but also about how dropDuplicates works with a subset of columns.

rahuldey

Easy, crisp and very useful videos as concepts are explained through code.
Thank you Azarudeen. Hope you continue making videos with different technologies.

ritikas

Very helpful tutorials. Could you please share the notebook presented in the tutorial, if possible?

meditation_in_nature

Could you please explain how to pass a variable to the dropDuplicates function? The variable might hold composite keys. Can you give me some tips?

maheshk
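
On the composite-key question above: PySpark's `dropDuplicates` accepts a list of column names, so a variable holding that list can be passed directly. The sketch below shows one way to build and validate such a list from a comma-separated string. The helper `resolve_key_columns` and the column names are hypothetical, and the `columns` list stands in for `df.columns` so the snippet runs without a Spark session.

```python
# Hypothetical helper: build a validated list of key columns that can be
# passed to df.dropDuplicates(...). Column names are made up for the example.

def resolve_key_columns(keys, available_columns):
    """Accept "id, region" or ["id", "region"] and return a checked list."""
    if isinstance(keys, str):
        keys = [k.strip() for k in keys.split(",") if k.strip()]
    keys = list(keys)
    missing = [k for k in keys if k not in available_columns]
    if missing:
        raise ValueError(f"unknown key columns: {missing}")
    return keys


columns = ["id", "region", "amount"]  # stand-in for df.columns
key_cols = resolve_key_columns("id, region", columns)
print(key_cols)

# With a real DataFrame `df`, the list is passed as-is:
#     deduped = df.dropDuplicates(key_cols)
```

Validating the names up front is optional, but it gives a clearer error message than a failed column lookup inside Spark.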

Thanks for the tutorial, Azar. It really helped me understand the concepts. I am new to PySpark. When I tried the same scenario using the display function, it threw an error. Could you please explain the difference between show and display?

Balajionceagain

I read somewhere that dropDuplicates removes random rows. Is that correct? If so, is it a good approach?

ravikirantuduru
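
On the question above about which rows are removed: when dropDuplicates is given a subset of columns, Spark keeps one row per key but, as far as I know, does not guarantee which one. If the choice matters, a common alternative is to rank the rows within each key by an ordering column and keep the top one; in PySpark this is usually done with `row_number()` over a `Window.partitionBy(...).orderBy(...)`. The plain-Python sketch below models only that idea. The helper `keep_latest`, the column names, and the sample data are all assumptions for illustration.

```python
# Plain-Python model of "keep the latest row per key" (not Spark code).
# Helper name, column names, and sample data are hypothetical.

def keep_latest(rows, key_cols, order_col):
    """Return one row per key: the one with the largest order_col value.
    On ties, the row seen first is kept."""
    latest = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key not in latest or row[order_col] > latest[key][order_col]:
            latest[key] = row
    return list(latest.values())


rows = [
    {"id": 1, "region": "EU", "updated": "2023-01-05", "amount": 10},
    {"id": 1, "region": "EU", "updated": "2023-03-01", "amount": 12},
    {"id": 2, "region": "US", "updated": "2023-02-10", "amount": 7},
]

result = keep_latest(rows, ["id", "region"], "updated")
for row in result:
    print(row["id"], row["updated"], row["amount"])
```

Unlike a plain dropDuplicates on `id` and `region`, the surviving row here does not depend on the order in which the rows arrive, as long as the ordering column has no ties within a key.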