Identifying duplicates in a data frame | PySpark

#pyspark #bigdata #leetcode #sql #python #coding #mysql #interviewquestion #dataengineer #jupyternotebook #datascience
Comments

Could you do a video on how to ID the duplicates and then, based on the rank, flag them or put them into a new dataframe? Would it also be possible to merge the duplicate rows if needed? Thank you for the great video. I predict your audience will grow, so it's really cool that I can be here early.

fatgezimbela