How to use Ranking Functions in Apache Spark | RANK | DENSE_RANK | ROW_NUMBER

In this video, I explain how to use the ranking functions in Apache Spark. We will cover the functions listed below:

TIMESTAMPS
0:00 What is the RANK() function?
3:34 What is the DENSE_RANK() function?
5:27 What is the ROW_NUMBER() function?

RANK() function.
All of the ranking functions depend on the sort ordering specified by the ORDER BY clause of the associated window definition.
Rows that are not distinct in the ordering are called peers. RANK() gives every peer row the same rank and then skips the ranks those peers would otherwise have occupied. That is, if two rows tie at rank 2, the next rank listed is 4, not 3.
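
To make the gap after a tie concrete, here is a minimal sketch. It is not taken from the video: it uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for a Spark session, on the assumption that the same window syntax can be passed to spark.sql(...) against a registered table. The scores table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for Spark SQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
# Hypothetical sample rows: Ben and Chen are peers (same score).
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Asha", 95), ("Ben", 90), ("Chen", 90), ("Dina", 85)],
)

rank_query = """
    SELECT name, score,
           RANK() OVER (ORDER BY score DESC) AS rnk
    FROM scores
    ORDER BY rnk, name
"""
rank_rows = conn.execute(rank_query).fetchall()
for row in rank_rows:
    print(row)
# ('Asha', 95, 1)
# ('Ben', 90, 2)
# ('Chen', 90, 2)
# ('Dina', 85, 4)   <- rank 3 is skipped because two rows share rank 2
```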

DENSE_RANK() function.
The DENSE_RANK function is similar to the RANK function, but it does not skip ranks after a tie. Peer rows still share a rank, and the next distinct value gets the very next rank: if two rows tie at rank 2, the following row is ranked 3.
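
The difference from RANK is easiest to see with both functions side by side. The sketch below is an illustration, not the video's code: it again uses Python's sqlite3 module as a stand-in, assuming the same query would work through spark.sql(...), and the scores table is hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for Spark SQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
# Hypothetical sample rows with one tie at score 90.
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Asha", 95), ("Ben", 90), ("Chen", 90), ("Dina", 85)],
)

dense_query = """
    SELECT name,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
    FROM scores
    ORDER BY rnk, name
"""
dense_rows = conn.execute(dense_query).fetchall()
for row in dense_rows:
    print(row)
# ('Asha', 1, 1)
# ('Ben', 2, 2)
# ('Chen', 2, 2)
# ('Dina', 4, 3)   <- RANK leaves a gap, DENSE_RANK does not
```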

ROW_NUMBER() function.
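The ROW_NUMBER window function assigns the ordinal number of the current row within its partition, starting at 1. The ORDER BY expression in the OVER clause determines the numbering. Unlike RANK and DENSE_RANK, tied rows still receive different numbers, and the order among peers is not guaranteed unless the ORDER BY includes a tie-breaking column.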
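
A small sketch of per-partition numbering follows. As with any stand-in, treat it as an approximation: it runs on Python's sqlite3 module rather than Spark, assuming the same OVER (PARTITION BY ... ORDER BY ...) clause is accepted by spark.sql(...). The employees table, its columns, and its rows are hypothetical, and name is added to the ORDER BY only to break the salary tie deterministically.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for Spark SQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (dept TEXT, name TEXT, salary INTEGER)")
# Hypothetical sample rows: Ben and Chen tie on salary within 'eng'.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("eng", "Asha", 120),
        ("eng", "Ben", 100),
        ("eng", "Chen", 100),
        ("ops", "Dina", 90),
        ("ops", "Eli", 80),
    ],
)

row_number_query = """
    SELECT dept, name, salary,
           ROW_NUMBER() OVER (
               PARTITION BY dept            -- numbering restarts for each dept
               ORDER BY salary DESC, name   -- name breaks ties on salary
           ) AS rn
    FROM employees
    ORDER BY dept, rn
"""
numbered_rows = conn.execute(row_number_query).fetchall()
for row in numbered_rows:
    print(row)
# ('eng', 'Asha', 120, 1)
# ('eng', 'Ben', 100, 2)
# ('eng', 'Chen', 100, 3)   <- tied with Ben on salary, still gets its own number
# ('ops', 'Dina', 90, 1)
# ('ops', 'Eli', 80, 2)
```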

Download the sample data from our GitHub repository.