Spark SQL - Windowing Functions - Overview

Let us understand the functions related to aggregations, ranking, and windowing.

* We use these functions in the SELECT clause.
* Specification: function() OVER (PARTITION BY column [ORDER BY column])
* PARTITION BY is used to group the data based on a column.
* ORDER BY is used to sort the data based on a column.
* Example: rank() OVER (PARTITION BY department_id ORDER BY salary DESC)
* Aggregations – sum, avg, min, max, etc.
* Ranking – rank, dense_rank, row_number, etc.
* Windowing – lead, lag, etc.
* The window specification has clauses such as PARTITION BY and ORDER BY.
* For aggregations, we can define the group using PARTITION BY.
* For ranking or windowing, we use PARTITION BY followed by ORDER BY. PARTITION BY groups the data, and ORDER BY sorts the rows within each group so that a rank or an offset (lead, lag) can be assigned.
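
The three categories above can be sketched in one query. This is a minimal illustration, not the course's own example: the `employees` table, its rows, and the helper `run_window_query` are hypothetical, and Python's built-in SQLite (version 3.25 or later) stands in for Spark so that the sketch runs without a cluster. The window syntax used here is standard SQL, so the query text should carry over to Spark SQL, though that is not verified here.

```python
import sqlite3

# Hypothetical sample data: (employee_id, department_id, salary).
EMPLOYEES = [
    (1, 10, 5000),
    (2, 10, 7000),
    (3, 10, 7000),
    (4, 20, 4000),
    (5, 20, 6000),
]

QUERY = """
SELECT employee_id,
       department_id,
       salary,
       -- Aggregation: PARTITION BY alone defines the group.
       sum(salary)  OVER (PARTITION BY department_id) AS dept_total,
       -- Ranking: ORDER BY inside each partition decides the rank.
       rank()       OVER (PARTITION BY department_id ORDER BY salary DESC) AS rnk,
       dense_rank() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dense_rnk,
       -- Windowing: lag() reads the previous row in the same ordering.
       -- employee_id is added as a tie-breaker to keep the result deterministic.
       lag(salary)  OVER (PARTITION BY department_id
                          ORDER BY salary DESC, employee_id) AS prev_salary
FROM employees
ORDER BY department_id, salary DESC, employee_id
"""


def run_window_query():
    """Load the sample rows into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "employee_id INTEGER, department_id INTEGER, salary INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", EMPLOYEES)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_window_query():
        print(row)
```

In department 10 the two employees earning 7000 tie for first place: rank() gives the next employee 3, while dense_rank() gives 2. The aggregate column repeats the department total on every row instead of collapsing the rows, which is the main difference from GROUP BY.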
