Visualizing SQL Pattern Matching (MATCH_RECOGNIZE) - A Beginner's Guide

preview_player
Показать описание
Added in 12c, SQL pattern matching make it easy to search for sequences of different events across rows.

But how does it work?

Watch this presentation for a complete beginner's guide to match_recognize. With visualizations to show how it selects the rows, this session will teach you how to use this powerful clause.

Highlights include:

02:40 - An overview of match_recognize & its clauses
08:40 - Demo of using pattern matching with one variable to find consecutive 1km runs
17:00 - Comparing results when searching for different patterns of 1km runs
27:15 - Demo searching for two variables - 1km run followed by a 5km run
35:00 - Displaying overlapping patterns in the input
39:00 - Does pattern matching replace the where clause?
42:30 - Comparing different patterns outputs when using two variables
44:30 - Classifying rows into value ranges - the limitations of regular expressions
48:30 - Demo using pattern matching to group rows into range values and calculate totals
58:10 - Why use pattern matching - example use cases

Further reading:

Рекомендации по теме
Комментарии
Автор

Wow! The first 25 minutes you killed it with the explanation for the beginners and the rest rocks. But funny thing is I'm here for Snowflake but Oracle and Snowflake both has same syntax ( almost). Thank you so much

santhoshreddykesavareddy
Автор

Hi Chris, at time 17:06 that you put different conditions side by side, on the second example, NUM column is listed as 1 for "order by run_date ... pattern(one_km+)" but it must be 3 for rows 7, 10, 11 of march since three rows are consecutive and they matches to pattern "one_km+".

mustafakalayciDBA
Автор

To create / load the data mentioned in the video.

create table run (run_date date, time_in_s number, distance_in_km number);

insert into run values ('01-MAR-2023', 310, 1);
insert into run values ('02-MAR-2023', 1700, 5);
insert into run values ('03-MAR-2023', 319, 1);
insert into run values ('06-MAR-2023', 1572, 5);
insert into run values ('07-MAR-2023', 280, 1);
insert into run values ('10-MAR-2023', 287, 1);
insert into run values ('11-MAR-2023', 242, 1);
insert into run values ('13-MAR-2023', 1525, 5);
commit;

-- 3 or more 1 km runs.
select *
from run
match_recognize (
order by run_date
all rows per match
pattern (one_km{3, })
define
one_km as distance_in_km = 1
);

-- 1 km run followed by a 5 km run.
select *
from run
match_recognize (
order by run_date
all rows per match
pattern (one_km five_km)
define
one_km as distance_in_km = 1,
five_km as distance_in_km = 5
);

-- With the additional columns.
select *
from run
match_recognize (
order by run_date
measures
classifier () as var,
match_number() as grp,
final count(*) as num
all rows per match
pattern (one_km five_km)
define
one_km as distance_in_km = 1,
five_km as distance_in_km = 5
);

Автор

Hi again Chris, at time 43:25, at side by side comparison slide, second and third column output seems incorrect to me. "after match skip to next row" will start next pattern search at "next row of the previous matches first row" so, on second example, in the output after last row (13th of march) there should be a new matched rows group that start with row 10th of march. likewise on the third example (one_km five_km one_km pattern) there should be a second group starting with 3th of march because skip to next row will start search from the row of 2th of march. Am I wrong? outputs in the slides are look like an example of "after match skip past last row".

mustafakalayciDBA
Автор

The quality of the audio is abysmal - like I'm at the conference call by phone in the 90s in the middle of nowhere...
Oracle - multibillion dollar company - gets its billions by saving 50-100-200$ for a decent microphone?
Any homegrown streamer has audio quality 10 times better than that nowadays...

TaranovskiAlex
visit shbcf.ru