Polars vs Pandas | detailed test with explained results
Polars is one of the fastest-rising Python frameworks, and it beats Pandas in many performance tests. This video presents 8 distinct tests that show the differences between Pandas and Polars, measured as the duration in seconds of running specific functions on data.
I tested the following functions and scored the two frameworks on each:
- Test 1: read a single CSV file
- Tests 2 and 3: select columns from a loaded dataframe (two approaches).
- Test 4: Filtering data in a dataframe.
- Tests 5 and 6: Create a new column (two approaches).
- Test 7: Group and aggregate data.
- Test 8: Fill missing data.
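As a rough illustration of what the eight tests above measure, here is a minimal Pandas-only timing sketch. It is not the code from the video: the sample data, the column names, the `timed` helper, and the choice of which "two approaches" to use for selecting columns and creating a column are all assumptions made for this example.

```python
import io
import time

import pandas as pd

# Hypothetical sample data standing in for the CSV file used in the video.
CSV_TEXT = "city,price,qty\nA,1.5,2\nB,,3\nA,4.0,\nB,2.5,1\n"


def timed(label, func):
    """Run func once, print the elapsed time, and return (result, seconds)."""
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.6f} s")
    return result, elapsed


# Test 1: read a single CSV file (an in-memory buffer here).
df, _ = timed("read_csv", lambda: pd.read_csv(io.StringIO(CSV_TEXT)))

# Tests 2 and 3: select columns in two ways (assumed: [] and .loc).
sel_a, _ = timed("select []", lambda: df[["city", "price"]])
sel_b, _ = timed("select .loc", lambda: df.loc[:, ["city", "price"]])

# Test 4: filter rows.
filtered, _ = timed("filter", lambda: df[df["price"] > 2.0])


# Tests 5 and 6: create a new column in two ways (assumed: item
# assignment on a copy, and DataFrame.assign).
def add_column_by_assignment():
    out = df.copy()
    out["total"] = out["price"] * out["qty"]
    return out


new_a, _ = timed("new column (assignment)", add_column_by_assignment)
new_b, _ = timed(
    "new column (assign)",
    lambda: df.assign(total=df["price"] * df["qty"]),
)

# Test 7: group and aggregate.
grouped, _ = timed("groupby", lambda: df.groupby("city")["price"].mean())

# Test 8: fill missing data.
filled, _ = timed("fillna", lambda: df.fillna(0))
```

On a table this small the timings are dominated by overhead, so a meaningful comparison needs a much larger file and several repeated runs per test.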
I evaluated the competition in two groups:
1. Group where I did not use Lazy evaluation in Polars.
2. Group where I used Lazy evaluation in Polars.
From a high-level perspective, Polars represents data in memory as Arrow arrays, while Pandas represents it as NumPy arrays. Polars also offers a Lazy API, which can make it much faster, and I mention this several times in the video (Polars has both Eager and Lazy APIs, while Pandas offers only an Eager one).
The video covers the whole experiment as follows:
0:00 - Intro
1:08 - Introducing experiment Python code
12:05 - Run the experiment
18:04 - Experiment results (summary).
21:22 - Final test results.
Additional material:
#polars #pandas #experiment