Make Your Pandas Code Lightning Fast

Speed up slow pandas/Python code by 2500x using this simple trick. Face it, your pandas code is slow. Learn how to speed it up! In this video Rob discusses a key trick for making your code faster. Pandas is an essential tool for any Python programmer and data scientist. Depending on whether you use a loop, the pandas apply function, or vectorized functions, the speed difference can be significant. Write faster Python code.

Timeline
00:00 Intro
00:46 Creating our Data
02:39 The Problem
03:48 Coding Up the Problem
04:43 Level 1: Loop
06:29 Level 2: Apply
07:27 Level 3: Vectorized
09:31 Plot The Speed Comparison
10:23 Outro
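
The three levels named in the timeline can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the column names (`age`, `hours_active`), the rule in `is_flagged`, and the data size are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the column names and the rule below are illustrative
# assumptions, not the dataset or logic used in the video.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({
    "age": rng.integers(0, 100, size=5_000),
    "hours_active": rng.integers(0, 24, size=5_000),
})


def is_flagged(row):
    """Scalar rule applied to one row at a time."""
    return row["age"] > 65 or row["hours_active"] < 2


# Level 1: explicit Python loop over rows.
loop_values = []
for _, row in df.iterrows():
    loop_values.append(is_flagged(row))
loop_result = pd.Series(loop_values, index=df.index)

# Level 2: DataFrame.apply still calls the Python function once per row.
apply_result = df.apply(is_flagged, axis=1)

# Level 3: vectorized, one expression evaluated over whole columns.
vectorized_result = (df["age"] > 65) | (df["hours_active"] < 2)
```

All three produce the same boolean values. The vectorized version typically runs fastest because the per-row work happens in compiled NumPy code rather than in the Python interpreter; the actual speedup depends on the data and the rule.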

#python #code #datascience #pandas
Comments

Whoa... a 3500 times difference. Vectorised is even faster than apply; I will give it a try next time for sure. Awesome video as always.

hasijasanskar

Also, a way to speed it up is to use the words 'and' and 'or' instead of & and |. These words are made for boolean expressions and thus work faster. & and | are bitwise operators and are made for integers. Using them will force Python to make the booleans an integer, do the bitwise operation, and then cast the result back to a boolean. This doesn't take much time if you do it once, but in a test scenario inspired by this video it was roughly 45% slower.

kip
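
A caveat worth sketching: `and` / `or` only work on scalar values, for example inside a loop body or a function passed to apply. On whole pandas Series, element-wise logic needs `&` / `|`. The values and variable names below are hypothetical and chosen only to show the difference.

```python
import pandas as pd

# Hypothetical values, chosen only to illustrate the operators.
age, hours = 70, 5

# Scalars: `and` / `or` work and short-circuit (the right-hand side is
# skipped once the left-hand side decides the result).
scalar_result = age > 65 or hours < 2

s_age = pd.Series([70, 30, 50])
s_hours = pd.Series([5, 1, 8])

# Series: element-wise logic needs `|` / `&`, with parentheses around
# each comparison because these operators bind tighter than `>` and `<`.
mask = (s_age > 65) | (s_hours < 2)

# Using `or` on Series raises ValueError, because the truth value of a
# multi-element Series is ambiguous.
try:
    (s_age > 65) or (s_hours < 2)
    raised = False
except ValueError:
    raised = True
```

So the tip can help in the loop and apply versions, but the vectorized version has to keep `&` and `|`.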

That is unbelievable. Astounding time difference. I was recently watching a presentation on a candlestick algorithm, and the presenter used the vectorised method and I was confused (I am new to Python), but this video made it all clear. Fantastic presentation.

Zenoandturtle

As I work with pandas and large datasets, I often come across code that uses iterrows. Most developers just don't care about time, or come from programming backgrounds that keep them from using efficient methods. I wish more people used vectorization.

deepakramani

This is spot on. I had a filter running that was going to take 2 days to complete on a 12M line CSV file using iteration - clearly not good. Now it takes 6 seconds.

nathanielbonini
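
The kind of change described here, replacing a row-by-row filter with a vectorized one, might look like the sketch below. The commenter's actual data and filter are not shown, so the tiny DataFrame, the column names, and the condition are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data standing in for a large CSV; the column names and
# thresholds are assumptions for illustration.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Pune"],
    "sales": [120, 80, 45, 200],
})

# Slow pattern: test each row in Python and collect matching index labels.
keep = [idx for idx, row in df.iterrows()
        if row["city"] == "Oslo" and row["sales"] > 100]
filtered_loop = df.loc[keep]

# Vectorized pattern: build one boolean mask and index with it.
mask = (df["city"] == "Oslo") & (df["sales"] > 100)
filtered_mask = df[mask]
```

Both approaches select the same rows; on millions of rows the mask version avoids creating a Python object per row.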

I didn’t realize you could write 10k as 10_000. I work with astronomical units, so this makes my variables more readable. Great video!

jti
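
For reference, a short sketch of the underscore syntax mentioned above. The variable names are made up for illustration.

```python
# Underscores in numeric literals (Python 3.6+, PEP 515) are ignored by
# the interpreter and only aid readability.
n_rows = 10_000
au_in_km = 149_597_870.7  # one astronomical unit in kilometres

print(n_rows == 10000)   # True
print(f"{n_rows:_}")     # 10_000
print(f"{n_rows:,}")     # 10,000
```

The same separators are available in format specs, which helps when printing large values.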

Haha coincidentally I'd been raving about vectorized to my friends the last few months. It's soo good. The moment I saw your title I figured you're probably talking about vectorize too haha. Awesome video and great content!!

LimitlesslyUnlimited

Man where have you been all my Python-Life!?!? Thank you so much for this! Outstanding!!!

robertjordan

My man made a df out of the time diff to plot them!! Really useful video. Will definitely keep this in mind from now.

nirbhay_raghav
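
Collecting timings into a DataFrame, as mentioned above, could be done along these lines. This is a sketch under assumptions: the toy data, the two functions, and the use of `timeit` are illustrative and not taken from the video.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data and operation, for illustration only.
df = pd.DataFrame({"x": np.arange(2_000)})


def with_apply():
    return df["x"].apply(lambda v: v * 2)


def vectorized():
    return df["x"] * 2


# Measure each approach at run time; nothing here is a quoted benchmark.
timings = {
    "apply": timeit.timeit(with_apply, number=20),
    "vectorized": timeit.timeit(vectorized, number=20),
}

# One row per method, ready for comparison or plotting.
results = pd.DataFrame(
    {"method": list(timings), "seconds": list(timings.values())}
).set_index("method")
print(results)
```

With matplotlib installed, `results.plot(kind="bar")` would draw the comparison as a bar chart.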

Hey man, nice video! Kudos from reddit!

alexandremachado

That's another awesome video... extremely useful in real-world work. Thanks again, Rob.

FilippoGronchi

Thanks for the great video! I have a project with some calculations. They take some minutes through the loops. I'm going to use the vectorized way, so I'll write another comment with a comparison later. Some days later... I rewrote a significant part of my code. I made it vectorized, and I got fantastic results. One example: old code - 1m 3s, new code - 6s. One more: old code - 14m 58s, new code - 11s. Awesome!

i-Mik

I'm over here as a newbie data scientist, copying the logic step-by-step in order to have good coding habits in the future lmao. Thanks for the video, really valuable!

OktatOnline

Hey, I just thought I'd mention that I really appreciate that you use really huge test datasets, since a lot of the time the test datasets used in tutorials are quite small and don't show how code will scale. This video does it perfectly, though!

craftydoeseverything

Great video! I wish I had known not to loop over my array for my machine learning project... going to go improve my code now!

gabriel-mckee

Somehow I came across the vectorized method right at the beginning of my Python and pandas journey. Thanks for sharing your experience, lightning fast!

OPPACHblu_channel

Wow amazing. Please keep making more videos like this.

hussamcheema

Wow! That's an excellent way to speed up the code.

anoopbhagat

Awesome video man! Appreciate the tips, I'll definitely be subscribing!

Vonbucko

There are several videos on pandas vectorization. This is the best.

colmduffy