pandas best practices (3/10): Comparing groups

This video covers the following topics: filtering a DataFrame, value counts, normalization, groupby.

Comments

These videos are absolutely fantastic. Everything is explained in a way that is extendable to countless other problems. Hate to be that guy, but I have to jump on the bandwagon and mention that I hope you keep more videos coming! Even the way the video series is titled has helped me organize in my head how I think about approaching a problem using pandas. Thank you so much for the time you put into this.

kevinz

super helpful Kevin, always love watching your tutorials, keep them coming!

brentskoumal

Thank you Kevin! All your videos are super helpful👍

Kristina_Tsoy

Thank you. Good lecture, best delivery

trungthanh

Love the unstack at the end! That is one of my favorite commands.

WaylonWalker
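For readers who have not seen `unstack` yet, here is a minimal sketch of the pattern mentioned above. The tiny `ri` DataFrame is made up for illustration (it only borrows the column names `driver_gender` and `violation` used elsewhere in this thread); it is not the dataset from the video.

```python
import pandas as pd

# Made-up stand-in for the `ri` DataFrame (illustrative values only).
ri = pd.DataFrame({
    "driver_gender": ["M", "M", "M", "F", "F", "F"],
    "violation": ["Speeding", "Speeding", "Equipment",
                  "Speeding", "Equipment", "Equipment"],
})

# Share of each violation within each gender.
# The result is a Series indexed by (driver_gender, violation).
by_gender = ri.groupby("driver_gender").violation.value_counts(normalize=True)

# unstack() moves the inner index level (violation) into columns,
# giving one row per gender and one column per violation.
table = by_gender.unstack()
print(table)
```

The unstacked table is usually easier to compare across groups than the long two-level Series.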

Nice videos!!
Is there any way to show the normalized and the absolute results at the same time?

joseluisvallespardo
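One possible way to show both results side by side is to compute the two `value_counts` outputs and join them with `pd.concat`. This is only a sketch: the small `ri` DataFrame and the column labels `count` and `share` are made up for illustration.

```python
import pandas as pd

# Made-up stand-in for the `ri` DataFrame (illustrative values only).
ri = pd.DataFrame({
    "driver_gender": ["M", "M", "M", "F", "F", "F"],
    "violation": ["Speeding", "Speeding", "Equipment",
                  "Speeding", "Equipment", "Equipment"],
})

# Gender of every driver stopped for speeding.
speeding = ri[ri.violation == "Speeding"].driver_gender

# Put absolute counts and normalized shares next to each other.
# `keys` supplies the column names of the combined DataFrame.
summary = pd.concat(
    [speeding.value_counts(), speeding.value_counts(normalize=True)],
    axis=1,
    keys=["count", "share"],
)
print(summary)
```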

When I run this: ri[ri.violation == 'speeding'].driver_gender.value_counts(), the output is: Series([], Name: driver_gender, dtype: int64)

crigar
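A likely cause of the empty result is letter case: string comparison in pandas is case-sensitive, and other comments in this thread spell the value as 'Speeding'. The sketch below assumes that spelling and uses a made-up `ri` DataFrame, so check the actual labels in your own data first.

```python
import pandas as pd

# Made-up stand-in for the `ri` DataFrame (illustrative values only).
ri = pd.DataFrame({
    "driver_gender": ["M", "M", "M", "F", "F", "F"],
    "violation": ["Speeding", "Speeding", "Equipment",
                  "Speeding", "Equipment", "Equipment"],
})

# Lowercase 'speeding' matches no rows, so value_counts() is empty.
empty = ri[ri.violation == "speeding"].driver_gender.value_counts()

# Inspect the labels that are really in the column.
print(ri.violation.unique())

# Matching the exact spelling returns the expected counts.
matched = ri[ri.violation == "Speeding"].driver_gender.value_counts()

# Alternatively, compare case-insensitively.
loose = ri[ri.violation.str.lower() == "speeding"].driver_gender.value_counts()
print(matched)
```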

Another way to do the same, hope it is Pythonic:
pd.crosstab(ri.driver_gender, ri.violation).Speeding

soobrazil
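To make the crosstab suggestion concrete, here is a small sketch on a made-up `ri` DataFrame (illustrative values only). `pd.crosstab` also accepts `normalize="columns"`, which gives the share of each gender within every violation.

```python
import pandas as pd

# Made-up stand-in for the `ri` DataFrame (illustrative values only).
ri = pd.DataFrame({
    "driver_gender": ["M", "M", "M", "F", "F", "F"],
    "violation": ["Speeding", "Speeding", "Equipment",
                  "Speeding", "Equipment", "Equipment"],
})

# Rows are genders, columns are violations, cells are counts.
ct = pd.crosstab(ri.driver_gender, ri.violation)

# Selecting one column gives the per-gender counts for that violation.
speeding_counts = ct.Speeding

# normalize="columns" divides each column by its total.
speeding_shares = pd.crosstab(
    ri.driver_gender, ri.violation, normalize="columns"
).Speeding
print(speeding_counts)
print(speeding_shares)
```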

I was thinking of ri[(ri.driver_gender == 'M') & (ri.violation == 'Speeding')]. But is it possible to get a number instead of a DataFrame by adding some method?

alexlfo
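Yes, a few options return a plain count. The sketch below uses a made-up `ri` DataFrame (illustrative values only): summing the boolean mask counts the `True` values, and `len` or `.shape[0]` count the rows of the filtered DataFrame.

```python
import pandas as pd

# Made-up stand-in for the `ri` DataFrame (illustrative values only).
ri = pd.DataFrame({
    "driver_gender": ["M", "M", "M", "F", "F", "F"],
    "violation": ["Speeding", "Speeding", "Equipment",
                  "Speeding", "Equipment", "Equipment"],
})

# Boolean mask: True for male drivers stopped for speeding.
mask = (ri.driver_gender == "M") & (ri.violation == "Speeding")

# True counts as 1, so the sum of the mask is the number of matches.
n_sum = mask.sum()

# Equivalent: count the rows of the filtered DataFrame.
n_len = len(ri[mask])
n_shape = ri[mask].shape[0]
print(n_sum, n_len, n_shape)
```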


and its output is an empty column. What could be the problem?

akshayakn

Grouping by gender will do the trick too

sarikadatta
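One way the groupby idea could look, sketched on a made-up `ri` DataFrame (illustrative values only): group a boolean "is speeding" Series by gender, then take the sum for counts or the mean for shares. Note that the mean here is the share of speeding stops within each gender, which is a different question from the share of each gender among speeding stops.

```python
import pandas as pd

# Made-up stand-in for the `ri` DataFrame (illustrative values only).
ri = pd.DataFrame({
    "driver_gender": ["M", "M", "M", "F", "F", "F"],
    "violation": ["Speeding", "Speeding", "Equipment",
                  "Speeding", "Equipment", "Equipment"],
})

# True where the stop was for speeding.
is_speeding = ri.violation == "Speeding"

# Number of speeding stops per gender.
speeding_by_gender = is_speeding.groupby(ri.driver_gender).sum()

# Share of each gender's stops that were for speeding.
speeding_rate = is_speeding.groupby(ri.driver_gender).mean()
print(speeding_by_gender)
print(speeding_rate)
```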

If we want to drop columns having more than 30% of NA (or more than 100 NA)

lenahudson
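A possible approach to the question above, sketched on a made-up DataFrame `df`: `isna().mean()` gives the fraction of missing values per column and `isna().sum()` gives the count, and either can be used as a column mask. The thresholds are the ones from the comment, except that the count threshold is scaled down to 3 (a hypothetical value) because the example only has 10 rows.

```python
import numpy as np
import pandas as pd

# Made-up data: column "c" is 40% missing, "b" is 10% missing.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "b": [1, np.nan, 3, 4, 5, 6, 7, 8, 9, 10],
    "c": [np.nan, np.nan, np.nan, np.nan, 5, 6, 7, 8, 9, 10],
})

# Fraction of missing values in each column.
na_share = df.isna().mean()

# Keep only columns with at most 30% missing values.
kept_by_share = df.loc[:, na_share <= 0.30]

# Same idea with an absolute count (hypothetical threshold of 3 here).
max_na = 3
kept_by_count = df.loc[:, df.isna().sum() <= max_na]
print(kept_by_share.columns.tolist())
```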