Method chaining in Pandas

What is method chaining in Pandas? How does it work, and how can you use it?

In this video, I introduce method chaining, including the use of .loc, lambda, and assign. If you've been confused by any of these topics, you should get a better understanding from here.
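As a rough illustration of the three pieces mentioned above (assign, lambda, and .loc) working together in one chain, here is a minimal sketch. The data is made up, and the trip_distance column name is an assumption; only total_amount, passenger_count, and amount_per_mile appear in the discussion below.

```python
import pandas as pd

# Hypothetical sample data; trip_distance is an assumed column name.
df = pd.DataFrame({
    "total_amount": [120.0, 15.0, 200.0, 8.0],
    "trip_distance": [10.0, 5.0, 50.0, 1.0],
    "passenger_count": [6, 1, 2, 1],
})

result = (
    df
    # assign returns a new DataFrame with the extra column; df itself is unchanged.
    .assign(amount_per_mile=lambda df_: df_["total_amount"] / df_["trip_distance"])
    # The lambda receives the DataFrame as it exists at this point in the chain,
    # so it can see the amount_per_mile column created one step earlier.
    .loc[lambda df_: df_["amount_per_mile"] > 5]
    .sort_values("amount_per_mile", ascending=False)
)

print(result)
```

Each step returns a new DataFrame, which is what lets the next method be called directly on it.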
Comments

I come back to this video every few months to refresh my memory. It's gold dust.

joaquinurruti

Instead of lambda functions, I find .query a lot more readable:

(
df
.assign(...)
.query('amount_per_mile > 5')
)

jingangmiao
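A runnable version of the .query suggestion above might look like the following sketch. The data and the trip_distance column are assumptions made for illustration; the .loc variant is included only to show that both filters select the same rows here.

```python
import pandas as pd

# Hypothetical sample data; trip_distance is an assumed column name.
df = pd.DataFrame({
    "total_amount": [120.0, 15.0, 200.0, 8.0],
    "trip_distance": [10.0, 5.0, 50.0, 1.0],
})

with_query = (
    df
    .assign(amount_per_mile=lambda df_: df_["total_amount"] / df_["trip_distance"])
    # query takes the condition as a string evaluated against the column names.
    .query("amount_per_mile > 5")
)

with_loc = (
    df
    .assign(amount_per_mile=lambda df_: df_["total_amount"] / df_["trip_distance"])
    .loc[lambda df_: df_["amount_per_mile"] > 5]
)

print(with_query.equals(with_loc))
```

One trade-off to keep in mind: the condition passed to query is a string, so editors and linters generally cannot check the column names inside it.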

Great content! This concept is especially useful for people coming to Python from R and the tidyverse syntax, with its pipe operator (%>%) for chaining functions. About filtering rows: how do you feel about using query() instead of loc?

VelkoKamenov

Love the way you teach it and the examples.

taqial-shamiri

Total newbie to Pandas here. I like this approach, but it seems to me that where it breaks the flow is when you need to transform the data and keep it, like filling the NAs or deleting some columns. I'll try to figure it out on my own and learn, but I really like the video.

VladPalacios
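On the question of filling NAs and dropping columns: one possible way to keep such transformations inside a chain is sketched below, assuming the goal is a cleaned copy stored under a new name. The data and the notes column are hypothetical.

```python
import pandas as pd

# Hypothetical data with missing values; "notes" is an assumed column to drop.
df = pd.DataFrame({
    "total_amount": [120.0, None, 200.0],
    "passenger_count": [6, 1, None],
    "notes": ["a", "b", "c"],
})

cleaned = (
    df
    # fillna accepts a dict mapping column names to fill values.
    .fillna({"total_amount": 0, "passenger_count": 1})
    .drop(columns=["notes"])
    # With the NaN gone, the column can be converted to an integer dtype.
    .astype({"passenger_count": "int64"})
)

print(cleaned)
```

The transformed data is kept by assigning the result of the whole chain to a name; the original df still contains its missing values.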

The one thing I felt was missing when coming to Python from an R background! Thank you!

vigneshkrishnan

Same here. I got converted by Matt Harrison a few years back and never looked back!

horoshuhin

Great video. I think chaining is great and I use it, but it sometimes gets overused, which makes the code unreadable with zillions of lambda functions.

Lnd

I love method chaining as well, but do you know if the copy_on_write changes in pandas 3.0 will make it more difficult to use? Maybe I've just misunderstood the explanation 😅

imothar

Hi,
I'd like to ask you two questions. Where can I contact you?

fnrxgpt

With all due respect to Harrison, his version of Pandas method chaining is overdone, IMO. Combined with his love for lambdas, it becomes even harder to read.

Too little whitespace, too many lines of code crammed together. It could just be his style of writing code, which is perfectly fine for his use case. If you look at the profiles of those who complained about this style, many of them write in much more difficult languages than Python, but they still complained about the readability!

pietraderdetective

Great one as always, Reuven!

Apart from the readability of method chaining, is it better to use .loc to find total_amount > 100 and passenger_count > 5?
I usually do it like this: df[(df['total_amount'] > 100) & (df['passenger_count'] > 5)].
I would also compute amount_per_mile that way, inside [] without assign. Why shouldn't I do it that way?

RafsanSiddiqui
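To make the comparison in the question above concrete, here is a small sketch with made-up data (trip_distance is an assumed column name). On the original DataFrame, plain boolean indexing and .loc with a lambda select the same rows. The difference appears once the filter depends on a column created earlier in the same chain, because that column does not exist on df.

```python
import pandas as pd

# Hypothetical sample data; trip_distance is an assumed column name.
df = pd.DataFrame({
    "total_amount": [120.0, 15.0, 200.0, 8.0],
    "trip_distance": [10.0, 5.0, 50.0, 1.0],
    "passenger_count": [6, 1, 2, 1],
})

# Plain boolean indexing, as in the comment: the mask is built from df itself.
direct = df[(df["total_amount"] > 100) & (df["passenger_count"] > 5)]

# The same filter written for use inside a chain.
chained = df.loc[
    lambda df_: (df_["total_amount"] > 100) & (df_["passenger_count"] > 5)
]
print(direct.equals(chained))

# Here the mask must be built from the intermediate result, since
# df["amount_per_mile"] would raise a KeyError on the original df.
with_rate = (
    df
    .assign(amount_per_mile=lambda df_: df_["total_amount"] / df_["trip_distance"])
    .loc[lambda df_: df_["amount_per_mile"] > 5]
)
print(with_rate)
```

So the bracket style is not wrong; it just refers to a named variable, which stops matching the data as soon as earlier steps in the chain add columns or remove rows.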