Session 19 - GroupBy Object in Pandas | Data Science Mentorship Program (DSMP) 2022-23

Data Science Mentorship Program (DSMP) 2022-23

-------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------
Datasets used in the session -

-------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------
| Time stamp |
-----------------------
0:00 Start
2:44 Website Update
5:33 Session Start
8:15 GroupBy
14:20 # Applying built-in aggregation functions on groupby objects

21:02 # find the top 3 genres by total earning
25:49 # find the genre with highest avg IMDB rating
27:23 # find director with most popularity
29:32 # find the highest rated movie of each genre
31:49 # find number of movies done by each actor
35:00 Doubts

# GroupBy Attributes and Methods
37:34 # find total number of groups -- len
39:01 # find items in each group -- size
40:36 # first()/last() / nth item
43:04 # get_group / vs filtering
45:35 # groups attribute
47:20 # describe / # sample / # nunique
52:55 Doubt Clearance

55:17 # agg method - passing dict
59:28 # agg method - passing list
1:02:26 Doubts

1:03:00 # looping on groups
1:07:24 # find the highest rated movie of each genre
1:11:20 Doubts

1:13:04 # apply -- builtin function
1:15:37 # find number of movies starting with A for each group
1:21:40 # find ranking of each movie in the group according to IMDB score
1:25:38 # find normalized IMDB rating group wise
1:30:57 Doubts

1:32:45 # groupby on multiple cols
1:35:47 # find the most earning actor -- director combo
1:37:26 # find the best (in terms of average metascore) actor -- genre combo
1:40:01 # agg on multiple groupby
1:42:17 Doubts

1:43:52 IPL Dataset
1:46:30 # find the top 10 batsmen in terms of runs
1:49:42 # find the batsman with max no of sixes
1:52:30 # find batsman with most number of 4's and 6's in last 5 overs
1:56:12 Doubts
1:57:09 # find V Kohli's record against all teams
2:00:40 # Create a function that can return the highest score of any batsman

2:05:00 Doubts
Comments
Author

Thanks, sir, for providing the best content ❤❤❤❤

ayushmansingh
Author

Groupby works with any data type.
Categorical columns can reduce memory use and speed up groupby operations, especially when the column has only a few distinct values.

TheKumarAshwin
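
A minimal sketch of the point in the comment above, using made-up toy data rather than the session's dataset (the `Gross` column name and all values are assumptions). It compares the memory footprint of a plain text key with the same key stored as a categorical, and checks that both give the same grouped totals. Actual speed-ups depend on the data, so no timing is claimed here.

```python
import pandas as pd

# Hypothetical toy data standing in for the session's movie dataset.
df = pd.DataFrame({
    "Genre": ["Drama", "Action", "Drama", "Comedy", "Action", "Drama"] * 1000,
    "Gross": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0] * 1000,
})

# Same data, but the grouping key is stored as a categorical.
df_cat = df.assign(Genre=df["Genre"].astype("category"))

# A categorical stores each distinct label once, plus small integer codes.
plain_bytes = df["Genre"].memory_usage(deep=True)
category_bytes = df_cat["Genre"].memory_usage(deep=True)

# observed=True keeps only the categories that actually occur in the data.
by_plain = df.groupby("Genre")["Gross"].sum()
by_category = df_cat.groupby("Genre", observed=True)["Gross"].sum()

print(plain_bytes, category_bytes)
print(by_category)
```

Passing `observed=True` explicitly avoids relying on the default, which has differed between pandas versions.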
Author

22:53 If you look closely at the sorted data, you can see that the sort_values method is not working, so these are not the top genres.

TheKumarAshwin
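
The cause of the issue reported above cannot be confirmed from this page. As a hedged sketch of the task itself (top 3 genres by total earning), the snippet below uses toy data with an assumed numeric `Gross` column. Two general points are worth checking in such cases: `sort_values` returns a new object rather than sorting in place, and the column must be numeric for the order to be numeric.

```python
import pandas as pd

# Hypothetical toy data; "Gross" is an assumed column name.
movies = pd.DataFrame({
    "Genre": ["Drama", "Action", "Drama", "Comedy", "Action", "Crime"],
    "Gross": [100.0, 250.0, 50.0, 80.0, 300.0, 120.0],
})

# Total earning per genre.
total_by_genre = movies.groupby("Genre")["Gross"].sum()

# sort_values returns a new Series, so keep the result in a variable.
top3 = total_by_genre.sort_values(ascending=False).head(3)

# nlargest is an equivalent shortcut for "sort descending, take n".
top3_alt = total_by_genre.nlargest(3)

print(top3)
```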
Author

Please provide the solution to this session's task.

Mtashqain
Author

18:44 Hi Nitish, correct me if I'm wrong here: is the director of the film 300 Zack Snyder or Abhishek Chubey? Check the whole row; it is completely irrelevant. I think you should revise it. ⚠⚠

TheKumarAshwin
Author

@14:46 genres.sum() is also including the categorical columns. Is there some update?

idkwhat
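
On the question above: to the best of my knowledge, pandas 2.0 changed the default of `numeric_only` to `False` for groupby reductions such as `sum()`, so non-numeric columns are no longer dropped silently. A minimal sketch of two ways to restrict the aggregation to numeric columns, using toy data (the `Gross` column and all values are assumptions):

```python
import pandas as pd

# Hypothetical toy data with one text column and two numeric columns.
movies = pd.DataFrame({
    "Genre": ["Drama", "Action", "Drama"],
    "Series_Title": ["A", "B", "C"],
    "IMDB_Rating": [9.0, 8.0, 8.5],
    "Gross": [100.0, 250.0, 50.0],
})

genres = movies.groupby("Genre")

# Option 1: ask for numeric columns only.
numeric_totals = genres.sum(numeric_only=True)

# Option 2: select the columns to aggregate explicitly.
selected_totals = genres[["IMDB_Rating", "Gross"]].sum()

print(numeric_totals)
```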
Author

Amazing session, sir. I'm from Pakistan and I'm watching your videos. Can I buy your course from Pakistan?

kashifchaudhary
Author

Hello, good morning sir and CampusX team. I want to ask: if I am not a paid student, can I access the Python practice question set and the interview question set? If not, and I have paid only on YouTube and not on the CampusX website, can I get the question sets in that case?

anandtalware
Author

31:20

Nitish, I don't think you covered all the concepts here. The answer would be:

result = imdb.groupby('Genre')[['Series_Title', 'IMDB_Rating']].sum().sort_values(by='IMDB_Rating', ascending=False)

This returns Series_Title concatenated. To avoid that, I did this:

imdb[imdb['Genre'].isin(result.index)][['Genre', 'Series_Title', 'IMDB_Rating']].head()

This might be a bit complicated: isin(result.index) checks whether each element in the "Genre" column is present in the index of result.

TheKumarAshwin
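
For the task discussed in the comment above (highest rated movie of each genre), one common alternative is sketched below. It is not necessarily the method used in the session. It uses `idxmax` to find the row label of the best rating within each genre and then looks those rows up, so the title and rating always come from the same row. The toy titles and ratings are made up.

```python
import pandas as pd

# Hypothetical toy data using the column names quoted in the comment.
imdb = pd.DataFrame({
    "Series_Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Genre": ["Drama", "Action", "Drama", "Action"],
    "IMDB_Rating": [9.3, 8.4, 8.8, 9.0],
})

# Row label of the maximum rating within each genre
# (the first such row if several movies tie).
best_rows = imdb.groupby("Genre")["IMDB_Rating"].idxmax()

# Pull the complete rows for those labels.
top_per_genre = imdb.loc[best_rows, ["Genre", "Series_Title", "IMDB_Rating"]]

print(top_per_genre)
```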
Author

'DataFrame' object has no attribute 'append'. How can I do this?

ankitsaurabh_
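
On the error quoted above: `DataFrame.append` was removed in pandas 2.0, and `pd.concat` is the usual replacement. The sketch below assumes the error came up while looping over groups to collect one result per genre (an assumption, since the comment does not say). It gathers the pieces in a plain Python list and concatenates once at the end. The toy data is made up.

```python
import pandas as pd

# Hypothetical toy data using column names quoted elsewhere in the thread.
imdb = pd.DataFrame({
    "Series_Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Genre": ["Drama", "Action", "Drama", "Action"],
    "IMDB_Rating": [9.3, 8.4, 8.8, 9.0],
})

pieces = []
for genre, group in imdb.groupby("Genre"):
    top_rating = group["IMDB_Rating"].max()
    # list.append (not DataFrame.append): keep the top-rated row(s) of this group.
    pieces.append(group[group["IMDB_Rating"] == top_rating])

# One concat at the end instead of growing a DataFrame inside the loop.
result = pd.concat(pieces, ignore_index=True)

print(result)
```

Concatenating once is also cheaper than repeated appends, because each append used to copy the whole frame.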
Author

Where is the CSV file? Please tell me.

virendratrivedi
Author

imdb.groupby('Genre').max().sort_values(["IMDB_Rating"], ascending=False) is the query I used to find the highest rated movie of each genre. Is it right or not?

KhushbuSharma-lxyn
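
A small sketch related to the question above, on made-up data: `groupby(...).max()` takes the maximum of each column independently, so the title it returns is the alphabetically last one in the genre, not necessarily the title of the highest rated movie. Sorting by rating and keeping the first row per genre is one way to keep whole rows together.

```python
import pandas as pd

# Hypothetical toy data chosen so that the two approaches disagree.
imdb = pd.DataFrame({
    "Series_Title": ["Alpha", "Zulu", "Bravo", "Yankee"],
    "Genre": ["Drama", "Drama", "Action", "Action"],
    "IMDB_Rating": [9.3, 7.0, 8.4, 6.5],
})

# Column-wise maximum: title and rating can come from different rows.
col_max = imdb.groupby("Genre").max()

# Row-wise alternative: sort by rating, keep the first row seen per genre.
best = (
    imdb.sort_values("IMDB_Rating", ascending=False)
        .drop_duplicates("Genre")
)

print(col_max)
print(best)
```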