Filtering Data That Satisfy Conditions | #13 of 53: The Complete Pandas Course

--------------------

You know how to select specific rows and columns from a given data frame. Now let's see how to filter rows and columns that satisfy one or more conditions. For example, you might want to keep only the rows where the account length is greater than a certain number. Or you might want to keep only the columns whose names start with a certain character. Whatever the rule is, you can easily do that.

To filter rows, all you need to do is create what is called a row Boolean mask. A Boolean mask is nothing but a vector, or a series, that contains only True or False. A row Boolean mask should have as many items as there are rows in the data frame.
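As a minimal sketch of a row Boolean mask, assume a small made-up data frame `df` with an `account length` column. The data and column names here are illustrative assumptions, not the course dataset.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "account length": [128, 87, 137, 45],
    "day calls": [110, 123, 114, 71],
})

# Comparing a column with a number yields a Series of True/False values.
row_mask = df["account length"] > 100

print(row_mask.tolist())         # one True/False per row
print(len(row_mask) == len(df))  # the mask has as many items as there are rows
```

The mask is just a series of True and False values, one for each row of `df`.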

Likewise, a column Boolean mask should have as many items as there are columns in your data frame. Now, say you want to select all the rows where account length is greater than 100. To do that, you need to create a Boolean mask that has the value True in all the places where the account length is greater than 100.

In places where it is not, it should have the value False. Once you have created such a Boolean mask, you pass it in as the data frame's first argument, the row argument. Likewise, for the columns, you pass in your column Boolean mask. Once you do this, all the rows that have True will be selected, and all the columns that have True will be selected.
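Here is a rough sketch of passing both masks at once. It assumes the `.loc` indexer is the mechanism being described (the transcript does not name it) and uses a made-up data frame with hypothetical column names.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "account length": [128, 87, 137, 45],
    "day calls": [110, 123, 114, 71],
    "state": ["KS", "OH", "NJ", "OK"],
})

# Row mask: one True/False per row (4 items).
row_mask = df["account length"] > 100

# Column mask: one True/False per column (3 items), written by hand here.
col_mask = [True, False, True]

# Rows go in the first position, columns in the second.
subset = df.loc[row_mask, col_mask]
print(subset)
```

Only the rows and columns marked True survive, so `subset` keeps two rows and the `account length` and `state` columns.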

That's how you do conditional filtering, basically. So let's work this out. For instance, say you want to create a new data frame that contains only the rows where account length is greater than 100. This is how you create the Boolean mask.

df account length greater than 100 is going to give you a series that looks like this: all the places where the account length is greater than 100 will have True. Likewise, say you want to extract all the columns whose names start with a D. You can access this str attribute. We will look at it in more detail later on in the course.
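One way the column case might look, as a sketch: the column names below are invented for illustration, and `df.columns.str.startswith` is used to build the column Boolean mask.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "Account length": [128, 87, 137],
    "Day mins": [265.1, 161.6, 243.4],
    "Day calls": [110, 123, 114],
    "Eve mins": [197.4, 195.5, 121.2],
})

# One True/False per column name: True where the name starts with "D".
col_mask = df.columns.str.startswith("D")

# ":" keeps every row; the mask picks the columns.
d_columns = df.loc[:, col_mask]
print(d_columns.columns.tolist())
```

Note that `startswith` is case-sensitive, so a column named `day mins` would not match `"D"`.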

But understand that for text columns, there will be an str attribute. Under that attribute, you will find the basic Python string methods.
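To make the str attribute a little more concrete, here is a small sketch on a made-up text series. The values are hypothetical; `upper` and `startswith` are two of the string methods that pandas exposes under `.str`.

```python
import pandas as pd

# Hypothetical text column for illustration.
states = pd.Series(["ks", "oh", "nj"])

# String methods under .str are applied to every element.
upper = states.str.upper()

# Methods that return True/False produce a Boolean mask directly.
starts_with_o = states.str.startswith("o")

print(upper.tolist())
print(starts_with_o.tolist())
```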