Query and Eval | #15 of 53: The Complete Pandas Course
--------------------
We have seen various ways of filtering the desired rows and columns of a data frame, and of creating new columns as well: the .iloc, .loc, .at and .iat indexers. Using .iloc and .loc, you can create new columns as well as filter rows. Now, besides these approaches, pandas also provides a .query() method.
And there is an .eval() method. Using both of these, you can extract the desired rows from your data frame. All right, so let's see how to do that. Inside the query function, you write a SQL-like query as a string. When you use a name in it, such as State here, it refers to a particular column of the data frame. Based on the condition you write as a string, pandas will internally parse it and give you back the rows that satisfy that condition.
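As a minimal sketch of the basic form, assuming a small made-up data frame (the values and the column names State, Day Calls and Day Mins are illustrative assumptions, not the data used in the video):

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "State": ["KS", "OH", "NJ", "OH"],
    "Day Calls": [110, 123, 114, 71],
    "Day Mins": [265.1, 161.6, 243.4, 299.4],
})

# The condition is a string; the bare name State refers to the column.
ohio = df.query("State == 'OH'")

# The same filter written with a Boolean mask and .loc, for comparison.
ohio_loc = df.loc[df["State"] == "OH"]
```

Both forms should select the same rows; the query string is simply parsed by pandas into an equivalent row filter.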
In a similar manner, you can combine more than one condition. Here the first condition is joined to the second with and, so both conditions must be true for a row to be selected. Now, notice that this particular column name has a space in it. Whenever a column name contains a space, you need to put that column name within a pair of backticks. The backtick is the character found above the Tab key on most keyboards.
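A short sketch of two conditions joined with and, where one column name contains a space. The data and column names are again made up for illustration:

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "State": ["KS", "OH", "NJ", "OH"],
    "Day Calls": [110, 123, 114, 71],
})

# `Day Calls` has a space in its name, so it is wrapped in backticks.
# A row is kept only if both conditions are true.
busy_ohio = df.query("State == 'OH' and `Day Calls` > 100")
```

Only the second row satisfies both conditions here; the other OH row is dropped because its Day Calls value is below the threshold.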
Right, so you need to take care of that. Another aspect is that sometimes you want to refer to a variable that is not part of the data frame. Suppose I want to write a new query that refers to a variable in my Python environment, one that I'm creating here. To use this variable inside the query, you prefix its name with the @ symbol. Then pandas will know that the name refers to that variable and not to a column.
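A sketch of referring to a Python variable from inside a query. The variable name min_day_calls and the data are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "State": ["KS", "OH", "NJ", "OH"],
    "Day Calls": [110, 123, 114, 71],
})

# An ordinary Python variable that is not a column of df (hypothetical name).
min_day_calls = 100

# The @ prefix tells pandas to look the name up in the Python environment.
enough_calls = df.query("`Day Calls` >= @min_day_calls")
```

Without the @ prefix, pandas would try to resolve min_day_calls as a column name and fail.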
So just two things: use backticks when a column name has a space, and use the @ symbol to refer to a variable that is not part of your data frame. Let's run all of this and see the output. Here, all the rows contain one of the two states given in the condition. Likewise, this also works. And this also works. Now, about the time taken to run the query: this is not a rigorous benchmark or anything, but in most cases I have noticed that it takes slightly less time than the equivalent .loc call. It's not always the case. The point is, this is not a slower way of writing your query.
All right, it is about as fast as .loc or .iloc. The second thing is the .eval function. The difference between query and eval is that query always returns the rows that qualify under the condition you have written, whereas eval evaluates the statement you have given. So if you had used query instead of eval here, query would return the qualifying rows, but eval evaluates the expression and gives you a Boolean mask. It is basically a yes or no for each and every row. You can then use this Boolean mask to do further filtering, like this. All right, so that's what eval does.
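The difference can be sketched by passing the same condition string to both methods. The data and column names are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "Day Calls": [110, 123, 114, 71],
    "Day Mins": [265.1, 161.6, 243.4, 299.4],
})

condition = "`Day Calls` > 100 and `Day Mins` > 200"

# eval evaluates the expression and returns one True/False value per row.
mask = df.eval(condition)

# The mask can then be used for further filtering.
filtered = df[mask]

# query applies the same condition and returns the qualifying rows directly.
queried = df.query(condition)
```

Here filtered and queried should hold the same rows; eval just stops one step earlier and hands back the mask.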
That is one use, creating a Boolean mask. You can also use eval to create a new column. Inside .eval you write an assignment statement: here I'm creating a new column called minutes per call, with an expression on the right-hand side. pandas evaluates the string and stores the result in the minutes per call column.
And if you want that change to persist in the data frame, use inplace=True. So let's run this and see the result. If you look at the output of mask, it is a Boolean mask of True and False values. Likewise, let's evaluate this. It creates a new column called minutes per call, which is added at the end of the data frame.
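A sketch of creating a column with eval, with and without inplace=True. The data, the source column names and the target name minutes_per_call are assumptions; the target is written without spaces so that it is a plain identifier on the left-hand side of the assignment:

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "Day Calls": [110, 123, 114, 71],
    "Day Mins": [265.1, 161.6, 243.4, 299.4],
})

expression = "minutes_per_call = `Day Mins` / `Day Calls`"

# Without inplace, eval returns a new data frame and leaves df untouched.
with_column = df.eval(expression)
columns_before = list(df.columns)

# With inplace=True, the new column is added to df itself, at the end.
df.eval(expression, inplace=True)
```

After the second call, df carries the new column as its last column, matching what with_column already had.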