Python Tutorial: Creating a DatetimeIndex

---
In the last exercise, you fixed the data type of the is_arrested column. Now, we're going to build a DatetimeIndex for our DataFrame.
Let's take a look at the head of the dataset again. As you can see, the date and time of each traffic stop are stored in separate columns, both of which are object columns.
Because we'll be using stop_date and stop_time in our analysis, we're going to combine these two columns into a single column and then convert it to pandas' datetime format. This will be beneficial because unlike object columns, datetime columns provide date-based attributes that will make our analysis easier.
Let's see an example of this using the Apple stock price DataFrame from the previous video. Date and time are stored in separate columns, so the first task is to combine these two columns using a string method.
As you might remember from previous courses, string methods, such as replace(), are Series methods available via the str accessor. In this example, we're replacing the forward slash in the date column with a dash. It outputs a new Series in which the string replacement has been made, though this change is temporary since we haven't saved the new Series.
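Here is a minimal sketch of that replacement. It assumes a small stand-in for the Apple DataFrame; the variable name apple, the column names, and the values are invented for illustration and are not taken from the video.

```python
import pandas as pd

# Hypothetical stand-in for the Apple stock price DataFrame
apple = pd.DataFrame({
    "date": ["02/13/2018", "02/14/2018", "02/15/2018"],
    "time": ["16:00", "16:00", "16:00"],
    "price": [164.34, 167.37, 172.99],
})

# str.replace() returns a new Series with each "/" swapped for "-"
replaced = apple["date"].str.replace("/", "-", regex=False)
print(replaced)

# The original column is untouched because the result was not assigned back
print(apple["date"])
```

To make the change permanent, the result would have to be assigned back to the column, for example apple["date"] = replaced.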
To combine the columns, we're going to use the str.cat() method, which is short for concatenate. We'll concatenate the date column with the time column, and tell pandas to separate them with a space, storing the result in a Series object named combined.
You can see that the combined Series contains both the date and time. It's still an object column, but it's now ready for conversion to datetime format.
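The concatenation step might look like the sketch below. It uses a hypothetical stand-in DataFrame (the name apple, its columns, and its values are assumptions for illustration); only str.cat() and the name combined come from the narration.

```python
import pandas as pd

# Hypothetical stand-in for the Apple stock price DataFrame
apple = pd.DataFrame({
    "date": ["02/13/2018", "02/14/2018", "02/15/2018"],
    "time": ["16:00", "16:00", "16:00"],
    "price": [164.34, 167.37, 172.99],
})

# Join each date string to its time string, separated by a single space
combined = apple["date"].str.cat(apple["time"], sep=" ")
print(combined)
```

At this point combined still holds plain strings, so none of the date-based attributes are available yet.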
To convert the combined Series to datetime format, you simply pass it to the to_datetime() function, and store the result in a new column. We didn't even need to specify that the original data was in month-day-year format; pandas inferred it from the strings.
Looking at the updated DataFrame, you can see that the new column contains both the date and time, and that it is stored in a more standard way. From the dtypes attribute, you can see that the data type of the new column is datetime, instead of object.
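The conversion can be sketched as follows, again on a hypothetical stand-in DataFrame. The column name date_and_time is an assumption, since the narration does not name the new column.

```python
import pandas as pd

# Hypothetical stand-in for the Apple stock price DataFrame
apple = pd.DataFrame({
    "date": ["02/13/2018", "02/14/2018", "02/15/2018"],
    "time": ["16:00", "16:00", "16:00"],
    "price": [164.34, 167.37, 172.99],
})

combined = apple["date"].str.cat(apple["time"], sep=" ")

# Convert the strings to datetime values and store them in a new column
# (the column name "date_and_time" is assumed for this example)
apple["date_and_time"] = pd.to_datetime(combined)

# dtypes shows the new column as a datetime type rather than a string type
print(apple.dtypes)

# Date-based attributes are now available through the dt accessor
print(apple["date_and_time"].dt.month)
```

When the format is ambiguous, for example when every day value is 12 or lower, it is safer to pass an explicit format argument such as format="%m/%d/%Y %H:%M" instead of relying on inference.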
One final step that we'll take is to set the datetime column as the index. That will make it easier to filter the DataFrame by date, plot the data by date, and so on. We'll use the set_index() method, and specify that the operation should occur in place to avoid an assignment statement.
You can see that the default index has been replaced with the datetime column. And the index is now a special type called DatetimeIndex.
As a reminder, when an existing column becomes the index, it is no longer considered to be one of the DataFrame columns.
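Setting the index could be sketched like this. The DataFrame and the column name date_and_time are hypothetical stand-ins; the set_index() call with inplace=True follows the narration.

```python
import pandas as pd

# Hypothetical stand-in for the Apple stock price DataFrame
apple = pd.DataFrame({
    "date": ["02/13/2018", "02/14/2018", "02/15/2018"],
    "time": ["16:00", "16:00", "16:00"],
    "price": [164.34, 167.37, 172.99],
})
combined = apple["date"].str.cat(apple["time"], sep=" ")
apple["date_and_time"] = pd.to_datetime(combined)

# Replace the default integer index with the datetime column, in place
apple.set_index("date_and_time", inplace=True)

# The index is now a DatetimeIndex, and the column has left apple.columns
print(type(apple.index))
print(apple.columns)

# With a DatetimeIndex, rows can be selected by a date string
one_day = apple.loc["2018-02-14"]
print(one_day)
```

An equivalent form without inplace is apple = apple.set_index("date_and_time"), which returns a new DataFrame and needs the assignment.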
Now that you've seen how to create a DatetimeIndex for the Apple DataFrame, you can practice these steps on our dataset of traffic stops. This is the final step before we begin our analysis of the dataset in the next chapter.
#PythonTutorial #proper #data #Analyzing #pandas