Python Tutorial: How to remove unwanted characters in a data frame

This video covers how to cleanse data by removing unwanted characters from a data frame, so that the data is clean when it reaches Excel. We walk through a simple scenario: a dataset that should contain nothing but numbers, into which we deliberately include invalid characters that need to be removed. First, we identify the errors using a regular expression. Then, based on a list of predefined mistakes, we fix the errors found and output the cleansed numbers to a new column.
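The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the names `KNOWN_MISTAKES`, `find_errors`, `fix_errors` and `cleanse_frame`, the column names, and the sample characters are all assumptions made for the example.

```python
import re

# Assumed list of known invalid characters; the video's actual list is not reproduced here.
KNOWN_MISTAKES = ["$", "#", "@", "!"]

# In a numbers-only column, any non-digit character counts as an error.
NON_DIGIT = re.compile(r"\D")


def find_errors(value):
    """Return the non-digit characters found in value, in order of appearance."""
    return NON_DIGIT.findall(str(value))


def fix_errors(value):
    """Remove only the characters listed in KNOWN_MISTAKES."""
    cleaned = str(value)
    for ch in KNOWN_MISTAKES:
        cleaned = cleaned.replace(ch, "")
    return cleaned


def cleanse_frame(values):
    """Build a data frame with the raw values, the errors found, and a cleansed column."""
    # Imported here so the two helpers above can be used without pandas installed.
    import pandas as pd

    df = pd.DataFrame({"numbers": values})
    df["errors"] = df["numbers"].apply(find_errors)
    df["cleansed"] = df["numbers"].apply(fix_errors)
    return df
```

Calling `cleanse_frame(["123", "4$56", "78#9"])` would give a frame with the original values, the characters flagged by the regular expression, and a new `cleansed` column. Characters that are flagged but not in the predefined list are left in place, so they remain visible for review.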

⏲⏲⏲TIMESTAMPS⏲⏲⏲
Beginning 00:00
Intro and overview 00:01
Create the dataset 01:56
Create the dataframe 03:34
Create a function to find errors 04:01
Review of output and fixes 11:31

Blog posting:

################ Let's be Social! ##################

#dataanalytics #python #dataanalyticsireland
Comments

All right, learning Python now, so that got me subbed :)

VagabondTurtle

Hi mate, thanks for the video. Question.
1. What is the relationship between the regex and the list variable 'l'?
2. In relation to creating the list you designated as 'l', if we want to filter backslashes, adding a backslash seems to cause some parsing abnormalities. How do we solve that? To recreate the problem, try substituting a backslash for one of those symbols; you should see the colours of the commas or square brackets out of order.

DavidsonLoops
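
On the backslash question above, here is a minimal sketch of one likely cause and fix. The contents of the list `l`, the `pattern` built from it, and the helper `strip_unwanted` are assumptions for illustration, since the video's code is not reproduced here. In a normal Python string literal, a lone backslash before the closing quote escapes that quote, so the string never ends and the editor's colouring of the following commas and brackets goes wrong. Writing the backslash as `"\\"` avoids this.

```python
import re

# Hypothetical list of unwanted characters; the video's actual list `l` is not shown here.
# "\\" is a single backslash character. Writing "\" instead would escape the
# closing quote and leave the string unterminated.
l = ["$", "#", "\\"]

# One possible way to relate a list to a regex: build a character class from it.
# re.escape keeps characters such as the backslash literal inside the pattern.
pattern = re.compile("[" + "".join(re.escape(ch) for ch in l) + "]")


def strip_unwanted(text):
    """Remove every character in `l` from text."""
    return pattern.sub("", text)
```

With this sketch, `len(l[2])` is 1, which confirms that the doubled backslash in the source is stored as a single character.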