#26. Regular Expressions - 2: Walkthrough example in Python | Tutorial

The video walks through an example of extracting data from a text string using regular expressions (regex) in Python.

Timeline & Data
(Python 3.7)

00:00 - Welcome
00:10 - Outline of video
00:48 - Open Jupyter Notebook
01:10 - Data
01:39 - Get text
04:00 - Remove blank spaces
05:50 - Begin work on company information
06:48 - Use regex to extract: names
08:15 - Use regex to extract: email
10:28 - Use regex to extract: domain
11:49 - Use regex to extract: phone
14:02 - Use regex to extract: sapling counts
15:59 - Use regex to extract: volunteer counts
17:40 - Create DataFrame to combine information about company
20:20 - Begin work on forest information
20:35 - Use regex to split multiline string into multiple sentences
21:37 - Add multiple sentences to a DataFrame
24:00 - Use regex to extract: billion $
27:00 - Use regex to extract: reasons
29:02 - Use regex to extract: birds
30:01 - Use regex to extract: animals
30:33 - Use regex to extract: years
32:58 - Use regex to extract: market size
34:25 - Create new DataFrame for annual data
36:05 - Extract and add animals to DataFrame
39:20 - Extract and add birds to DataFrame
41:03 - Ending notes
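
The forest-information steps in the timeline (cleaning whitespace, splitting into sentences, pulling out billion-dollar figures and years) can be sketched roughly as follows. This is a minimal illustration using only the standard `re` module, not necessarily the code shown in the video. The variable names (`raw`, `text`, `rows`) and the extra whitespace in `raw` are assumptions made for the example; the sentences themselves are taken from the sample text in this description.

```python
import re

# Two sentences from the sample text; the irregular whitespace is added
# here on purpose (an assumption) to show the clean-up step.
raw = (
    "The estimated global market for wood has increased from $60 billion "
    "in the year 2005 to \n$300 billion in the year 2020.   Save forests!  "
)

# Collapse any run of whitespace (spaces, newlines) into a single space.
text = re.sub(r"\s+", " ", raw).strip()

# Split after sentence-ending punctuation that is followed by whitespace.
sentences = re.split(r"(?<=[.!?])\s+", text)

# Capture each dollar amount together with the year that follows it.
pairs = re.findall(r"\$(\d+) billion in the year (\d{4})", text)

# One row per year, in a shape that could be passed to a table structure
# such as a pandas DataFrame.
rows = [{"year": int(y), "market_billion_usd": int(b)} for b, y in pairs]

print(sentences)
print(rows)
```

Capturing the amount and the year in a single pattern keeps each pair together, so the result does not depend on two separate lists happening to line up.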


#########
# Data
#########
s = '\ [***Note***: This is a made up text to explain Regular Expressions in Python. It is NOT real.]\
This year in 2020, the rainfall in this region of the forest is 2% more than the rest the forest as compared to last year. \
And the primary reason we think is because of increased tree plantations since the year 2005. \
Most notably the organizations such as "Save Forest", "Save Planet", "Sun & Rain" have \
made an important contribution. Annually each of these organizations have planted 1000000, 500000 and 200000 \
saplings. And about 25% of those have now grown into tall magnificient trees. The survival rate\
of such saplings is lower because of hot and dry summer temperatures that are upwards of 113 deg F (or 45 deg C).\
There were 500, 245 and 793 volunteers from each of \
organizations. The new forest canopy also provides a lush green habitat to support wildlife.\
We can now see species of birds such as parrots increase in numbers fro 500 to almost 2000.\
Few species of animals such as monkeys have also grown in population from 200 to around 400.\
This is all very encouraging, however there still lies one problem. The rest of the forest\
outside this 1000 sq km has a rocky terrain. Most of the soil was washed out by rain water\
because there was no vegetation to hold it in place. The lack of vegetation in those area is\
likely a consequence of rapid deforestation. Over past 5 years, more than 1 trillion trees have been cut down \
in this forest for various reasons such as global demand for wood, clearing land for large farmlands, and \
consequently reduced rainfall. The estimated global market for wood has increased from $60 billion in the year 2005 to \
$300 billion in the year 2020. Therefore, such incresed efforts to plant more trees to save forests and also meet\
increased demand in global market are needed. Save forests! \
Contact us: \
'
##############
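
As a rough illustration of the company-information steps (names, sapling counts, volunteer counts), here is a small sketch that works on an excerpt of the text above. It is one possible approach under stated assumptions, not necessarily the one used in the video: the helper `counts_before` and the `records` layout are hypothetical names introduced for this example, and the excerpt has its line continuations collapsed into single spaces.

```python
import re

# Excerpt of the sample text (line continuations collapsed into single spaces).
text = (
    'Most notably the organizations such as "Save Forest", "Save Planet", '
    '"Sun & Rain" have made an important contribution. Annually each of these '
    "organizations have planted 1000000, 500000 and 200000 saplings. "
    "There were 500, 245 and 793 volunteers from each of organizations."
)

# Organization names: whatever sits between a pair of double quotes.
names = re.findall(r'"([^"]+)"', text)


def counts_before(keyword, source):
    """Return the integers in an 'A, B and C <keyword>' phrase (hypothetical helper)."""
    pattern = r"((?:\d+(?:, | and ))+\d+) " + re.escape(keyword)
    match = re.search(pattern, source)
    if match is None:
        return []
    return [int(n) for n in re.findall(r"\d+", match.group(1))]


saplings = counts_before("saplings", text)
volunteers = counts_before("volunteers", text)

# Combine the three lists position by position, one record per organization.
records = [
    {"name": n, "saplings": s, "volunteers": v}
    for n, s, v in zip(names, saplings, volunteers)
]

for record in records:
    print(record)
```

Anchoring each number list to the word that follows it ("saplings", "volunteers") avoids picking up unrelated numbers such as years or percentages elsewhere in the text.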

Resource:
You may also want to check out this site for practicing regex:
Comments

Good video! Do you know how I can create a blank space between the pieces of information in a pandas column? I have something like:
I need to put a space between the company, employee name, email, and so on.

Borzacchinni