Web Scraping Football Matches From The EPL With Python [part 1 of 2]

In this video, we'll learn how to scrape football match data from the English Premier League.

We'll download all of the matches for several seasons using Python and the requests library. We'll then parse and clean the data using BeautifulSoup and pandas. By the end, we'll have a single pandas DataFrame with all of the EPL matches for multiple seasons.

In the next part of this series, we'll use the data we scraped to predict which side will win each match.

Chapters

00:00 Introduction
01:21 Scraping our first page with requests
05:07 Parsing HTML links with BeautifulSoup
10:40 Extract match stats using pandas and requests
14:21 Get match shooting stats with requests and pandas
18:09 Cleaning and merging scraped data with pandas
22:07 Scraping data for multiple seasons and teams with a loop
35:42 Final match results DataFrame and next steps

Comments

Outstanding tutorial with concise explanations for each line of code! Great for both beginners and advanced pandas users.

nikolavladimirov

Love your teaching style. Thanks for this content!

jonathanchagolla

I've always wanted to work on a football project since it's my favorite sport, and this is a good starting point. Love your pace as well 🙏🏽.

mementomori

Nice explanations, Vikas! The combination requests + Beautiful Soup + pandas is fantastic! Thanks! Greetings from São Paulo, Brazil!

JoaoSantos-jbul

Thank you, Vikas Paruchuri... this video saved me... Greetings from Pakistan... The teaching style is very good!!!

tifk

Something I did that may be useful for other people: I added a comment before every line/block to tell future me what I was doing.

Great video!

everflores

This is a great tutorial. I tried following along, but instead of team stats I tried extracting player stats for the season. I fell at the last hurdle, the loop, but I'm going to give it another go this evening. Great content, thank you.

samcrowson

Really really enjoy your content. Love the examples. Love the teaching style. Love the explanations.

imfrshlikeuhh

Tip: When web scraping, assign the HTML code to a variable or copy it to a notepad as a text file before the site you're working with kicks you out for exceeding max requests.
Learned this the hard way lol 🥴

waves
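
The tip above can be sketched as a small on-disk cache, so each page is downloaded at most once. This is only a minimal sketch and not code from the video: `cached_html`, the `html_cache` folder, and the `fetch` argument are hypothetical names. In practice `fetch` could be something like `lambda u: requests.get(u).text`.

```python
import hashlib
from pathlib import Path
from typing import Callable

CACHE_DIR = Path("html_cache")  # hypothetical folder name


def cached_html(url: str, fetch: Callable[[str], str], cache_dir: Path = CACHE_DIR) -> str:
    """Return the HTML for url, downloading it only if no saved copy exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it becomes a safe, fixed-length file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = cache_dir / name
    if path.exists():
        # Reuse the saved copy instead of hitting the site again.
        return path.read_text(encoding="utf-8")
    html = fetch(url)
    path.write_text(html, encoding="utf-8")
    return html
```

Re-running a notebook cell that calls `cached_html` then reads from disk rather than sending another request.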

I took a different approach: I started with the earliest year I wanted and followed the 'next season' link, so the DataFrame ends up in chronological order. Otherwise the latest season (August to May) is read first and the previous season follows it, so the seasons run backward.

RubensBarrichello.
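
A different way to get the same chronological order, instead of changing the crawl direction, is to sort the combined DataFrame by date after scraping. The sketch below assumes each season's frame has a `date` column; the frames, team name, and dates are made-up placeholders, not scraped data.

```python
import pandas as pd

# Placeholder frames standing in for two scraped seasons, collected
# newest season first (as happens when following "previous season" links).
newer_season = pd.DataFrame({"date": ["2021-08-14", "2022-05-22"], "team": ["Team A", "Team A"]})
older_season = pd.DataFrame({"date": ["2020-09-12", "2021-05-23"], "team": ["Team A", "Team A"]})

# Stack the seasons into one frame, then order every match by date.
matches = pd.concat([newer_season, older_season], ignore_index=True)
matches["date"] = pd.to_datetime(matches["date"])
matches = matches.sort_values("date").reset_index(drop=True)
```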

This was very useful! Thank you. I also had issues with the Premier League data, so I scraped La Liga instead, which worked fine. Will now attempt to follow the second part!

benjaminhorn

Regarding the 'standings_table = IndexError: list index out of range' error: FBref limits scraping by blocking users who send more than one request every three seconds, so I think it is important to use the time.sleep function. If you get this error (like me), I believe you just have to wait some time. I will update if this works.

TrixsterProductions
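
The `time.sleep` suggestion above could look roughly like this. It is a minimal sketch: the three-second figure comes from the comment and is not verified here, and `fetch_all` and its `fetch` argument are hypothetical names (for example, `fetch` might wrap `requests.get(url).text`).

```python
import time
from typing import Callable, Iterable, List


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 3.0,  # assumed limit, taken from the comment
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch each URL in turn, pausing between consecutive requests."""
    pages = []
    for i, url in enumerate(urls):
        if i > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        pages.append(fetch(url))
    return pages
```

Passing `sleep` as a parameter is just a convenience so the pause can be replaced in a test; calling `fetch_all(urls, fetch)` uses the real `time.sleep`.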

Wonderful teaching, wonderful project, so easy to access the knowledge, THANK YOU!!!😊

migi

Thank you so much! I've been putting off scraping data online forever. Finally did it, thanks to you.

DamilolaAyodele-wqsu

Adding l = links at 8:22 saved the day! Thanks for the video!

rodneymawero

Really enjoyed this walkthrough! Thank you for sharing!

bencole

You are a good teacher, clear and precise, and I wish you all the success in the world. Thanks for the info.

kevinr

All your videos have helped me a lot.
Thank you very much for your videos, I learn a lot.
Thank you for this content that you upload 😊

principeabel

Thanks for the motivation. I wasn't sure if I could do it, but I might try it eventually.

tamzen

This is really awesome. I learnt a lot.
I'm having issues scraping multiple years, though. Something about the remote host cutting off the connection.

simmiesanya
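
One common way to cope with a connection that gets cut off mid-scrape is to retry with a growing pause between attempts. Whether this resolves the specific error described above is an assumption. `fetch_with_retries` and its parameters are hypothetical names. The sketch catches `OSError`, which covers Python's built-in `ConnectionResetError` and, since the requests exceptions derive from it, errors raised by `requests.get` as well.

```python
import time
from typing import Callable


def fetch_with_retries(
    url: str,
    fetch: Callable[[str], str],
    max_attempts: int = 4,
    base_delay: float = 5.0,  # assumed starting pause, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url), retrying with exponential backoff on connection errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except OSError:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the original error.
                raise
            # Pause for base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Combining this with a pause between pages keeps the retry loop from adding to the request rate that may have triggered the disconnect in the first place.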