How to Easily Scrape Websites with Python and Beautiful Soup (Web Scraping with Python)

In this tutorial, we're going to learn Beautiful Soup (the easiest web scraping library in Python) by working through a basic project: scraping movie transcripts. First, we'll install Beautiful Soup and Requests, then learn how to get the HTML from a website, how to scrape a single page, and how to export the result to a txt file. If you liked this introductory course on Beautiful Soup, you'll love my web scraping course in Python.

--------------------
Content:
0:00 Intro
0:30 Install Beautiful Soup and Requests
3:46 How to get the HTML from a website
8:45 How to scrape a single page
19:41 Exporting data to a txt file
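
The steps listed above (get the HTML, scrape a single page, export to a txt file) might look roughly like the sketch below. It is not the code from the video: the tag names, the class names `main-article` and `full-script`, and the sample HTML are hypothetical placeholders, and the built-in `html.parser` is used here only to avoid an extra dependency. The sample is parsed from an inline string so the sketch runs without network access; `fetch_html` shows where Requests would come in.

```python
from pathlib import Path

import requests
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Download a page with Requests and return its HTML as text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_transcript(html: str) -> tuple[str, str]:
    """Return (title, transcript) extracted from one page's HTML.

    The tags and class names are assumptions made for this sketch.
    """
    soup = BeautifulSoup(html, "html.parser")
    article = soup.find("article", class_="main-article")
    title = article.find("h1").get_text(strip=True)
    transcript = article.find("div", class_="full-script").get_text(
        strip=True, separator=" "
    )
    return title, transcript


def export_txt(title: str, transcript: str, folder: str = ".") -> Path:
    """Write the transcript to '<title>.txt' inside the given folder."""
    path = Path(folder) / f"{title}.txt"
    path.write_text(transcript, encoding="utf-8")
    return path


# Hypothetical page content standing in for fetch_html(<some url>)
SAMPLE_HTML = """
<html><body>
<article class="main-article">
  <h1>Sample Movie</h1>
  <div class="full-script">First line.<br>Second line.</div>
</article>
</body></html>
"""

title, transcript = parse_transcript(SAMPLE_HTML)
print(title)
print(transcript)
```

For a real page, `parse_transcript(fetch_html(url))` followed by `export_txt(title, transcript)` would chain the three steps, once the selectors are adjusted to the site's actual markup.
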
Comments

Great example. Just one thought. You start with a known named page. For me, it would be more useful to use the web site index and loop through all pages.

brucewernick
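
One way the idea in this comment could be sketched: collect the links from an index page, then loop over them. The index URL, the `scripts-list` class, and the sample HTML are hypothetical and do not come from the video; a real site would need its own selectors.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

INDEX_URL = "https://example.com/movies/"  # hypothetical index page

# Hypothetical index markup standing in for the downloaded page
INDEX_HTML = """
<ul class="scripts-list">
  <li><a href="/movie/first-film">First Film</a></li>
  <li><a href="/movie/second-film">Second Film</a></li>
</ul>
"""


def collect_links(index_html: str, base_url: str) -> list[str]:
    """Return absolute URLs for every link inside the index list."""
    soup = BeautifulSoup(index_html, "html.parser")
    container = soup.find("ul", class_="scripts-list")
    return [
        urljoin(base_url, a["href"])  # resolve relative hrefs against the index URL
        for a in container.find_all("a", href=True)
    ]


links = collect_links(INDEX_HTML, INDEX_URL)
for link in links:
    # In a real run, each page would be downloaded and scraped here
    print(link)
```
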

Thank you Frank for your beautiful lesson

David

This is a well-planned guide, Frank! Well done. Can I ask you to create a video on scraping multiple Google search queries, taking the first result link from each query? I can't find anything like this anywhere. Then you could also write the results to a DF / CSV.

davidwisemantel

Thanks Frank, that is a great video! However, it did not work for me at first. I needed to add 'encoding="utf-8"' in 'with open(f'{title}.txt', 'w', encoding="utf-8") as file:' to be able to write to the file. Not sure why ;)

frankbedon

Thank you for this tutorial, I learned a lot.

aliyanpops

Thanks, Frank, for the Cheat Sheet PDF. I really need it to learn Python in depth.

Sweet-tooth

OMG haha, I love the fact that you have the same video in English and Spanish.

DianaMSosa-oilo

Which scraping video do you recommend watching first, this one or the previous one? Thanks!

denisquant

It seems the Python Cheat Sheet on Web Scraping is no longer available!!

georgeabuya

Hi, thanks for the video. Can you suggest an approach to follow for learning Data Science?

UTTAMKUMAR-orhq

why are you parsing the website with xml?

averied

How can I rewrite this data before I upload it to my website? Please record a video on how to rewrite the data after scraping it.

gameplay

Please start with how to install Python.

wedeyforyou

Hey team - I had a few errors when running the code:

Traceback (most recent call last):
  File "/titanic.py", line 18, in <module>
    file.write(transcript)
  File "cp1252.py", line 19, in encode
    return codecs.charmap_encode(input, self.errors, encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufb02' in position 31207: character maps to <undefined>

informationdominance
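
The traceback above shows the file being written with the cp1252 codec, which has no mapping for '\ufb02' (the "fl" ligature). A minimal sketch of the failure and of the `encoding="utf-8"` fix mentioned in an earlier comment follows; the sample text and the file name are made up for illustration.

```python
import tempfile
from pathlib import Path

# Sample text containing U+FB02, the character named in the traceback
text = "The ship \ufb02oated."

# cp1252 cannot represent U+FB02, so encoding with it raises an error
try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# Passing encoding="utf-8" explicitly avoids relying on the platform default
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "transcript.txt"  # hypothetical file name
    with open(path, "w", encoding="utf-8") as file:
        file.write(text)
    round_trip = path.read_text(encoding="utf-8")

print(cp1252_ok)
print(round_trip == text)
```

When `open` is called without an `encoding` argument, Python falls back to a locale-dependent default, which is why the same script can work on one machine and fail on another.
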