Python Web Scraping Tutorial: scraping dynamic JavaScript/AJAX websites with BeautifulSoup

This Python web scraping tutorial is about scraping dynamic websites whose content is rendered by JavaScript.
For this tutorial I used the Steam Store as an example, because it is a heavily JavaScript/AJAX-driven website with dynamic content.
To scrape the Steam Store with Python I used only the Requests and BeautifulSoup (bs4) libraries, and then exported the scraped data to a CSV file.

This tutorial is a detailed explanation, aimed at absolute beginners, of how to scrape JavaScript-driven pages and websites with Python and the BeautifulSoup library.

To install BeautifulSoup, Requests and lxml:
pip install bs4 requests lxml
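As a rough sketch of the setup (not the video's exact source code), the get_html() helper mentioned in the timecodes might look like this; the User-Agent header and the span.title selector are assumptions, so check them against the real page:

```python
import requests
from bs4 import BeautifulSoup


def get_html(url, params=None):
    """Perform a GET request and return the response body as text."""
    headers = {"User-Agent": "Mozilla/5.0"}  # look like a regular browser
    response = requests.get(url, params=params, headers=headers, timeout=10)
    response.raise_for_status()
    return response.text


def parse_titles(html):
    """Extract game titles from a fragment of search-results HTML."""
    soup = BeautifulSoup(html, "lxml")  # lxml was installed above
    return [tag.get_text(strip=True) for tag in soup.select("span.title")]
```

parse_titles() works on any HTML string, so it can be tried on a saved fragment before doing live requests.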

Follow me @:

======================================
📎️ The SOURCE CODE is available via Patreon:
======================================

Timecodes:

00:00 - Beginning
01:09 - Preliminary research (what to scrape)
03:15 - Creating a function that performs GET requests to the Steam Store
06:01 - Server response research: what URL should be passed to the get_html() function
09:24 - The scraping plan
09:43 - Getting all Steam Store games with Python Requests and BeautifulSoup. Scraping pagination.
12:40 - The algorithm for scraping all pages using the pagination GET requests
16:35 - Scraping data of a certain page with games
25:30 - Scraping hover data for all games on each page, including the data from the hover window
38:40 - Writing scraped data to a CSV file
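The steps in the timecodes above can be sketched end to end. This is a hedged outline, not the video's source code: the search/results/ endpoint with start/count/infinite parameters is how Steam's infinite scroll is commonly driven, but verify it in the browser's Network tab; the CSS class names are likewise assumptions:

```python
import csv

import requests
from bs4 import BeautifulSoup

SEARCH_URL = "https://store.steampowered.com/search/results/"


def page_params(start, count=50):
    """Query parameters for one pagination GET request (assumed names)."""
    return {"start": start, "count": count, "infinite": 1}


def parse_games(results_html):
    """Pull (title, price) pairs out of an HTML fragment of results."""
    soup = BeautifulSoup(results_html, "lxml")
    games = []
    for row in soup.select("a.search_result_row"):
        title = row.select_one("span.title")
        price = row.select_one(".search_price, .discount_final_price")
        games.append((
            title.get_text(strip=True) if title else "",
            price.get_text(strip=True) if price else "",
        ))
    return games


def write_csv(rows, path="steam_games.csv"):
    """Export the scraped rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "price"])
        writer.writerows(rows)


def scrape(pages=3, count=50):
    """Fetch several pagination chunks and return all parsed games."""
    games = []
    for page in range(pages):
        response = requests.get(
            SEARCH_URL,
            params=page_params(page * count, count),
            headers={"User-Agent": "Mozilla/5.0"},
            timeout=10,
        )
        data = response.json()  # the endpoint answers with JSON
        games.extend(parse_games(data.get("results_html", "")))
    return games
```

Calling write_csv(scrape()) would then produce steam_games.csv; the parsing and CSV parts run offline, so they can be tested on saved HTML first.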

✴️✴️✴️ Also useful ✴️✴️✴️

✴️✴️✴️ Web Scraping course ✴️✴️✴️
is available via Patreon here:

or its landing:

✴️✴️✴️ PLAYLISTS ✴️✴️✴️

🔹Django 3 Tutorial: Blog Engine

🔹Kivy Tutorial: Coppa Project

🔹Telegram Bot with Python (CoinMarketCap)

🔹Python Web Scraping

➥➥➥ SUBSCRIBE FOR MORE VIDEOS ➥➥➥
Red Eyed Coder Club is the best place to learn Python programming and Django:

#python #pythonwebscraping #beautifulsoup #bs4 #redeyedcoderclub #webscrapingpython #beautifulsouptutorial
Comments

What video should I make next? Any suggestions? *Write me in comments!*
Follow me @:

Help the channel grow! Please Like the video, Comment, SHARE & Subscribe!

RedEyedCoderClub

Another FANTASTIC topic, amazing! I absolutely love the niche topics you select, thank you so much for sharing your good knowledge my friend.

EnglishRain

That was exactly what I was looking for, thanks man.

ticTHEhero

Finally, I have found you!
Thanks for the videos.

ntufgli

Best video ever... I will follow your channel from now on.

JoJoSoGood

That is exactly what I'm searching for! Thank you, man!

bingchenliu

Awesome. I always had problems with infinite scroll and used Selenium. Now I know how to do it with bs4, thanks to you. Cheers :)

youngjordan

Such a great tutorial! Thank you for that!

igorbetkier

Very useful lesson, thanks for your work!

rustamakhmullaev

Tried to use this method with Reddit comment search and it doesn't work - the requests it sends are POST requests, so there is no conveniently available URL on them that you can use.
The requests themselves are JSON objects.

Shajirr_
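For endpoints like the one described in the comment above, which expect a POST request carrying a JSON body, the Requests library can still be used without Selenium. A minimal sketch; the URL and payload here are placeholders for illustration, not a real Reddit endpoint:

```python
import requests


def post_json(url, payload):
    """Send a JSON body via POST and return the decoded JSON response."""
    response = requests.post(
        url,
        json=payload,  # requests serializes this and sets Content-Type
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()


# Example call (placeholder URL):
# data = post_json("https://example.com/api/search", {"query": "python"})
```

The payload to send can be copied from the request body shown in the browser's Network tab, just as the URL is copied for GET requests.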

I need to scrape data from Walmart, which is all in JavaScript. I'm going to watch and try this tomorrow; hopefully it works!

noelcovarrubias

Thank you a lot, this was really helpful to me. Thanks again!

amrhamza

Excellent - best video on XHR (GETs) that I have seen... great work.
Could you do a video on XHR (POSTs), please?

joeking

Good job.
Thanks for the video.
I clicked like.

qzdkujk

This search returned 779 results when the video was released. Now, it returns 4927 results.
Just to put into perspective how much garbage is being shovelled onto the platform.

Shajirr_

Hi, thanks for this, but I'm encountering a website that uses the "POST" method instead of "GET" in the request, so I'm not able to replicate what you're doing by scraping the IDs first and copying them into URLs. The page just loads constantly and eventually says the page was not found. Is there a way to bypass this?

JackWQ

Great video, really well explained. Please can you make a video showing login/sign-in to a website with Requests sessions and OAuth?

ThEwAvEsHaPa

Thank you very much! :) Are you planning a series of lessons on Scrapy? And a second question: could you make a lesson on building a self-populating aggregator (of news/products etc.) with Django, so the site scrapes and fills itself? I'm trying to implement this with Django and Scrapy, but the problem is launching the scraper from Django without blocking the process. In the end I bolted on Celery, but that has its own difficulties (it throws a reactor error). Or should I not write in Russian on this channel?

MrYoklmn

Very, very good video on this topic. The way you explain things helps in understanding the whole process behind getting the data! I am trying to access the data on various sites, but sometimes I get an error message that I "do not have the auth token" or "access denied!"... How can I bypass those?

duckthishandle

I have a challenge for you: 😜 Can you log in to WhatsApp Web using the Requests library, without manually scanning the QR code and without using Selenium? I achieved it using a saved profile in Selenium, but I'm curious whether you can do it with the Requests library. Thanks!

EnglishRain