Web Scraping 5: Reading All Blogs on One Page with Scrapy & Python (Scrapy Series)

In this video, we will learn how to get all the blog posts from the archive page of a WordPress blog.

This is the fifth tutorial in a series that teaches web scraping with Scrapy, the most powerful scraping library.

This is part of the Scrapy Crash Course, which is free to take.

Once you know the basics, a second free course covers how to download all files from any site with Scrapy.

What is Web Scraping?
In a nutshell: Web Scraping = Getting Data from Websites with Code

What is Scrapy?
Scrapy is a Python framework that makes web scraping powerful, fast, and efficient.
There are other web scraping libraries too, such as BeautifulSoup. However, when it comes to raw power and flexibility, Scrapy stands out.

Why Learn Scrapy?
- Most powerful library for scraping
- Easy to master
- Cross-platform: doesn't matter which OS you are using
- Cloud-ready: Can be run on the cloud with a free account

Most important: you can start earning right away by taking on web scraping gigs as a freelancer.

Comments

5:06 Update the code to use the correct selector
summary = ::text').getall())
Thanks to @Byte Riddler for pointing it out.

codeRECODE

Nice exercise. I did the paragraph using BeautifulSoup.

kenrosenberg

Thank you very much. Nice course and explanation.

mahrouch

Hello,
Thank you for sharing your knowledge. I have a couple of questions, though.

I noticed that you quickly scrolled past the "Introduce Yourself" summary being null.
I have been using XPath instead of the CSS selector, and I managed to get a result for that summary, but it is only the first word. I assume this is because of the "em" tags inside the "p" tag. I have tried selecting multiple paths with XPath, with no luck (maybe I am not using it correctly?).

Question 1: How would we go about getting the text in the p tag when it has multiple tags inside it?
(Note: I have omitted the leading bracket of the HTML tags.)
E.g.: p> some text i> IS /i> em> important /em> and some text sub>is /sub> not /p>

So far the only possible solution I can see is multiple selections and joining of strings. Surely there is an easier/better way?

The resulting output in the JSON file contains many Unicode entries (non-breaking spaces, in this case).
The quotes JSON also contains multiple Unicode entries for left and right double quotes.

Question 2: How would we go about removing these, either before or after writing the JSON file?

Once again, thank you for sharing your knowledge and experience.

ByteRiddler
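
One possible way to deal with the non-breaking spaces and curly quotes raised in the second question is to clean each string before it is exported. The sketch below uses only the Python standard library; `clean_text` and the sample `item` are hypothetical names and data chosen for illustration, not code from the video.

```python
import json
import unicodedata


def clean_text(value: str) -> str:
    """Replace non-breaking spaces and curly quotes, then collapse whitespace."""
    # NFKC normalization turns U+00A0 (non-breaking space) into a plain space.
    # It leaves curly quotes untouched, so those are mapped explicitly.
    value = unicodedata.normalize("NFKC", value)
    value = value.translate(
        {0x201C: '"', 0x201D: '"', 0x2018: "'", 0x2019: "'"}
    )
    return " ".join(value.split())


# Made-up item containing a non-breaking space and curly double quotes.
item = {"summary": "\u201cHello\u00a0world\u201d  and more"}
cleaned = {key: clean_text(val) for key, val in item.items()}

# ensure_ascii=False writes real characters instead of \uXXXX escapes.
as_json = json.dumps(cleaned, ensure_ascii=False)
print(as_json)
```

Note that NFKC also rewrites other compatibility characters (ligatures, superscripts, and so on), which may or may not be wanted. If the goal is only to stop `\uXXXX` escapes from appearing in the exported JSON file while keeping the original characters, setting `FEED_EXPORT_ENCODING = "utf-8"` in the Scrapy project settings is another option.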

Sir, how were you able to format multiple boxes of sentences?
I tried Ctrl+Shift+Alt+PgDn,
but it selected only the length of the first box (the title).

SaurabhKumar-ygfe

Hello sir, I tried following your steps, but when SelectorGadget runs for the title, the class name I get is .entry-title, and it does not yield any output.
When I changed it to .entry-title a, I got the output.
I don't understand why it shows only .entry-title, without the 'a' at the end.
Can you please check and confirm? Thanks a lot :)

virajpatel
