Amazon Review Scraper Full Project in Python

In this episode of Weekly Web Scraping we look at scraping Amazon product reviews. Using Requests-HTML we can extract the top review data for each product and save it to CSV. We use CSS selectors and list comprehensions, write our own functions, and work through how to get the data from the website into a CSV file.
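
The last step described above, saving the scraped reviews to a CSV file, could look roughly like the sketch below. It uses only the standard `csv` module. The function name `save_reviews` and the keys `title`, `rating` and `body` are assumptions made for illustration; they are not taken from the video.

```python
import csv


def save_reviews(reviews, path):
    """Write a list of review dicts to a CSV file, one row per review.

    Column names are taken from the keys of the first review, so every
    dict is assumed to share the same keys.
    """
    if not reviews:
        return 0
    fieldnames = list(reviews[0].keys())
    # newline="" stops the csv module from writing blank rows on Windows
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(reviews)
    return len(reviews)


if __name__ == "__main__":
    # Hypothetical sample data standing in for scraped reviews
    sample = [
        {"title": "Works well", "rating": "5.0 out of 5 stars", "body": "No complaints."},
        {"title": "Stopped working", "rating": "2.0 out of 5 stars", "body": "Broke after a week."},
    ]
    save_reviews(sample, "reviews.csv")
```

Building one dict per review and collecting them in a list keeps the scraping code separate from the code that writes the file.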

# Timestamps

00:00 - Intro
01:15 - Data to scrape
02:10 - Getting the data
04:25 - CSS Selectors
07:07 - List Comprehension
08:14 - Get ASINS function
09:22 - Product Page
19:37 - Complete Data
21:31 - Main() function
23:28 - Saving to CSV
24:15 - Overview
25:02 - Troubleshooting
26:00 - Running Demo
27:15 - Another error
28:00 - Our CSV file & Outro

# Comments

I was driving myself nuts when I kept returning a NoneType, until I realized I needed the r.html.render(sleep=1) in my functions.

Hope this helps anyone that experiences the same issue.

SunDevilThor
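
A small guard can make the NoneType problem from the comment above easier to spot. This is a minimal sketch, assuming `parent` is a Requests-HTML element or page that has already been rendered with `r.html.render(sleep=1)` as the commenter describes. The helper name `text_or_none` is made up for illustration.

```python
def text_or_none(parent, selector):
    """Return the text of the first match for selector, or None.

    In Requests-HTML, find(selector, first=True) returns None when nothing
    matches, so calling .full_text directly on the result raises an
    AttributeError on a NoneType. Checking first avoids that.
    """
    element = parent.find(selector, first=True)
    if element is None:
        return None
    return element.full_text
```

If this returns None for a selector that works in the browser, the content may be added by JavaScript and the page may need rendering before the search.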

Scrape God! Always dropping knowledge! 💯

profoundgenius

Great and informative videos. We would appreciate more complex website scraping, such as social media websites.

abukaium

Perfect, a nice and neat tool for searching and getting quick insights. I think that, to avoid being blocked, a random delay between functions could trick the website's backend server.

CodePhiles
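
The random delay suggested in the comment above might be sketched like this. The function name `polite_pause` and the 1 to 3 second range are assumptions, and a delay on its own is no guarantee against being blocked.

```python
import random
import time


def polite_pause(min_seconds=1.0, max_seconds=3.0, sleep=time.sleep):
    """Sleep for a random time between min_seconds and max_seconds.

    The sleep function is a parameter so the pause can be swapped out
    when testing. Returns the delay that was used.
    """
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

Calling `polite_pause()` between product page requests spaces them out unevenly instead of at a fixed interval.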

great video as always! you make it look so easy!

Msr

Hey John, great video. I have a question: why did you choose to use only the requests library for this project instead of the bs4 library? Also, when is it better to use a combination of both libraries? Thanks for your videos, man, you are truly making a difference!

fabianrestrepo

Really helpful video, thank you so much.

aiyshakhan

thank you so much, learning a lot
keep up the good work :)

hardik

Nice series and video timestamps are very helpful. 💖

tubelessHuma

I found my new Python Guru!!! Thank you!

solomonbirru

I really love these videos John. I was wondering, do you often use a scraper that runs periodically or in a cloud? Would love to see a video about that! ;)

Kyosika

Thank you, John, for the knowledge. I want to ask which crawling technique works best for every e-commerce website. I want to crawl the URL (img), title, and price of the product. I have tried a lot of approaches, but there are always problems on some websites. Let me know the best way to do it.

RapLitVN

Thank you for the video, it was really helpful. And everything is explained very cleanly!!
I also want to scrape the images from Amazon reviews that customers post along with their reviews. How should I go about doing that?

sameekshagupta

I still get blocked if I try to scrape Amazon. I even tried the user agent headers tip from your previous video. This video is too recent for something to have changed in a few days, so I wonder what's different on mine that causes this.

mattmovesmountains

Liked before watching, like every day.
Sir, can you add a video about scraping AliExpress reviews? I tried it but I can't scrape any reviews.

geektom

Are you able to scrape amazon continuously without getting blocked? Would love a tutorial on making many requests and scraping lots of products.

mehmetcanturgut

It would be great if you multithreaded that function so that it can read faster.

fraddygil
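
The multithreading idea from the comment above could be sketched with the standard library's `ThreadPoolExecutor`. The name `scrape_all` is hypothetical, and `scrape_one` stands in for whatever function fetches the reviews for a single ASIN. It is passed in as a parameter because the video's own function is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_all(asins, scrape_one, max_workers=4):
    """Run scrape_one(asin) for every ASIN using a small thread pool.

    pool.map returns results in the same order as the input list, even
    though the calls themselves may finish out of order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape_one, asins))
```

A small `max_workers` value is a deliberate choice in this sketch: more threads mean more requests per second, which makes blocking more likely.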

I can't get

print(rev.find('div [data-hook=review-title] span', first=True).full_text)
print(rev.find('i[data-hook="review-date"]', first=True).full_text)

to work. They have changed something. Any idea?

SpadenOnpc

how would you scale this up for a few thousand products?

kusunagi

I keep receiving the 503 error. Some say Amazon has blocked web scraping. Also, I had to add my user agent as headers.

maybenew
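
One way to handle the 503 responses mentioned above is to retry a few times with a growing pause between attempts. This is a minimal sketch rather than a fix that is known to work against Amazon. The name `fetch_with_retry` and its defaults are assumptions, and `fetch` is any callable that accepts a URL plus a `headers` keyword and returns an object with a `status_code` attribute.

```python
import time


def fetch_with_retry(fetch, url, headers, retries=3, backoff=2.0, sleep=time.sleep):
    """Call fetch(url, headers=headers), retrying while the status is 503.

    Waits backoff, then backoff * 2, then backoff * 4 seconds (and so on)
    between attempts. Returns the last response, which may still be a 503
    if every attempt failed. retries is assumed to be at least 1.
    """
    for attempt in range(retries):
        response = fetch(url, headers=headers)
        if response.status_code != 503:
            return response
        if attempt < retries - 1:
            sleep(backoff * (2 ** attempt))
    return response
```

With Requests-HTML this might be called as `fetch_with_retry(session.get, url, {"User-Agent": "..."})`, where `session` is an `HTMLSession` and the user agent string is your own browser's.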