Craigslist Scraper with Python and Selenium: Part 2

In this video series, we write a Python script that uses web scraping modules such as Selenium, Beautiful Soup, and urllib to extract information from Craigslist. Specifically, the script forms a search query, i.e., a set of criteria such as the items to look for, a location, and a zip code. It then performs the search automatically and extracts two key pieces of information from the results: the title of each posting and the link to each post.
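
As a rough illustration of that flow, here is a minimal sketch. The search URL format and the "result-title" class are based on Craigslist's historical markup and may have changed; the location and query are placeholders.

```python
from selenium import webdriver
from bs4 import BeautifulSoup

location = "sfbay"     # example Craigslist region subdomain
query = "bicycle"      # example search term
url = f"https://{location}.craigslist.org/search/sss?query={query}"

driver = webdriver.Firefox()
driver.get(url)

# Pull the rendered HTML out of the browser and parse it.
soup = BeautifulSoup(driver.page_source, "html.parser")
for anchor in soup.find_all("a", class_="result-title"):
    print(anchor.text, anchor.get("href"))  # title and link of each post

driver.quit()
```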

This project is intentionally simple, with the hope that it serves as a springboard for you to build upon. For instance, perhaps you want to keep tabs on when a certain item is listed in your area; you could modify the script to email you automatically whenever an item of interest pops up (a sketch of that idea follows). The possibilities are vast, and I hope you use this to build something useful and cool. If you do, please share it!
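
As a hypothetical example of that extension, here is a sketch of an email notifier. The SMTP host, credentials, and addresses are placeholders you would fill in.

```python
import smtplib
from email.message import EmailMessage

def notify(titles):
    # Build a plain-text email listing the new post titles.
    msg = EmailMessage()
    msg["Subject"] = "New Craigslist listings"
    msg["From"] = "me@example.com"   # placeholder sender
    msg["To"] = "me@example.com"     # placeholder recipient
    msg.set_content("\n".join(titles))
    # Placeholder SMTP host and credentials.
    with smtplib.SMTP_SSL("smtp.example.com") as server:
        server.login("me@example.com", "app-password")
        server.send_message(msg)
```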

Related Links:

This video is part of a larger series on "Web Scraping and Automation". You can watch the other videos in this series here:

Further videos on Selenium:

Do you like the development environment I'm using in this video? It's a customized version of vim that's enhanced for Python development. If you want to see how I set up my vim, I have a series on this here:

If you've found this video helpful and want to stay up-to-date with the latest videos posted on this channel, please subscribe:

Comments

Loved it man... Very clean instructions.

rayhansardar

Good shit. Subscribed. You have a great style of teaching. I haven't seen the rest of the series yet, but I'm sure you know that it's not necessary to launch the browser when using urllib.request. I'm guessing you used those two different functions to showcase two different technologies. Launching the browser slows everything way down. If that's addressed later on, disregard :)

DocPosture
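
For reference, the browser-free fetch this commenter is describing might look like this (the URL is a placeholder):

```python
import urllib.request

url = "https://sfbay.craigslist.org/search/sss?query=bicycle"  # placeholder
with urllib.request.urlopen(url) as response:
    html = response.read().decode("utf-8")  # raw HTML, no browser needed
```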

Will this work in IDLE? Part 1 worked okay but now I'm getting a bunch of errors (when I run at 7:58). Thanks

Zooooman

You can improve the BeautifulSoup technique by passing it the HTML content of the driver.

JadaKingdom
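
A minimal sketch of that suggestion, assuming Firefox and a placeholder URL: Selenium already holds the rendered HTML, so it can be handed to BeautifulSoup directly rather than downloaded a second time with urllib.

```python
from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Firefox()
driver.get("https://sfbay.craigslist.org/search/sss?query=bicycle")  # placeholder

# Hand the HTML Selenium already rendered to BeautifulSoup,
# instead of downloading the same page again with urllib.
soup = BeautifulSoup(driver.page_source, "html.parser")
```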

I was able to springboard this to scrape all of the relevant data across multiple pages of a Craigslist search result. extract_post_titles gets all of the data from one page, then the next, and so on, but extract_post_urls is stuck getting the links of listings on the first page only. I put extract_post_titles inside load_craigslist_url so it collects the data each time we go to the next page, but extract_post_urls stays on the first page, since self.url is static and doesn't change when we move to a new page. Any recommendations for modifying extract_post_urls to get the links on every page, or for having self.url update each time we go to the next page?

uqyuiyryiq
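
One possible fix, sketched against the method names in the comment: parse driver.page_source, which follows the browser from page to page, instead of re-fetching a static self.url with urllib. The "result-title" class and the self.driver attribute are assumptions about the class used in the video.

```python
from bs4 import BeautifulSoup

def extract_post_urls(self):
    # self.driver has already navigated to the current results page,
    # so its page_source reflects whatever page we are on now.
    soup = BeautifulSoup(self.driver.page_source, "html.parser")
    return [a["href"] for a in soup.find_all("a", class_="result-title")]
```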

Can you explain why you chose the 'searchform' ID as the wait.until parameter? What's the benefit of setting that?

liuxu
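
For context, a sketch of what that wait is doing, assuming the video's `delay` variable, Firefox, and a placeholder URL: it blocks until the element with id="searchform" exists in the DOM, which guarantees the page has loaded enough to interact with before the script proceeds.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("https://sfbay.craigslist.org/search/sss?query=bicycle")  # placeholder

delay = 3
# Block until the element with id="searchform" exists in the DOM;
# raises TimeoutException if it hasn't appeared after `delay` seconds.
WebDriverWait(driver, delay).until(
    EC.presence_of_element_located((By.ID, "searchform"))
)
```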

Can’t you do the same thing with Selenium without having to load the page twice and parse it with bs?

snoopyjc
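
For comparison, the Selenium-only route this commenter is asking about might look like this; Firefox, the placeholder URL, and Craigslist's historical "result-title" class are assumptions.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://sfbay.craigslist.org/search/sss?query=bicycle")  # placeholder

# Titles and links straight from Selenium: no second fetch, no bs4.
for anchor in driver.find_elements(By.CLASS_NAME, "result-title"):
    print(anchor.text, anchor.get_attribute("href"))
```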

It appears that the `wait`/`delay` is unnecessary. Setting `delay = 0` does not throw any exceptions.
As in, regardless of the code, the page takes its time to load completely. Why is this?

simonj

It's giving me an error when I run extract_post_urls. It raises an HTTPError: Bad Request.

Any help?

iNotSoTall
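
Not from the video, but one common cause of an HTTPError: Bad Request with urllib is the site rejecting Python's default User-Agent. Here is a sketch of sending a browser-like header instead; whether this matches the commenter's actual issue is a guess, and the URL is a placeholder.

```python
import urllib.request

req = urllib.request.Request(
    "https://sfbay.craigslist.org/search/sss?query=bicycle",  # placeholder
    headers={"User-Agent": "Mozilla/5.0"},  # browser-like User-Agent
)
with urllib.request.urlopen(req) as response:
    html = response.read().decode("utf-8")
```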