Python Scrapy SQL BS4

Web scraping recruitment sites whose page format is unknown, so we cannot write selectors to pick out the parts of the page that we need.

We scrape one site, but its listings redirect to 3rd party recruiters, so when we follow the links we don't know the structure of the page we land on.

Instead we use bs4 (BeautifulSoup) to pull out the text, store it in a database, and use it later for some NLP.

We use:
▣ Python 3.9
▣ Scrapy 2.5
▣ PostgreSQL 11.11 (Raspbian 11.11-0+deb10u1) on a Raspberry Pi 4
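
To give a feel for the storage step, here is a minimal sketch that keeps one row per advert URL. It uses Python's built-in `sqlite3` as a stand-in so it runs without a database server; the video uses PostgreSQL, where a driver such as psycopg2 would take `%s` placeholders and the upsert would be written with `ON CONFLICT`. The table name, column names and the `save_advert` helper are assumptions for illustration.

```python
import sqlite3


def save_advert(conn, url, text):
    """Insert one advert, or overwrite it if the URL was already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS adverts ("
        "url TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    # sqlite3 stand-in for a PostgreSQL upsert (assumption, see lead-in).
    conn.execute(
        "INSERT OR REPLACE INTO adverts (url, body) VALUES (?, ?)",
        (url, text),
    )
    conn.commit()


# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
save_advert(conn, "https://example.com/job/1", "Python Dev Remote")
save_advert(conn, "https://example.com/job/1", "Python Dev Remote (updated)")
rows = conn.execute("SELECT url, body FROM adverts").fetchall()
```

Keying on the URL means re-running the spider updates an advert instead of duplicating it, which matters when the rows later become training data.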

= Chapter Timings =
0:00 Introduction
2:25 Frozen GUI / runlevel 3 rescue
7:40 Code
11:11 parse_detail
18:19 View output

I want to automate the reading of job listings, as many are irrelevant. A model trained on some labelled examples could rate them and recommend only the relevant ones.

▣ We want 100+ job adverts to use for training data.
▣ The plan is to use a 'bag of words' built with CountVectorizer
▣ Logistic Regression (sklearn) as the classifier
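
The plan above could look roughly like this once the adverts are collected. The four adverts and their labels are invented purely for illustration (1 = relevant, 0 = not); real training would use the 100+ scraped adverts, labelled by hand.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up adverts and labels, for illustration only.
adverts = [
    "python developer scrapy web scraping postgresql",
    "data engineer python sql remote",
    "forklift driver warehouse night shift",
    "sales assistant retail weekend shift",
]
labels = [1, 1, 0, 0]

# CountVectorizer builds the bag of words; LogisticRegression rates it.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(adverts, labels)

# Probability of the "relevant" class for two unseen adverts.
new_adverts = ["python scrapy developer", "forklift driver night shift"]
scores = model.predict_proba(new_adverts)[:, 1]
```

Using `predict_proba` rather than `predict` gives a score per advert, so the listings can be ranked and only the top ones recommended.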

In the video I also show how to switch to runlevel 3 in Linux to save some code, just in case the frozen GUI loses the changes, and explain how Windows 10 has told me to uninstall my VM Workstation after several years.

If M$ doesn't want me to use my VM, I will run Linux on hardware and delete Windows 10.

My video editing may suffer in the short term, but apart from that Scrapy and web scraping are much more robust from command-line Linux.

Visit redandgreen blog for more Tutorials
=========================================

Subscribe to the YouTube Channel
=================================

Follow on Twitter - to get notified of new videos
=================================================

Buy Dr Pi a coffee (or Tea)

Proxies
=================================================
If you need a good, easy-to-use proxy, ScraperAPI was recommended to me, and having used it for a while I can vouch for them. If you were going to sign up anyway, then maybe you would be kind enough to use the link and the coupon code below?

You can also run a full working trial first (unlike some other companies). The trial doesn't ask for any payment details either, so all good! 👍

◼️ Coupon Code: DRPI10
(You can also get started with 1000 free API calls. No credit card required.)

Thumbs up yeah? (cos Algos..)

#webscraping #tutorials #python
Comments

Man, congrats on 1000+ subs! You did it! Waiting to see ads on your vids so you can finally get something back from all the work you've put into it.

P.S. Your face on the thumbnail is super cool - like a James Bond of hacking!

monkey_see_monkey_do

Hello Dr Pi,
I have a question, please check your email.

lakchchayamdivyakhare