How to Automate Web Scraping Task with BeautifulSoup — Python Task Automation with CronJobs (part 1)

Hello everyone! My name is Andrew Fung. In this video, I show you how to automate everyday web scraping tasks with Beautiful Soup and schedule them to run regularly with cron jobs on a Mac. In this first episode, I write a new Python web scraper that dynamically fetches my waitlist position from a web page that is updated at the same time every week.

Problem statement: I am currently on the waitlist for my university's hall application. My waitlist position is published in a PDF linked from the university's web page, and that PDF is updated at the same time every week. I will therefore run a cron job one hour after the weekly update to fetch my new waitlist position and email it to me.
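
The email step of this plan might look roughly like the sketch below, which uses only Python's standard library. It is not the code from the video: the function names, addresses, SMTP host, and script path are hypothetical placeholders, and the crontab line in the comment assumes the weekly update lands on Mondays at 09:00, which the description does not state.

```python
import smtplib
from email.message import EmailMessage


def build_waitlist_email(position, sender, recipient):
    """Compose the notification; all argument values are placeholders."""
    msg = EmailMessage()
    msg["Subject"] = f"Waitlist update: position {position}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"Your current waitlist position is {position}.")
    return msg


def send_email(msg, host, port, username, password):
    """Send over SMTP with STARTTLS; host and credentials are assumptions."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)


# Hypothetical crontab entry (edit with `crontab -e`): run every Monday at
# 10:00, i.e. one hour after an assumed 09:00 weekly update.
# 0 10 * * 1 /usr/bin/python3 /path/to/waitlist.py
```

Keeping message construction separate from sending makes it possible to check the message contents without contacting a mail server.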

#webscraping #beautifulsoup #python #cronjob #taskautomation

How I make my YouTube videos:

Installation and Setup!

Source code for this project:

Check out my GitHub!

Timestamps
-------------------------------
0:00 | Intro
3:15 | Import packages
5:05 | Search for PDF Link with CSS Selectors
15:25 | Load PDF with Tabula-py
17:43 | Find waitlist position by ID
23:42 | Turn it into a function
25:32 | Outro
-------------------------------
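
The link-search and lookup steps listed in the timestamps can be sketched roughly as below. This is not the code from the video: the HTML snippet, base URL, column names, and applicant IDs are made up for illustration. In the real workflow the rows would come from the PDF via tabula-py (for example `tabula.read_pdf`), which is left out here because it needs a Java runtime.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Made-up page content standing in for the university web page.
SAMPLE_HTML = """
<div class="content">
  <a href="/about.html">About</a>
  <a href="/files/waitlist.pdf">Waitlist (PDF)</a>
</div>
"""


def find_pdf_link(html, base_url):
    """Return the absolute URL of the first link ending in .pdf, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # CSS attribute selector: match anchors whose href ends with ".pdf".
    link = soup.select_one('a[href$=".pdf"]')
    if link is None:
        return None
    return urljoin(base_url, link["href"])


def find_position(rows, applicant_id):
    """Look up a waitlist position by ID in a list of row dictionaries."""
    for row in rows:
        if str(row["id"]) == str(applicant_id):
            return row["position"]
    return None


pdf_url = find_pdf_link(SAMPLE_HTML, "https://example.edu/housing/")
rows = [{"id": "A123", "position": 7}, {"id": "B456", "position": 8}]
print(pdf_url, find_position(rows, "B456"))
```

Returning `None` when no link or ID is found lets a scheduled run detect a changed page layout instead of crashing silently.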

Feel free to drop a like and a comment if you enjoyed the video, and let me know if you want me to do other types of programming videos ;) !!!
Comments

Nice tutorial! Curious if HasData's general API could simplify some of these scraping tasks?

maueucifeely

Thanks for sharing! I wonder if HasData has any built-in functions for managing cron jobs as well?

DevinWeis-vt

Great video!
But where is part 2 with the cron job?

oceansblue