Scrapy Crawler with MySQL & Python | Web Scraping (part 2)

#webscraping #scrapy #sql #mariadb
This demo / tutorial shows how to use Python code to crawl an entire site with Scrapy, yet save only the 'interesting' links to a MariaDB SQL database.
⦿ Web scraping without using any CSS or XPath selectors! Nice!
I search the book titles for any that contain the 'keywords' I specify in a Python list (line 43 in the code).
You will see how I use:
⦿ list comprehension,
⦿ Python's built-in any() function,
⦿ SQL commands such as DESC table, TRUNCATE, and INSERT
This is simpler than using Scrapy pipelines, and the code in this video can be applied to other projects outside of Scrapy.
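The keyword filter described above can be sketched in plain Python. This is only an illustration of combining any() with a list comprehension: the names KEYWORDS, is_interesting and links, the keywords themselves, and the sample titles and URLs are all made up here and are not taken from the video's code.

```python
# Hypothetical keyword list (the video keeps its own list on line 43 of its code).
KEYWORDS = ["python", "data", "music"]


def is_interesting(title, keywords=KEYWORDS):
    """Return True if the title contains at least one keyword (case-insensitive)."""
    lowered = title.lower()
    # any() stops at the first keyword that is found in the title.
    return any(keyword in lowered for keyword in keywords)


# Made-up (title, url) pairs standing in for links found during a crawl.
links = [
    ("Learning Python the Hard Way", "https://example.com/book/1"),
    ("A Light in the Attic", "https://example.com/book/2"),
    ("Data Science from Scratch", "https://example.com/book/3"),
]

# List comprehension: keep only the links whose title matches a keyword.
interesting = [(title, url) for title, url in links if is_interesting(title)]

for title, url in interesting:
    print(title, url)
```

Because the check is an ordinary function, the same filter could be reused outside of Scrapy, for example on titles read from a file.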
Timings:
0:00 Introduction
5:00 parsing the item
8:00 using the any() function
13:06 MariaDB (MySQL)
16:23 Scrapy populates the MySQL database
*Geany didn't display underscores at certain zoom levels, but if you look at the code on GitHub you'll see that they are present. I suspect this is because I'm using a VM, which can't access the proper hardware/graphics driver. One day when I have more money I'll get a new PC and run Linux natively!
# Scrapy Crawler Code on GitHub:
# Install the "Connector"
pip install mysql-connector-python
# Install the Database:
sudo apt update
sudo apt install mariadb-server
sudo mysql_secure_installation
# Install phpMyAdmin:
sudo apt-get install phpmyadmin
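
Once the connector and the database are installed, saving rows from Python might look like the sketch below. It is a minimal example under stated assumptions, not the code from the video: the table name links, its columns title and url, the database name books, and the credentials are all hypothetical placeholders. The only library calls used are mysql.connector.connect(), cursor(), execute(), executemany(), commit() and close().

```python
def connect():
    """Open a connection using mysql-connector-python (placeholder credentials)."""
    # Imported here so the rest of the module loads even without the connector.
    import mysql.connector

    return mysql.connector.connect(
        host="localhost",
        user="scrapy_user",    # hypothetical user name
        password="change-me",  # hypothetical password
        database="books",      # hypothetical database name
    )


def save_links(conn, rows, truncate_first=False):
    """Insert (title, url) tuples into the hypothetical `links` table.

    Returns the number of rows passed to the INSERT.
    """
    rows = list(rows)
    cursor = conn.cursor()
    try:
        if truncate_first:
            # Empty the table before a fresh crawl.
            cursor.execute("TRUNCATE TABLE links")
        # %s placeholders let the connector escape the values safely.
        cursor.executemany(
            "INSERT INTO links (title, url) VALUES (%s, %s)",
            rows,
        )
        conn.commit()
    finally:
        cursor.close()
    return len(rows)
```

A possible call would be conn = connect(), then save_links(conn, [("Some Title", "https://example.com/book/1")], truncate_first=True), then conn.close(). Running DESC links; in the MariaDB client shows the table's columns, which is a quick way to confirm that the INSERT matches the schema.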