Web Scraping with AIOHTTP and Python
AIOHTTP is a client- and server-side library for Python 3.6 and above that lets us make HTTP requests asynchronously. It's fully featured, supporting sessions, cookies, custom headers, and everything else you'd expect to see, so naturally I thought it would be a useful tool to share for building more advanced web scrapers.
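As a rough illustration of the session features mentioned above, here is a minimal sketch of a session that carries custom headers and cookies. The header and cookie values and the `fetch_with_session` name are made up for this example and are not taken from the video.

```python
import asyncio

import aiohttp

# Hypothetical values, for illustration only
DEFAULT_HEADERS = {"User-Agent": "my-scraper/0.1"}
DEFAULT_COOKIES = {"currency": "GBP"}


async def fetch_with_session(url: str) -> str:
    """Fetch one page through a session configured with headers and cookies."""
    # Default headers are sent with every request made through the session;
    # the cookies are placed in the session's cookie jar.
    async with aiohttp.ClientSession(
        headers=DEFAULT_HEADERS, cookies=DEFAULT_COOKIES
    ) as session:
        async with session.get(url) as response:
            return await response.text()
```

Something like `asyncio.run(fetch_with_session(url))` would run it for a single page; in a real scraper you would normally keep one session open and reuse it for every request.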
When we scrape data from the web, chances are we will need to make multiple requests to the server to extract the information we are after. Each of these requests takes time, so our code effectively sits waiting for one response before making the next request, which slows the process right down. In its simplest form, AIOHTTP lets us use Python's asyncio library to send a large number of requests in a short amount of time, letting us create faster and more efficient web scrapers.
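To make the idea concrete, here is a minimal sketch of fetching several pages concurrently instead of one after another. It is not the exact code from the video; the function names `fetch_html` and `fetch_all` are assumptions chosen for this example.

```python
import asyncio
from typing import List

import aiohttp


async def fetch_html(session: aiohttp.ClientSession, url: str) -> str:
    """Request one page and return its body as text."""
    async with session.get(url) as response:
        response.raise_for_status()  # fail loudly on 4xx/5xx responses
        return await response.text()


async def fetch_all(urls: List[str]) -> List[str]:
    """Fetch every URL concurrently, sharing a single session."""
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_html(session, url) for url in urls]
        # gather() runs the coroutines concurrently and returns the
        # results in the same order as the input URLs.
        return await asyncio.gather(*tasks)
```

Calling `asyncio.run(fetch_all(urls))` with a list of page URLs should return the HTML of each page, ready to be handed to whichever parser you prefer. While one request is waiting on the server, the event loop gets on with the others, which is where the speed-up comes from.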
#Timestamps
00:00 Intro
01:17 Docs
02:12 Demo Code
03:54 Web Scraper
09:38 HTML from each page
10:00 Parse HTML
12:10 Expanding Discussion
13:21 Outro