Async Python Tutorial: Web Scraping Synchronously versus Asynchronously (10x faster)

Are you new to asynchronous programming? Are you past the novice Python material and looking to increase your knowledge and experience of Python in a variety of contexts? You've come to the right place.

This tutorial builds on the foundations of async Python we learnt in the first tutorial, demonstrating how to write both a synchronous and an asynchronous web scraper.

This is the second in a three-part series. In the third instalment, we'll be creating our very own async web server hosting a personal blog, with all the functionality implemented in async (plus a bonus section on how to properly unit test async code)!

Please use Python 3.7.4+
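
To make the comparison concrete, here is a minimal sketch of the kind of side-by-side the video builds. The URL list, helper names, and output format are illustrative, not the exact code from the video:

import asyncio
import time
import aiohttp
import requests

URLS = ["https://example.com"] * 10  # placeholder targets

def scrape_sync():
    # One request at a time: total time is roughly the sum of all requests.
    start = time.perf_counter()
    for url in URLS:
        requests.get(url).text
    print(f"sync: {time.perf_counter() - start:.2f}s")

async def scrape_async():
    # All requests in flight at once: total time is roughly the slowest request.
    start = time.perf_counter()
    async with aiohttp.ClientSession() as session:
        async def fetch(url):
            async with session.get(url) as resp:
                return await resp.text()
        await asyncio.gather(*(fetch(url) for url in URLS))
    print(f"async: {time.perf_counter() - start:.2f}s")

if __name__ == "__main__":
    scrape_sync()
    asyncio.run(scrape_async())
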
Comments

Thanks, sir. I have been trying to understand this asynchronous way of programming for some days now but couldn't. I referred to many videos on YouTube, but none of them explained it clearly; your video really helped me understand. Your content is really good. Wishing you all success.

suryajithnair

I'm wondering if HasData has a Python API that can handle async web scraping. Has anyone tried it before?

AliRami

This video was super helpful and saved my project a lot of development time!! Thank you :)

hideodaikoku

That's an insane speed boost. Thanks a lot for these tutorials; they are very helpful. And completely out of the box, perf_counter() was an unexpected bonus that I'll be using from here on out. Keep up the good work.

Sra
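
For anyone who hasn't met it before: time.perf_counter() is the standard library's high-resolution monotonic clock, and the timing pattern the comment above refers to looks like this (the sleep is just a stand-in for real work):

import time

start = time.perf_counter()
time.sleep(0.5)  # stand-in for the work being timed
elapsed = time.perf_counter() - start
print(f"took {elapsed:.3f}s")  # roughly 0.500s
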

Writing a file is a blocking operation. You should use an async library for that.

slava
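
For readers wondering what the comment above would look like in practice: one common choice is the third-party aiofiles package, which runs file operations in a thread pool behind an async interface. A minimal sketch, with an illustrative filename scheme:

import aiofiles  # third-party: pip install aiofiles

async def write_file(n, content):
    # aiofiles delegates the blocking write to a worker thread,
    # so the event loop stays free to service other tasks.
    async with aiofiles.open(f"{n}.html", "w") as f:
        await f.write(content)
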

I appreciated the breakdown! Just wondering, how does HasData tackle challenges in async scraping?

sterlingberbe

Thanks for clearing things up! I’m wondering, could HasData be a good tool to use alongside this async approach?

yasharlrstani

Why not use aiofiles to make the write function async?

optimiserlenergie

Why is the write_file function async? It looks like the contents of the function are synchronous, unless I'm not understanding something.
Would it make a difference if it was just:

def write_file(n, content):  # without the async keyword
    ...

async def scrape_task(n, url):
    content = await download_file(url)
    write_file(n, content)  # without the await keyword

This should work and be the same speed, if not inconsequentially faster relative to the network calls.
Anyway, let me know if I'm not understanding this correctly. I enjoyed the video :)

andrew_arderne

The write_file function must be executed using loop.run_in_executor, since writing to a file is not handled by the I/O multiplexer.

steelday
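
A sketch of the approach the comment above describes, using only the standard library (download_file is the fetch coroutine from the video; the filename scheme is illustrative). On Python 3.9+, asyncio.to_thread(write_file, n, content) is an equivalent shorthand:

import asyncio

def write_file(n, content):
    # A plain blocking write; below it runs in the default thread pool
    # instead of blocking the event loop.
    with open(f"{n}.html", "w") as f:
        f.write(content)

async def scrape_task(n, url):
    content = await download_file(url)  # the async fetch coroutine from the video
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, write_file, n, content)
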

Constructing an aiohttp.ClientSession on each call is very expensive.

zanityplays
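
For reference, the usual fix for the cost noted above is to create one session up front and share it across tasks, since a session pools and reuses connections. A sketch with illustrative names:

import asyncio
import aiohttp

async def fetch(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def main(urls):
    # One ClientSession for the whole run: connections are pooled
    # and reused rather than re-established per request.
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, u) for u in urls))

pages = asyncio.run(main(["https://example.com", "https://example.org"]))
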

So... LN28 is the counterpart of JavaScript's Promise.all.

akagent
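
For readers making the same JavaScript comparison: asyncio.gather behaves much like Promise.all, awaiting several tasks concurrently and returning their results in order. A tiny sketch:

import asyncio

async def task(n):
    await asyncio.sleep(0.1)  # pretend work
    return n

async def main():
    # Comparable to: await Promise.all([task(1), task(2), task(3)])
    results = await asyncio.gather(task(1), task(2), task(3))
    print(results)  # [1, 2, 3]

asyncio.run(main())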