Web Scraping with GO... Easy AND Fast?!

This has to be the easiest way I've ever implemented async web scraping, and it's super fast. Go and Colly have been great for pure HTML web scraping, allowing easy crawling of pages and data gathering, and providing a generally good experience overall. I know it's not Python, but I think it's well worth checking out.

Disclaimer: These are affiliate links and as an Amazon Associate I earn from qualifying purchases

# timestamps

00:00 Intro
00:30 Scraper Code
10:20 Refactor & Async
Comments
Author

Excellent one. I might start using Go now

aflous
Author

Great! Thanks for this vid! The scraper in Go looks clearly faster than in Python, and if you also use goroutines, the difference in speed will probably be even greater. That said, Python can also use multi-process scraping.

alexpresley
Author

Now I need to search for python vs go benchmarks for scraping... Hehe

hrvojematosevic
Author

and you could try Rust for web scraping as well

TheJFMR
Author

Great video. We use Go in the software engineering team at my bank. Still early days, but it's gaining traction. John, have you used Rust yet?

skillswithsid
Author

Hi John, loving your videos. Just came across your channel yesterday. I've been scraping for about 20 years using a mixture of PHP/cURL, then Python with string manipulation, so it's great to see how to do it more efficiently! Do you have a video rounding up what IDE and tools you use? Nothing jumped out. I'm currently using Visual Studio, but there may be something better?

sfraser
Author

Great video! I was wondering if you could make a Golang scraper that first logs in to a web page and afterwards scrapes it!

geontral
Author

Thanks for the video! Could you please show us how to scrape data from multiple websites with a single Go script? [The selectors will be different for each website.]

proMehediBD
Author

Very good - I went from 40s to 8s on my site. Very impressive. Non-async would be too slow for me. One general issue I have is how to merge information for the same product when that info is spread across different sections of the page. On a single page, how do you effectively match up what belongs to what? I know one way would be to find the common parent class and use one h *colly.HTMLElement block, but maybe there is another way with the structs?

valuetraveler
Author

Can you help with setting up Neovim like yours, especially that popup window for running processes?

namanshahi
Author

Can you show us how you do Ajax API scraping with Golang?

alexandreprince
Author

Does anything have to be installed/set-up in advance?

DM-pypj
Author

Sir, please help me, I need some help. Can you make a video on how to log in to a css website through Python ...
import requests
from bs4 import BeautifulSoup
help me sir

XENOCYTES
Author

I love your content. Please don't start with the ridiculous clickbait stupid-face thumbnail games.

StupidInternetPeople
Author

Hello John, great video! Is Colly faster than asyncio + aiohttp?

papipapajohn