Python Web Crawler Tutorial - 17 - Running the Final Program

Comments
Author

This is gold, man! Thanks for the awesome tutorial. The amount of detail given in the tutorials was just right.
I'm sorry to see your web page went down, but I hope you keep the good content flowing :) Cheers!

rushas
Author

You're one of the better teachers. Clear and straight to the point. You've just got a new subscriber.

jordansmith
Author

Thank you a lot, man. Seriously, love you, this is a diamond in the jungle.

manuelespinoza
Author

Excellent series. Well worth watching. Thank you for doing this! :)

mattotoole
Author

Great series. A nice little addition would be to exclude the anchor portion of links (if a URL has one).
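A minimal sketch of that idea, using the standard library's urllib.parse.urldefrag to drop the fragment before a link is queued (the strip_anchor helper name is made up for illustration):

    from urllib.parse import urldefrag

    def strip_anchor(url):
        # urldefrag splits a URL into (url_without_fragment, fragment)
        clean_url, _fragment = urldefrag(url)
        return clean_url

    print(strip_anchor('https://example.com/page.html#section-2'))
    # prints: https://example.com/page.html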

PeteMoxon
Author

THANKS man! Loads of Love from India <3

aishwaryakumar
Author

Thanks so much! Favorite YouTube tutorial so far, by far :D

<3

jimmysoonius
Author

Great videos! Keep it up! They have really helped me learn Python!

MattCamp
Author

I'm trying to limit the extent of the web crawler by using the following line in the crawl method:
if len(queued_pages) > 0 and len(crawled_pages) < 1000:
However, it looks like this method is only called once in a while (checked with print statements) when the crawler runs. Why is it called so rarely, and how can I limit the pages checked?
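A likely explanation is that the check sits in a method that only runs when the queue is re-read, not once per link. A sketch of checking the limit inside the worker loop instead, so it runs for every URL (crawled_pages, crawl_page, and MAX_PAGES mirror the comment's names and are illustrative, not the tutorial's exact ones):

    MAX_PAGES = 1000

    def work(job_queue, crawled_pages, crawl_page):
        # Runs in each worker thread; the limit is checked for every link,
        # not only when the crawl method happens to be called.
        while True:
            url = job_queue.get()            # blocks until a link is available
            if len(crawled_pages) < MAX_PAGES:
                crawl_page(url)
            job_queue.task_done()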

philh
Author

This is a great tutorial! Thanks! I have one question. I tried crawling a WordPress site. Does this only work with HTML pages? I seem to remember you saying it would work with PHP, but I'm not sure.

jeremyc
Author

hey man thanks, you just saved my life!

nora_osipova
Author

Really nice series of videos, and really useful. Thank you!!!

kijoupa
Author

Why is the work function body wrapped in a while True loop?
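The loop is what keeps each worker thread alive: the thread blocks on the queue, handles one link, then goes back for the next. A minimal, self-contained sketch of that pattern (not the tutorial's exact code):

    import threading
    from queue import Queue

    job_queue = Queue()

    def work():
        while True:                    # without this, the thread would handle
            url = job_queue.get()      # a single link and then exit
            print('crawling', url)     # stand-in for the real crawl step
            job_queue.task_done()

    for _ in range(4):                 # four daemon threads share one queue
        threading.Thread(target=work, daemon=True).start()

    job_queue.put('https://example.com/')
    job_queue.join()                   # waits until every queued link is handled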

mmxhhqz
Author

When crawling a website found through Google, I'm getting an error like "Error: can not crawl page". Please help me work out how to solve this.
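That message usually comes from a try/except that hides the real cause. A sketch of printing the underlying exception, and sending a browser-like User-Agent since some sites reject the default Python one (the fetch name and header value are just for illustration):

    from urllib.request import Request, urlopen

    def fetch(url):
        try:
            req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
            return urlopen(req, timeout=10).read().decode('utf-8', errors='ignore')
        except Exception as e:
            # print the URL and the real exception instead of a generic message
            print('Error: can not crawl page', url, '-', e)
            return ''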

dineshd
Author

Hi Bucky, I'm wondering: I assume Google is not using txt files to save their crawled pages, so I want to know, would it be more efficient if I modified your project to use an SQLite database?
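For anything beyond small crawls it likely would be: SQLite ships with Python and handles duplicate checking and lookups better than re-reading text files. A minimal sketch with the standard sqlite3 module (the table and function names are made up for illustration):

    import sqlite3

    conn = sqlite3.connect('crawler.db')
    conn.execute('''CREATE TABLE IF NOT EXISTS pages (
                        url     TEXT PRIMARY KEY,
                        crawled INTEGER DEFAULT 0
                    )''')

    def queue_url(url):
        # INSERT OR IGNORE skips duplicates, like adding to a set
        conn.execute('INSERT OR IGNORE INTO pages (url) VALUES (?)', (url,))
        conn.commit()

    def mark_crawled(url):
        conn.execute('UPDATE pages SET crawled = 1 WHERE url = ?', (url,))
        conn.commit()

    def next_url():
        row = conn.execute('SELECT url FROM pages WHERE crawled = 0 LIMIT 1').fetchone()
        return row[0] if row else None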

lowkeygaming
Author

Saving this data is great, but how do I save the crawl data in a database and organize it?

MrBesharam
Author

Mine only works on the Wikipedia homepage. On every other website it crawls the homepage and stops.
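A common cause of that behaviour is that links on other sites are relative (e.g. /about) and never get turned back into full URLs, so nothing new lands in the queue. A sketch of resolving them with urljoin from the standard library:

    from urllib.parse import urljoin

    base_url = 'https://example.com/blog/index.html'
    for href in ['/about', 'post-2.html', 'https://other.example.org/']:
        print(urljoin(base_url, href))
    # /about       -> https://example.com/about
    # post-2.html  -> https://example.com/blog/post-2.html
    # absolute links such as https://other.example.org/ pass through unchanged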

lukeshuttleworth
Author

I am able to crawl just fine up to about 3510 pages, then it just freezes with over 1000 pages in the queue. Why is this happening? Is the connection closing?
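One possible cause, though a hung network request is another: if a worker thread hits an uncaught exception it dies without calling task_done(), and the program then waits on queue.join() forever while links pile up. A sketch of keeping the workers alive (the names are illustrative, not the tutorial's exact ones):

    def work(job_queue, crawl_page):
        while True:
            url = job_queue.get()
            try:
                crawl_page(url)
            except Exception as e:
                print('worker error on', url, '-', e)
            finally:
                job_queue.task_done()   # always acknowledge, even after a failure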

codyspate
Author

For some reason it's throwing an exception: descriptor 'add' requires a 'set' object but received a 'str'

And that's all it does: it creates the directory and files but leaves them blank, unfortunately.
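That exact error appears when add() is called on the set class instead of a set instance, typically because a variable was assigned set (the type) rather than set() (an empty set); the parsed links then never get stored, which would also explain the blank files. A small illustration:

    queue = set          # bug: this is the set type itself, not a set
    # queue.add('https://example.com/')   # raises: descriptor 'add' requires
    #                                     # a 'set' object but received a 'str'

    queue = set()        # fix: create an actual set instance
    queue.add('https://example.com/')
    print(queue)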

roymiller
Author

Okay, but what if the site runs on Java? Then it does not find any link other than just the domain name.

kamil