Advanced Web Scraping with Puppeteer: Avoid Looking Like a Bot and Pass Authentication!

In this video, we're going to take a look at two Puppeteer improvements. First, how can you appear as if you were not a robot? That can be very helpful for avoiding bot protection or captchas. Second, how do you get through a website's authentication? Let's dive right in!

Thanks for watching, I wish you lots of fun implementing these Puppeteer tips in your own projects! Remember, some companies do not allow scraping of their websites, so I advise just scraping your own. :^)

Comments

Each video adds something "advanced". Let's continue. Thank you.

fitterboss

That’s so interesting. I didn’t even know we could have this report as an image. Well, I think I’ll spend my weekend working on my bot. But how do you host them? Do you have a Raspberry Pi at home, or do you use a regular online host?

codewithguillaume

“well… we look like a bot. maybe because we are a bot” 🤣

legend. great video

righttiming

I have a question: instead of handling authentication in the script, why can't I just log in manually and then pass the cookie into the script? Is that harmful or something?

thomasdinhk
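On the cookie question above: reusing a session captured after a manual login is a common approach. The sketch below is only an illustration, assuming `page` is a Puppeteer `Page` and that the cookies were saved earlier in the shape Puppeteer returns them. The helper names `freshCookies` and `applySavedCookies` are hypothetical, and depending on your Puppeteer version the cookie methods may live on the browser context rather than the page.

```javascript
// Sketch only: `page` is assumed to be a Puppeteer Page object.
// Helper names are hypothetical, not part of Puppeteer.

// Keep cookies that have not expired. Session cookies are reported
// with expires === -1 (or no expires field), so those are kept too.
function freshCookies(cookies, nowSeconds = Date.now() / 1000) {
  return cookies.filter(
    (c) => c.expires === undefined || c.expires === -1 || c.expires > nowSeconds
  );
}

// Hand the still-valid cookies to the page before navigating.
// Returns how many cookies were applied.
async function applySavedCookies(page, cookies) {
  const usable = freshCookies(cookies);
  if (usable.length > 0) {
    await page.setCookie(...usable);
  }
  return usable.length;
}

// Possible usage (file name and URL are placeholders):
//   const cookies = JSON.parse(fs.readFileSync('cookies.json', 'utf8'));
//   await applySavedCookies(page, cookies);
//   await page.goto('https://example.com/account');
```

Saved cookies are effectively credentials, so they should be stored as carefully as a password, and the session will still expire on the site's schedule.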

I am passing HTML as a string to it and making PDFs, but the images are not loading, although the same thing works in Node.js.

mohitpunia
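On the missing images in PDFs above: one possible cause, and this is an assumption about the commenter's setup, is that the PDF is printed before the image requests finish. A minimal sketch, assuming `page` is a Puppeteer `Page`; the helper name `htmlToPdf` is hypothetical.

```javascript
// Sketch only: `page` is assumed to be a Puppeteer Page object.
// `htmlToPdf` is a hypothetical helper name.
async function htmlToPdf(page, html, pdfOptions = { format: 'A4', printBackground: true }) {
  // 'networkidle0' waits until there have been no network connections
  // for a short period, giving <img> requests time to complete.
  await page.setContent(html, { waitUntil: 'networkidle0' });
  // Resolves with the PDF bytes once printing is done.
  return page.pdf(pdfOptions);
}
```

Note that a page created from a string has no meaningful base URL, so relative image paths such as `img/logo.png` are unlikely to resolve; absolute URLs or data URIs are the safer choice in that case.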

Would this still work in 2024? Or have big companies come up with a 'defence' already?

mihaelacostea

How do you bypass different types of captchas? Please make a video on it.

ahmadfraz

Thx Kevin. Just wondering if one can use the same code with puppeteer-core

jameskayihura

After 2 or 3 requests, Amazon fails.
I tested the plugin modifications and the stealth setup from the video, and it still fails after the same number of requests.
I'm going to have to learn and test with Crawlee.

rodrigodanielss

Have you tried doing the same on eBay and logging in? They still detect it even if you use stealth!?

henriquematias

Hi Josh, just wondering how you used CJS modules along with ES6 modules, because I can't seem to make it work.

eternl_sunshine

bro, I actually found out that u can set headless to false in the launch options and it works

makhmudjonjamoldinov
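The `headless` tip above can be sketched as follows. This is only an illustration: `puppeteer` stands for whichever launcher object you already import (plain `puppeteer` or a `puppeteer-extra` wrapper), and the helper name `launchBrowser` and its `visible` flag are hypothetical.

```javascript
// Sketch only: `puppeteer` is assumed to be an object exposing launch(),
// such as the puppeteer or puppeteer-extra module.
// `launchBrowser` and `visible` are hypothetical names.
async function launchBrowser(puppeteer, { visible = false } = {}) {
  // headless: false opens a real browser window, which some sites
  // treat differently from a headless session.
  return puppeteer.launch({ headless: !visible });
}

// Possible usage:
//   const browser = await launchBrowser(puppeteer, { visible: true });
```

A visible browser needs a display, so this mostly suits local runs; on a server without one it would need something like a virtual display.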

How do I combine multiple Node.js and Puppeteer scripts into one file?

AnoSkinz

Thank you so much! This helped me out on a very important project.

sebastianruiz

If you do npm install now, you no longer need to add executablePath to your code.

kelvin

What's the ultimate solution for resolving captcha?

Leofmoura

How do we solve captcha with puppeteer KDB?

tushswe

So I guess if the login requires a Gmail sign-in, it wouldn't work, because the browser that is opened doesn't seem to allow the Gmail login API.

JustinK

do you know any similar plugins for python

Reaaa

Looks like waitForTimeout will soon be deprecated. Is it a way to enforce headless: true? 🤔

splenwilz
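On the `waitForTimeout` point above: if that method is unavailable in your Puppeteer version, a plain promise-based pause does the same job without depending on the library. A minimal sketch; `delay` is a hypothetical helper name, not a Puppeteer API.

```javascript
// Sketch only: a library-independent pause.
// `delay` is a hypothetical helper, not part of Puppeteer.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Possible usage inside an async function:
//   await page.click('#submit');
//   await delay(1000); // pause roughly one second
```

Fixed pauses are brittle, so where possible it is usually better to wait for a specific condition, for example with `page.waitForSelector`.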