The Biggest Issues I've Faced Web Scraping (and how to fix them)


0:00 Problems I face web scraping
1:03 Web Scraping Basics Overview
4:38 Handling Complex Web Technologies
6:24 Script Optimization + Error Handling + Adaptive Algorithms
8:23 AI-Driven Proxy Management, Anonymity, and Intelligent Rate Limiting
10:23 How to Handle Extracted Data
12:22 Ethical AI and Legal Compliance
14:15 Thanks for Watching!

Don't know why you'd want to follow me on other socials. I don't even post. But here you go.
Comments
Author

In my opinion, having developed multiple web scraping applications, half of the time is not spent coding but trying to reverse engineer the web application. Simple ones are just a matter of looking at requests in dev tools and making the API calls manually, while the most complicated ones involve backtracing how content is loaded on the page to find the JS code responsible for it. Basically it's 70% reverse engineering and 30% coding, if you do things the smart way.

PaoloAnzani
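
A minimal sketch of the "simple case" described above: calling a JSON endpoint spotted in the dev tools Network tab instead of parsing rendered HTML. The URL, the headers, and the `items`/`name`/`price` response shape are all assumptions for illustration, not anything shown in the video.

```python
import json
import urllib.request

# Hypothetical endpoint, as it might appear in the dev tools Network tab.
API_URL = "https://example.com/api/products?page=1"


def build_request(url: str) -> urllib.request.Request:
    """Build a request with headers similar to what the page's own JS sends."""
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "User-Agent": "Mozilla/5.0 (compatible; example-scraper)",
        },
    )


def parse_products(payload: str) -> list:
    """Keep only the fields of interest from the (assumed) JSON shape."""
    data = json.loads(payload)
    return [
        {"name": item["name"], "price": item["price"]}
        for item in data.get("items", [])
    ]


def fetch_products(url: str = API_URL) -> list:
    """Call the endpoint directly; no browser or HTML parsing involved."""
    with urllib.request.urlopen(build_request(url), timeout=10) as resp:
        return parse_products(resp.read().decode("utf-8"))
```

Keeping `parse_products` separate from the network call makes it possible to check the parsing against a response body saved from dev tools before any live request is sent.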
Author

interesting timing to see this video, literally the day after I completed my first full-stack application which literally revolves around web-scraping :D

delsix
Author

I used to web scrape all the time, but stupid JS frameworks' obfuscated CSS class names have made it very difficult.

Dalamain
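
One possible workaround for obfuscated class names, sketched under the assumption that the page still exposes a stable non-class attribute (here a hypothetical `data-testid`): match on that attribute and ignore `class` entirely. Not every site offers such a hook, so treat this as an illustration rather than a general fix.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class AttrTextExtractor(HTMLParser):
    """Collect the text of every element carrying a given attribute value."""

    def __init__(self, attr: str, value: str) -> None:
        super().__init__()
        self.attr, self.value = attr, value
        self.depth = 0  # > 0 while inside a matching element
        self.results = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # nested tag inside a match
        elif dict(attrs).get(self.attr) == self.value:
            self.depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)


def extract_by_attr(html: str, attr: str, value: str) -> list:
    parser = AttrTextExtractor(attr, value)
    parser.feed(html)
    return parser.results


# Class names are random-looking, but the data attribute is stable.
SAMPLE = (
    '<div class="sc-a1b2c3"><span class="x9Qz" data-testid="price">$19.99</span>'
    '<span class="k3Lm">ignored</span></div>'
)
print(extract_by_attr(SAMPLE, "data-testid", "price"))
```

The same idea applies with other anchors that tend to survive rebuilds, such as ARIA attributes, visible label text, or the position of an element relative to a stable parent.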
Author

Thank you for the amazing video! Much appreciated as a young web developer. By the way, none of the buttons lit up or did any animations... I am a subscriber, so I don't know if that's why.
Peace!!!

redbill
Author

Yeah. Scraping a dynamic website really makes me want to scream like Linus Torvalds at NVIDIA. And I also hate Cloudflare 😂

yafethtb
Author

I remember starting to watch your videos when I was entering my computer science BA, and as a 28-year-old with 1 semester left to graduate, you're still uploading good content that's unique. Never get tired of your vids, keep it up brother. I'm also concerned with the job market, can you make a vid about new grad CS students? For example, it seems almost every job wants front end or something and my school never taught any of it.

xlafxx
Author

Hi Forrest. I was wondering how you still feel about AI and the future of software engineering. With ChatGPT out for over a year now, have your views changed much? Maybe a good topic for another vid.

Cryogenics
Author

AFAIK the button highlighting is a feature based on video subtitles, including those generated automatically, but still somewhat random. I didn't catch those because I was already subscribed and liked the video a moment before you said it.

EduardoEscarez
Author

I am working on building a project that relies heavily on scraping, so I've been doing a lot of research. And it's really hard to find anything good that is not sponsored by brightdata. I get it, their marketing team has done a great job of tapping a perfect niche of creators who provide valuable information, but this also creates a problem: almost every good resource ends up being tied to brightdata, and it's not something I want to pay for when starting a hobby project.

Anyway, this is a great video either way. I learned a lot of things I hadn't considered in my planning, like ETL (that's a new rabbit hole I need to dive into) or adaptive content extraction to account for layout changes. I was just assuming I would set up reporting to notify me when I start getting no content, and then I would fix it.
So thank you for that.

Do you set up Redis or something so that some requests are served from a cache of recently requested data rather than scraping again or hitting the DB? Is that necessary?
And at what point should a webhook be set up, and for what purpose exactly?

Thank you

vd
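
A rough sketch of the caching idea raised in the question above: serve a recently fetched page from a cache and only scrape again once the entry is older than a time-to-live. An in-memory dict stands in for Redis here, and `TTLCache`, `get_or_fetch`, and the `fetch` callable are hypothetical names. With Redis, the same effect is usually achieved by storing each key with an expiry.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory stand-in for a Redis-style cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._store: Dict[str, Tuple[float, str]] = {}

    def get_or_fetch(self, url: str, fetch: Callable[[str], str]) -> str:
        """Return the cached body if still fresh, otherwise fetch and store it."""
        now = self.clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        body = fetch(url)  # the expensive scrape happens only on a miss
        self._store[url] = (now, body)
        return body
```

Whether this is necessary depends on how often the same URL is requested and how stale the data may be. For a hobby project that scrapes each page once per run, a cache may add little.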
Author

I really like the way you explain things and also the pronunciation issues

olhodetamarutaca
Author

When I see a brightdata sponsorship, I instantly stop watching. Paying brightdata is not a web scraping skill.

ramelox
Author

To be honest, I subscribed because the button lit up. Also, I love your content.

danielabraham
Author

dude is literally Gilfoyle from Silicon Valley (love your vids)

Vrrow
Author

This guy gets it—I’ve been there. I can’t wait to make this all an easy ass python plugin

xdcountry
Author

The subscribe button didn't light up because I was already subscribed 👍

nrgstudios
Author

Does web scraping fall under data science or software engineering?

javancheongyujing
Author

Can you recommend a course to learn web scraping? One that teaches the tools and techniques you mentioned, plus other concepts.

olasunkanmioyetunji
Author

What are the best AI scraping apps? Any suggestions/recommendations? Just looking at how our nonprofit organization aligns with other organizations within a county of California in order to partner with them.

manumartinezkcxu
Author

Is there a reason/advantage to using Bright Data's "scraping browser" product instead of integrating their proxy and IP rotation services into a script I'm running on my own server?

brianmorin
Author

This video is what I need. But whoa, the screens with code change so fast... I'm too old at 35 to be able to push the pause button so fast 😅 Do you have some links with those hacks?

dmytro-skh