Web Scraping with GPT-4 Vision AI + Puppeteer is Mind-Blowingly EASY!



👉 Discord: all my courses have a private Discord where I actively participate

⏱️ Timestamps:
00:00 Intro
00:48 ChatGPT for HTML
02:01 OpenAI API
04:08 Puppeteer
04:28 Avoid restrictions (Bright Data)
05:11 Get HTML with Puppeteer + Proxy
08:35 Processing HTML
10:00 Give HTML to OpenAI
12:19 ChatGPT for scraper code
14:41 Vision API
19:40 Vision API pricing
21:12 Vision vs Text pricing
22:52 Future of web scraping (AI + Bright Data)

#webdevelopment #programming #coding #reactjs #nextjs
Comments

🎯 Key Takeaways for quick navigation:

00:00 *🌐 Web scraping has been revolutionized by AI, particularly with the latest Vision AI model, making data extraction more efficient.*
01:07 *💻 Manually copying HTML into ChatGPT works for one-off extraction, but OpenAI's API offers a programmable, scalable alternative.*
02:16 *🔄 Using Puppeteer with Bright Data's scraping browser helps circumvent website restrictions and rate limiting during scraping.*
05:33 *🖥️ Puppeteer allows for easy scraping of HTML content, but there's a need to manage and clean up the extracted data before analysis.*
08:35 *💡 Extracting only necessary data from HTML can optimize costs when using OpenAI's models for analysis.*
12:17 *💰 Text-based scraping methods can be cost-effective, but they require ongoing maintenance due to HTML structure changes.*
14:49 *📸 Utilizing OpenAI's GPT-4 Vision API enables data extraction from screenshots, potentially offering a more robust solution for complex web scraping tasks.*
17:52 *🖼️ Using base64 encoding allows passing images to models, enhancing data processing capabilities.*
18:49 *💸 Consider cost-effectiveness when choosing between complex HTML-based or text-based approaches for web scraping.*
19:58 *🎚️ Lowering image resolution can significantly decrease token usage in web scraping, but it may increase the likelihood of extraction errors.*
20:53 *🖼️🔄 Balance image resolution and price when utilizing Vision API for web scraping, as higher resolution images incur higher costs.*
21:19 *🧹 Clean up HTML before web scraping to reduce token usage and ensure accuracy in results.*
22:57 *🤖 Explore advanced features of AI tools, such as identifying clickable elements, to enhance web scraping automation.*
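
The HTML-cleanup step from the takeaways above (08:35, 21:19) can be sketched in a few lines of Node.js. This is illustrative only — the helper name and regexes are my own, not code from the video; in practice the raw markup would come from Puppeteer's `page.content()`:

```javascript
// Hypothetical helper (my naming, not the video's): strip the parts of a
// page's HTML that carry no extractable data -- scripts, styles, inline SVGs,
// comments, and presentational attributes -- so fewer tokens reach the model.
function cleanHtml(rawHtml) {
  return rawHtml
    .replace(/<script[\s\S]*?<\/script>/gi, "")        // executable code
    .replace(/<style[\s\S]*?<\/style>/gi, "")          // stylesheets
    .replace(/<svg[\s\S]*?<\/svg>/gi, "")              // inline vector graphics
    .replace(/<!--[\s\S]*?-->/g, "")                   // HTML comments
    .replace(/\s(?:class|style|on\w+)="[^"]*"/gi, "")  // presentational attrs
    .replace(/\s+/g, " ")                              // collapse whitespace
    .trim();
}

// Example: the cleaned markup keeps the text content but drops the noise.
const page = `<div class="card" style="color:red">
  <script>trackView()</script>
  <h2>Laptop</h2><span>$999</span>
</div>`;
console.log(cleanHtml(page));
// → <div> <h2>Laptop</h2><span>$999</span> </div>
```

The cleaned string is what you would paste into the chat-completion prompt; since API pricing is per token, every stripped attribute and script block directly lowers the cost of each scrape.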

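For the Vision API step (14:41), the screenshot is passed inline as a base64 data URL rather than a public image URL. Below is a minimal sketch of the request body, assuming a vision-capable chat-completions model — the function name, model choice, and prompt are placeholders of mine, and the base64 string would in practice come from Puppeteer's `page.screenshot({ encoding: "base64" })`:

```javascript
// Build a chat-completions request that attaches a screenshot as an inline
// base64 data URL (OpenAI's image-input message shape). Nothing is sent here;
// this only constructs the JSON body you would POST to the API.
function buildVisionRequest(base64Png, prompt, detail = "low") {
  return {
    model: "gpt-4o", // placeholder: any vision-capable model
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          {
            type: "image_url",
            image_url: {
              url: `data:image/png;base64,${base64Png}`,
              detail, // "low" uses far fewer tokens than "high"
            },
          },
        ],
      },
    ],
  };
}

const body = buildVisionRequest("iVBORw0KG...", "List each product and its price as JSON.");
console.log(JSON.stringify(body, null, 2));
```

The `detail` field is the knob behind the pricing discussion at 19:40–21:12: `"low"` processes a downscaled image for a small fixed token cost, while `"high"` tiles the full-resolution screenshot and costs proportionally more.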
Made with HARPA AI

hxxzxtf

This is such a timely video — I'm doing something similar to resurrect a website from the Wayback Machine.

zeeeeeman

Wow, this video provides GREAT value. Just in time for what I'm doing now. Thanks mate!

beemerrox

Thank you infinitely for sharing this masterclass lesson with the universe for free. Subbed

SupCortez

What an amazing video — it's so niche but so useful.

reidevanson

Hey man, mind if I ask what programming languages you know other than JavaScript/TS?

hellokevin_

It's interesting, but what if I want pagination?
I would still need to select the next button the old way.
Is there any other way to handle pagination?

imranhrafi

Great video. One question though: what about hallucination? How can we be sure the model isn't making data up?

felipeblin

Can you create a video on how to deploy Puppeteer and Next.js to Vercel?

juliushernandez

Have you thought about or tried using a local model to scrape? It would avoid all the API costs.

RobShocks

Where can I learn basic coding from scratch to be able to do that?

dupatrio

And how do you get to the next page to scrape?

gregsLyrics

How would you do this using Braina AI? Braina can run GPT-4 Vision.

LifeTrekchannel

This is a great video. But the problem with scraping has hardly ever been parsing the HTML or maintaining the parsers.

The biggest problem is efficiently accessing websites that actively try to block you by gating their content behind a login or captchas. Then comes IP blocking (or worse, data obfuscation) if you scrape their website at high volume.

Lars

I am interested in creating a price comparison website featuring approximately 10-20 shops, each offering around 10,000 similar products. Unfortunately, these shops do not provide APIs for direct access to their data. What would be the most efficient approach to setting up such a website while keeping maintenance costs reasonable?

amadeuszg

I am scraping (dumping HTML) with Python code using Selenium (approx. 60,000 articles), then creating vector embeddings for Llama 3 and asking it to write articles for me.

amitjangra