Turn ANY Website into LLM Knowledge in SECONDS

One of the biggest challenges we face with LLMs is that their knowledge is too general and doesn’t cover anything new. That’s why RAG is such a huge topic in AI right now - it’s a method for giving an LLM external knowledge you curate so it can become an expert at something it wasn’t before - a specific AI framework, your ecommerce store, you name it. The problem is that the “curate” step can be very difficult and slow.

That is where Crawl4AI comes in! Crawl4AI is an open-source web crawling framework specifically designed for scraping websites and formatting the output in the BEST possible way for an LLM to understand. The best part is it solves a LOT of problems we typically have with systems that crawl websites - usually they are slow, resource intensive, and complicated. Crawl4AI, on the other hand, is VERY fast, intuitive, easy to set up, and extremely memory efficient.

In this video, I show you how to use Crawl4AI to easily crawl websites for LLMs in just seconds, and at the end I even show you a RAG AI agent I’ve built to be a “Pydantic AI” framework expert, using Crawl4AI to build the knowledge base. You could take this and use it for any website you want. In the next video, I'll do a deep dive into this agent!

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Register now for the oTTomator AI Agent Hackathon with a $6,000 prize pool!

All code for this Crawl4AI RAG Agent can be found here:

Crawl4AI GitHub:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

00:00 - The Beauty of Crawl4AI
02:16 - Why Crawl4AI?
05:25 - Basic Crawl4AI Example - Single Page Crawl
06:56 - Crawling Multiple Pages
08:58 - Ethics of Web Scraping
10:01 - Crawling Multiple Pages Continued
12:24 - FAST Parallel Page Crawling
15:19 - Crawl4AI RAG AI Agent
17:48 - Outro
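
For readers who want a feel for the "multiple pages" and "parallel crawling" chapters before watching, here is a rough sketch using only the Python standard library. It is not the code from the video and it does not call Crawl4AI: `urls_from_sitemap`, `crawl_all`, `fake_fetch`, and the example.com URLs are hypothetical placeholders, and `fake_fetch` stands in for whatever real crawler call you would make. The idea shown is simply: read page URLs out of a sitemap, then crawl them concurrently with a cap on how many run at once.

```python
import asyncio
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Tiny inline sitemap so the sketch runs without network access (made-up URLs).
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/</loc></url>
  <url><loc>https://example.com/docs/agents/</loc></url>
</urlset>"""


def urls_from_sitemap(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document, in order."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


async def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for a real crawler call; returns placeholder Markdown."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"# Page at {url}"


async def crawl_all(urls, fetch_page, max_concurrent=5):
    """Run fetch_page over all urls, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def crawl_one(url):
        async with semaphore:  # wait for a free slot before fetching
            return url, await fetch_page(url)

    results = await asyncio.gather(*(crawl_one(u) for u in urls))
    return dict(results)  # maps each URL to its fetched content


if __name__ == "__main__":
    urls = urls_from_sitemap(EXAMPLE_SITEMAP)
    pages = asyncio.run(crawl_all(urls, fake_fetch, max_concurrent=2))
    for url, markdown in pages.items():
        print(url, "->", markdown)
```

Capping concurrency with a semaphore is one simple way to keep a crawl fast without hammering the target site, which ties in with the ethics chapter above. In a real setup you would swap `fake_fetch` for an actual crawler call and load the sitemap from the site itself.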

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Join me as I push the limits of what is possible with AI. I'll be uploading videos at least two times a week - Sundays and Wednesdays at 7:00 PM CDT! Sundays and Wednesdays are for everything AI, focusing on providing insane and practical educational value. I will also post sometimes on Fridays at 7:00 PM CDT - specifically for platform showcases - sometimes sponsored, always creative in approach!
Comments

Check out the next video where I turn this Crawl4AI implementation into a full RAG AI Agent!

ColeMedin

This kind of content is exactly why I subscribe to your channel! You keep me up to date on how to stay efficient!

marckeelingiv

This is literally the project I was trying to dev. Glad to see there's a ready-made solution and I can move on to the next step. 😊

vincentmayer

I'm in awe of the amount of information and value this video has. Cole is an absolute legend.

zakariaabderrahmanesadelao

Over the last 2 years I learned to despise AI channels with clickbait "shocked" videos and lots of fluff. With that said, I was very pleasantly surprised with the quality of this video & the valuable and informative content there. Thanks a lot, Cole :)

krzemian

Wow, talk about timely. I think I can use this concept to build an expert for a new software system we are going to implement, and I hear their training material is not very good. Thanks for putting this out there for all of us to benefit from.

kenchang

Can we get a video about CAG (Cache Augmented Generation)? A lot of people are saying it is 10-20x faster and more efficient than RAG. Would love to see a video on this trending topic ☺

omarnahdi

Dude, this was a really amazing tutorial that I didn't fully realize I needed but did. Bravo

gearscodeandfire

Think you have what it takes to build an amazing AI agent?
Absolutely yes. This is the most helpful tool I need for my project.

srikaramekorama

Outstanding work, Cole! The way you broke down RAG, CrewAI, and the parallel processing setup was incredibly clear and practical. I especially appreciate how you tied it all together for a practical use case and handed it to us wrapped up with a bow on GitHub. Thanks for what you do for us all. Can't wait for the next deep dive!

ifnotnowwhen

Thanks a million: this is exactly what I'm looking for. 100% love your content. Still learning programming and how to use AI: your channel is a massive inspiration.

jevadeka

Amazing!! Excited for the next video about how you curated and made the agent.

leisureclub_

dude, I just spent three days creating my custom scraper … now I might shift to this

michael_gaio

Love Crawl4AI and had a script to help give context to my LLM when coding: from the CLI, it scrapes one page of a URL and saves it to a new knowledge folder. You take this from 1 to 10x... going to be changing my script to use this!! Thanks to you and Unclecode!

IdPreferNot

this is awesome, i have a side project in the works this is going to speed up 100x. Not to mention some larger ideas i've been developing. Just subscribed!

aibarra

This is wow and awesome ❤. Was waiting for this video for a long time.

algotrade

You are the best teacher, who selflessly teaches people gold stuff

haidersyed

Wow this is perfect. I've been trying to get the ESPHome API docs into my context window by scraping or compiling the Doxygen docs into a PDF. This is much better.

What a high quality video

thespencerowen

I was expecting to see a readme markdown created by your crawler. Nice information as ever. Thank you

patoescl

Insights By "YouSum Live"

00:00:00 Large language models need external knowledge integration
00:01:20 Crawl4AI simplifies web scraping for LLMs
00:01:41 Crawl4AI is open-source and efficient
00:03:30 Markdown format improves LLM comprehension
00:08:01 Sitemaps help extract multiple URLs efficiently
00:09:01 Ethics of web scraping are crucial to consider
00:12:54 Parallel processing enhances scraping speed significantly
00:15:00 Crawl4AI enables fast knowledge base creation
00:16:00 RAG agents can be built using scraped data
00:18:07 Crawl4AI is a game changer for AI agents


ReflectionOcean