The 5 Levels Of Text Splitting For Retrieval

Outline:
0:00 - Intro
3:42 - Theory
6:57 - Level 1: Character Split
16:04 - Level 2: Recursive Character Split
20:59 - Level 3: Document Specific Splitting
32:10 - Level 4: Semantic Splitting (With Embeddings)
48:02 - Level 5: Agentic Splitting
1:02:47 - Bonus Level: Alternative Representation
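As a rough illustration of what Level 1 in the outline refers to, here is a minimal sketch of fixed-size character splitting with overlap. It is not the code from the video: the function name `character_split` and its parameters `chunk_size` and `chunk_overlap` are assumptions chosen for this example, and it uses plain Python rather than any library splitter.

```python
def character_split(text: str, chunk_size: int = 35, chunk_overlap: int = 5) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly.

    Hypothetical helper for illustration only; it ignores sentence and
    word boundaries, which is the main weakness of this approach.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk made only of overlap is emitted.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "This is the text I would like to chunk up. It is an example."
    for chunk in character_split(sample, chunk_size=35, chunk_overlap=5):
        print(repr(chunk))
```

Because the windows are cut purely by position, a chunk can end mid-word; the later levels in the outline address that by splitting on separators, document structure, or meaning instead.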

Greg Kamradt (Data Indy)

Comments

Both LangChain and Llama Index have added Semantic Chunking (level 4) to their libraries

DataIndependent

First video I came across that actually explains LangChain in detail so that a layman can understand how it actually works

AshWickramasinghe

Why did YouTube take so long to recommend me this channel? Incredible work!

stavroskyriakidis

With the continuous influx of short form content, props to you for making this so interesting to watch. Didn't even realise it was an hour long. Loved every second of it. Thanks!

adityasankhla

Hi Greg, many thanks for the work you put into this and to help all of us learn. Great clarity, depth and tempo! 💪

artislove

Wow! I hadn't even thought about Agentic Chunking! I need to try this. I did some extensive experimentation with chunking on a project at work for a clinical knowledge base and I found that chunking strategies can make the difference between an ok retrieval and an awesome retrieval that works across a higher percentage of queries.

NadaaTaiyab

This is an insanely detailed from first principles tutorial. Thank you for taking the time to put this together.

stonedizzleful

Thanks for this Greg. I've been looking at agentic chunking for a while and this video really helped me with implementation. Not heard of you before I searched but now subbed. Thanks a lot :)

truthwillout

Nice vid, Greg! You're on the cutting edge with some of these splitting techniques. Well done. 😎

andreyseas

I thought the explanation and showing your experimentation for semantic splitting was creative. Thank you very much.

kenchang

Man, it took me 3 weeks to find you. Thank you, please keep them coming.

JoanApita

what the ___. how good can a tutorial be. such a gem of a video. thx for making this. new to ml and found this very helpful

drakongames

Amazing!! I am fascinated by how document specific splitting or the bonus level also ties in with how we structure our data schema. E.g. extracting metadata like "Introduction" in level 3, or applying a summary to the podcast and indexing that to then link to the raw clip in the bonus level. All amazing, super useful stuff -- I am a bit skeptical of embedding based splitting though, maybe I just need to dive in further! Mostly bullish on level 5: agentic splitting with multimodal LLMs that kind of blend levels 3 and 5

connorshorten

Thanks, I was thinking about solving my own retrieval problem. I already got a small crude proof of concept using just simple chunking, embedding, RAG, etc. Now I need to handle bigger user inputs that come in bigger PDF files. I thought of using agents to get around the context window; your agentic chunker is a good starter and does make intuitive sense. I will try this route.

JunYamog

Hi Greg, thanks for the video. It's awesome to have someone publishing good content who's doing the exact same thing as me. Hope to see more videos on advanced topics like this!

Jonathan-rmkt

You really deserve those likes. Really, thanks for this out of this world content

chakerayachi

Liked this semantic splitting! Cool stuff you've done there!! Also agentic chunking. Pretty cool!!!

mgqkclf

Thanks greg! Love the long form instructional video :D Greatly appreciated

srikanthganta

Awesome video, thanks so much, it's so informative and clear to follow. Well done.

maria-whkm

Love your videos, especially this one. The information density and presentation are off the charts. It is so altruistic of you to put this out there for free.

I am especially interested in the semantic chunking. One use case is transcripts, which often have distinct conversation blocks or question-answer pairs. Since it is important to capture both the question and the answer for full context, I was wondering what methodology might work best.

Alternatively, semantically chunking a document vs pre-defined themes - sort of the opposite direction as the agentic chunker. First generate or define the overarching themes or buckets, then assign chunks to them.

It seems that there is some real possibility in the semantic chunking methods. 🎉 Looking forward to experimenting more.

Thank you again.

robxmccarthy
