LangChain: How to Properly Split your Chunks

In this video, we take a deep dive into the RecursiveCharacterTextSplitter class in LangChain. How you split your chunks determines the quality of the answers you get when you chat with your documents using LLMs. Learn how to use text splitters in LangChain properly.
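The splitting idea covered in the video can be sketched in plain Python. This is a simplified illustration of what `RecursiveCharacterTextSplitter` does conceptually, not LangChain's actual code: try separators from coarsest to finest, recurse only when a piece is still too long, then greedily merge pieces back up to the chunk size (the real class also supports `chunk_overlap`, omitted here).

```python
def _merge(pieces, chunk_size, joiner):
    # Greedily pack split pieces back into chunks of at most chunk_size.
    chunks, current = [], ""
    for piece in pieces:
        candidate = current + joiner + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks

def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    # Split on the coarsest separator first; only pieces that are still
    # too long fall through to the next, finer separator.
    sep, rest = separators[0], separators[1:]
    pieces = [p for p in (text.split(sep) if sep else list(text)) if p]
    out = []
    for piece in pieces:
        if len(piece) <= chunk_size or not rest:
            out.append(piece)
        else:
            out.extend(recursive_split(piece, chunk_size, rest))
    return _merge(out, chunk_size, sep or "")

doc = "Short paragraph.\n\nA much longer paragraph that will not fit inside one chunk at all. "
chunks = recursive_split(doc, chunk_size=40)
# The short paragraph stays whole; the long one is split at word boundaries.
```

Note how a paragraph shorter than the chunk size is kept intact, while a longer one is broken at word boundaries — the behaviour the video demonstrates.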

#llm #langchain #PDFchat
▬▬▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬

Comments

Please make more videos like this one! Many people got into AI without a coding background; we are missing more detailed videos on these topics!

CacoNonino

Just found your channel, and while I initially wanted to have you as a professor in a classroom (maybe back in college 30 years ago), I really think you are helping to create a better world for many with your content, careful explanations, and examples. That is the true reason and mission of a teacher. Congrats!

parisneto

And I think nobody can explain concepts in an easier way than you do. I tried 10 different videos to see how the recursive splitter behaves when a paragraph is smaller than the chunk size and when it is larger, and you explained it. :)
I love how you cover each and every aspect from a learning point of view. Thanks again.

deepaksingh

Thank you, you explain things very clearly, and I have been watching your content. It is really good and honest. Please keep making these types of videos. Thanks a lot.

asithakoralage

Great work! Very simple but really thorough. Please create more videos for this series.

adnanrizve

This is the first time I have seen content on optimal chunk lengths. In addition, it might be interesting to cover how to integrate metadata, for example which page of a book, which URL, or which paragraph of a legal text a chunk comes from. That metadata will also take up space in the retrieval context.

Good work. Definitely keep going down this road.

RealEstateD
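The metadata suggestion above can be sketched in plain Python. The function below is a hypothetical illustration, not LangChain's API (LangChain carries a `metadata` dict on each `Document` for the same purpose): tag every chunk with its source and page number so answers can cite where a passage came from.

```python
def chunk_pages_with_metadata(pages, source, chunk_size=200):
    """Split each page's text into fixed-size chunks and tag every
    chunk with where it came from.

    pages: list of page texts (index 0 = page 1)
    returns: list of (chunk_text, metadata) pairs
    """
    docs = []
    for page_num, text in enumerate(pages, start=1):
        for start in range(0, len(text), chunk_size):
            docs.append((text[start:start + chunk_size],
                         {"source": source, "page": page_num}))
    return docs

docs = chunk_pages_with_metadata(
    ["First page text.", "Second page text."],
    source="statute.pdf", chunk_size=10)
```

As the commenter notes, any metadata you later inject into the prompt also consumes context-window budget, so keep the per-chunk metadata small.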

Incredible! Hope you'll provide more videos like this one!

wassimsaioudi

Great video, thanks for creating it!

darshan

I’d love to see videos on both embedding size and modifying the text splitter! I’m particularly interested in strategies that would enable inclusion of citations, e.g. a medical article that includes numbered citations at the end of each sentence with the reference list at the end of the document.

WinstonWalker-fcty

Finally understood this. I remember asking on Discord and I think you also replied, but the fact that an entire video was made on this made it much, much clearer. Thank you so much!

Could you make a video about vector stores: which one to use, how to know what to use, and the code behind them? I saw a couple like FAISS, ChromaDB, deeplake, etc., and for my chatbot it's pretty much the last thing I have left to do, but I still don't understand most of how vector stores work.

yazanrisheh

Great video, thanks for creating it! 😀

ipyqtzr

Great explanation, thanks, this will be super useful!

SmashPhysical

Please keep making more such videos. I found this video very helpful.

SachinChavan

Appreciate all your content. I'd love to know more about chunking customization. Thanks! 🤙

e_hana_kakou

Great video for understanding chunks and text splitters.

izainonline

Good video! For the dataset I am working with, I found that splitting by tokens produces better results, but it really depends on the data you're working with.

hvbris_
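The token-splitting approach mentioned above can be sketched crudely by treating whitespace-separated words as stand-in "tokens". A real setup would count tokens with the model's own tokenizer (e.g. tiktoken) so chunks respect the context-window budget; this is only the shape of the idea.

```python
def split_by_tokens(text, max_tokens):
    # Crude stand-in: treat whitespace-separated words as "tokens".
    # Swap in a real tokenizer (e.g. tiktoken) for production use, since
    # word counts only approximate model token counts.
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

parts = split_by_tokens("one two three four five six seven", max_tokens=3)
# parts == ["one two three", "four five six", "seven"]
```

The advantage over character counts is that the chunk budget lines up with what the LLM actually consumes, which is why it can work better on some datasets.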

Very nice video. I think anyone working on semantic search goes through the experience you described here. Have you seen a study that compares the performance of different embeddings with respect to chunk size?
Also, what models are available for embeddings? I have been using the FAISS models, and I have heard you mention another one. What would be a good strategy for picking one over another?

gerardorosiles

Thanks for the video! What if you want to chunk a large PDF of 300 pages? How do you determine the chunk size? In your example you can observe the length of each paragraph directly, but that might be hard for a large file. I would appreciate it if you shared your opinion.

Ken
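One answer to the question above: instead of eyeballing paragraphs in a 300-page PDF, measure the paragraph-length distribution and pick a chunk size that covers the bulk of them (e.g. around the 95th percentile). A minimal pure-Python sketch:

```python
import statistics

def paragraph_length_stats(text):
    # Survey paragraph lengths (paragraphs delimited by blank lines) so
    # the chunk size can be chosen from the distribution rather than by
    # inspecting each paragraph manually.
    lengths = sorted(len(p) for p in text.split("\n\n") if p.strip())
    return {
        "median": statistics.median(lengths),
        "p95": lengths[int(0.95 * (len(lengths) - 1))],
        "max": lengths[-1],
    }

stats = paragraph_length_stats("\n\n".join("a" * n for n in (10, 20, 30, 40)))
# stats == {"median": 25.0, "p95": 30, "max": 40}
```

A chunk size near the 95th percentile keeps most paragraphs intact while letting the recursive splitter break only the rare oversized ones.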

Damn, you explained that better in 3 minutes than most other videos did in 30 minutes.

TheCloudShepherd

Please do create one for custom splitting. I have a particular document where I would like to define chunks demarcated by a special sequence.

nirsarkar
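On the custom-splitting request above: `RecursiveCharacterTextSplitter` accepts a custom `separators` list, but at its core, splitting on a special demarcation sequence is just a string split. The `<<<SECTION>>>` marker below is a made-up example, not from the video:

```python
MARKER = "<<<SECTION>>>"  # hypothetical demarcation sequence

def split_on_marker(text, marker=MARKER):
    # One chunk per marker-delimited section; whitespace-only segments
    # (e.g. before a leading marker) are dropped.
    return [part.strip() for part in text.split(marker) if part.strip()]

doc = "<<<SECTION>>> Intro <<<SECTION>>> Main body <<<SECTION>>> Appendix"
chunks = split_on_marker(doc)
# chunks == ["Intro", "Main body", "Appendix"]
```

If sections can still exceed your chunk size, putting the marker first in a recursive splitter's separator list gives you the demarcation split with a size-based fallback.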