LangChain: How to Properly Split your Chunks

In this video, we take a deep dive into the RecursiveCharacterTextSplitter class in LangChain. How you split your chunks determines the quality of the answers you get when you chat with your documents using LLMs. Learn how to use text splitters in LangChain properly.
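The splitting idea covered in the video can be sketched in plain Python. This is a simplified illustration of what `RecursiveCharacterTextSplitter` does conceptually, not LangChain's actual code: try separators from coarsest to finest, recurse only when a piece is still too long, then greedily merge pieces back up to the chunk size (the real class also supports `chunk_overlap`, omitted here).

```python
def _merge(pieces, chunk_size, joiner):
    # Greedily pack split pieces back into chunks of at most chunk_size.
    chunks, current = [], ""
    for piece in pieces:
        candidate = current + joiner + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks

def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    # Split on the coarsest separator first; only pieces that are still
    # too long fall through to the next, finer separator.
    sep, rest = separators[0], separators[1:]
    pieces = [p for p in (text.split(sep) if sep else list(text)) if p]
    out = []
    for piece in pieces:
        if len(piece) <= chunk_size or not rest:
            out.append(piece)
        else:
            out.extend(recursive_split(piece, chunk_size, rest))
    return _merge(out, chunk_size, sep or "")

doc = "Short paragraph.\n\nA much longer paragraph that will not fit inside one chunk at all. "
chunks = recursive_split(doc, chunk_size=40)
# The short paragraph stays whole; the long one is split at word boundaries.
```

Note how a paragraph shorter than the chunk size is kept intact, while a longer one is broken at word boundaries — the behaviour the video demonstrates.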

#llm #langchain #PDFchat
▬▬▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬

Comments

Please make more videos like this one! Many people got into AI without a coding background; we are missing more detailed videos on these topics!

CacoNonino

Just found your channel, and while I initially wanted to have you as a professor in a classroom (maybe back in college 30 years ago), I really think you are helping to create a better world for many with your content, careful explanations, and examples. That is the true reason and mission of a teacher. Congrats!

parisneto

And I think nobody can explain concepts in an easier way than you do. I tried 10 different videos to see how the recursive splitter behaves when a paragraph is smaller than the chunk size and when it is larger, and you explained it. :)
I love how you cover each and every aspect from a learning point of view. Thanks again.

deepaksingh

Thank you, you explain things very clearly, and I have been watching your content. It is really good and honest. Please keep making these types of videos. Thanks a lot.

asithakoralage

Great work! Very simple but really thorough. Please create more videos for this series.

adnanrizve

This is the first time I have seen content on optimal chunk lengths. In addition, it might be interesting to cover how to integrate metadata, for example which page of a book, which URL, or which paragraph of a legal text a chunk comes from. That metadata will also take up space in the retrieval context.

Good work. Definitely keep going down this road.

RealEstateD
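The metadata suggestion above can be sketched in plain Python. The function below is a hypothetical illustration, not LangChain's API (LangChain carries a `metadata` dict on each `Document` for the same purpose): tag every chunk with its source and page number so answers can cite where a passage came from.

```python
def chunk_pages_with_metadata(pages, source, chunk_size=200):
    """Split each page's text into fixed-size chunks and tag every
    chunk with where it came from.

    pages: list of page texts (index 0 = page 1)
    returns: list of (chunk_text, metadata) pairs
    """
    docs = []
    for page_num, text in enumerate(pages, start=1):
        for start in range(0, len(text), chunk_size):
            docs.append((text[start:start + chunk_size],
                         {"source": source, "page": page_num}))
    return docs

docs = chunk_pages_with_metadata(
    ["First page text.", "Second page text."],
    source="statute.pdf", chunk_size=10)
```

As the commenter notes, any metadata you later inject into the prompt also consumes context-window budget, so keep the per-chunk metadata small.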

Incredible! Hope you'll provide more videos like this one!

wassimsaioudi

Great video, thanks for creating it!

darshan

I’d love to see videos on both embedding size and modifying the text splitter! I’m particularly interested in strategies that would enable inclusion of citations, e.g. a medical article that includes numbered citations at the end of each sentence with the reference list at the end of the document.

WinstonWalker-fcty

Finally understood this. I remember asking on Discord and I think you also replied, but the fact that an entire video was made on this made it much, much clearer. Thank you so much!

Could you make a video about vector stores: which one to use, how to know what to use, and the code behind them? I saw a couple like FAISS, ChromaDB, deeplake, etc., and for my chatbot it's pretty much the last thing I have left to do, but I still don't understand most of how vector stores work.

yazanrisheh

Great video, thanks for creating it! 😀

ipyqtzr

Great explanation, thanks, this will be super useful!

SmashPhysical

Please keep making more such videos. I found this video very helpful.

SachinChavan

Appreciate all your content. I'd love to know more about chunking customization. Thanks! 🤙

e_hana_kakou

Great video for understanding chunks and text splitters.

izainonline

Good video! For the dataset I am working with, I found that splitting by tokens produces better results, but it really depends on the data you're working with.

hvbris_
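The token-splitting approach mentioned above can be sketched crudely by treating whitespace-separated words as stand-in "tokens". A real setup would count tokens with the model's own tokenizer (e.g. tiktoken) so chunks respect the context-window budget; this is only the shape of the idea.

```python
def split_by_tokens(text, max_tokens):
    # Crude stand-in: treat whitespace-separated words as "tokens".
    # Swap in a real tokenizer (e.g. tiktoken) for production use, since
    # word counts only approximate model token counts.
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

parts = split_by_tokens("one two three four five six seven", max_tokens=3)
# parts == ["one two three", "four five six", "seven"]
```

The advantage over character counts is that the chunk budget lines up with what the LLM actually consumes, which is why it can work better on some datasets.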

Very nice video. I think anyone working on semantic search goes through the experience you described here. Have you seen a study that compares the performance of different embeddings with respect to chunk size?
Also, what models are available for embeddings? I have been using the FAISS models, and I have heard you mention another one. What would be a good strategy for picking one over another?

gerardorosiles

Thanks for the video! What if you want to chunk a large PDF of 300 pages? How do you determine the chunk size? In your example you can observe the length of each paragraph directly, but that might be hard for a large file. I would appreciate it if you shared your opinion.

Ken
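One answer to the question above: instead of eyeballing paragraphs in a 300-page PDF, measure the paragraph-length distribution and pick a chunk size that covers the bulk of them (e.g. around the 95th percentile). A minimal pure-Python sketch:

```python
import statistics

def paragraph_length_stats(text):
    # Survey paragraph lengths (paragraphs delimited by blank lines) so
    # the chunk size can be chosen from the distribution rather than by
    # inspecting each paragraph manually.
    lengths = sorted(len(p) for p in text.split("\n\n") if p.strip())
    return {
        "median": statistics.median(lengths),
        "p95": lengths[int(0.95 * (len(lengths) - 1))],
        "max": lengths[-1],
    }

stats = paragraph_length_stats("\n\n".join("a" * n for n in (10, 20, 30, 40)))
# stats == {"median": 25.0, "p95": 30, "max": 40}
```

A chunk size near the 95th percentile keeps most paragraphs intact while letting the recursive splitter break only the rare oversized ones.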

Damn, you explained that better in 3 minutes than most other videos did in 30 minutes.

TheCloudShepherd

Please do create one for custom splitting. I have a particular document where I would like to define chunks demarcated by a special sequence.

nirsarkar
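On the custom-splitting request above: `RecursiveCharacterTextSplitter` accepts a custom `separators` list, but at its core, splitting on a special demarcation sequence is just a string split. The `<<<SECTION>>>` marker below is a made-up example, not from the video:

```python
MARKER = "<<<SECTION>>>"  # hypothetical demarcation sequence

def split_on_marker(text, marker=MARKER):
    # One chunk per marker-delimited section; whitespace-only segments
    # (e.g. before a leading marker) are dropped.
    return [part.strip() for part in text.split(marker) if part.strip()]

doc = "<<<SECTION>>> Intro <<<SECTION>>> Main body <<<SECTION>>> Appendix"
chunks = split_on_marker(doc)
# chunks == ["Intro", "Main body", "Appendix"]
```

If sections can still exceed your chunk size, putting the marker first in a recursive splitter's separator list gives you the demarcation split with a size-based fallback.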