Semi-structured RAG - LangChain using Mistral 7B, Qdrant FastEmbed on PDF text with tabular data

Many documents contain a mixture of content types, including text and tables.
Semi-structured data can be challenging for conventional RAG for at least two reasons:
• Text splitting may break up tables, corrupting the data in retrieval
• Embedding tables may pose challenges for semantic similarity search
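
The first problem above can be illustrated with a toy example. The sketch below is not from the video: the sample document, the chunk size, and both helper functions are hypothetical, and blank-line splitting is only a stand-in for a real element-aware parser. It shows a fixed-size splitter cutting a table across two chunks, while a splitter that respects element boundaries keeps it whole.

```python
# Toy document: a paragraph followed by a small table (hypothetical data).
DOC = (
    "Quarterly revenue grew steadily across regions.\n"
    "\n"
    "| Region | Q1 | Q2 |\n"
    "|--------|----|----|\n"
    "| North  | 10 | 12 |\n"
    "| South  | 7  | 9  |\n"
)


def fixed_size_split(text, chunk_size):
    """Naive splitter: cut every chunk_size characters, ignoring structure."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def element_aware_split(text):
    """Split on blank lines so each paragraph or table stays in one piece."""
    return [block for block in text.split("\n\n") if block.strip()]


naive_chunks = fixed_size_split(DOC, 80)
element_chunks = element_aware_split(DOC)

# Collect the chunks that hold some part of the table.
table_pieces_naive = [c for c in naive_chunks if "|" in c]
table_pieces_element = [c for c in element_chunks if "|" in c]

print(len(table_pieces_naive), "chunks contain table fragments (naive split)")
print(len(table_pieces_element), "chunk contains the table (element-aware split)")
```

With the naive split, the header row and the data rows can land in different chunks, so a retrieved chunk may contain numbers with no column names attached.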
This video shows how to perform RAG on documents with semi-structured data:
• We will use Unstructured to parse both text and tables from documents (PDFs).
• We will use the multi-vector retriever to store raw tables and text alongside table summaries, which are better suited for retrieval.
• We will use LCEL to implement the chains used.
We will use Mistral 7B Instruct as our LLM and Qdrant FastEmbed for our embeddings.
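
The multi-vector idea described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's code and not LangChain's API: the class `ToyMultiVectorRetriever`, the bag-of-words `embed` function (a stand-in for a real embedding model such as FastEmbed), and the sample data are all hypothetical. The point it shows is that summaries are what gets searched, while the raw table or text is what gets returned to the LLM.

```python
import math
import re
import uuid
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a real setup would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMultiVectorRetriever:
    """Index summaries for search, but return the raw elements they describe."""

    def __init__(self):
        self.vector_index = []  # list of (summary embedding, doc_id)
        self.docstore = {}      # doc_id -> raw text or raw table

    def add(self, raw_element, summary):
        doc_id = str(uuid.uuid4())
        self.docstore[doc_id] = raw_element
        self.vector_index.append((embed(summary), doc_id))
        return doc_id

    def retrieve(self, query, k=1):
        q = embed(query)
        ranked = sorted(
            self.vector_index, key=lambda item: cosine(q, item[0]), reverse=True
        )
        return [self.docstore[doc_id] for _, doc_id in ranked[:k]]


retriever = ToyMultiVectorRetriever()
raw_table = "| Region | Q1 | Q2 |\n| North | 10 | 12 |\n| South | 7 | 9 |"
raw_text = "The company opened two new offices during the year."

# In the approach described above, an LLM would write these summaries.
retriever.add(raw_table, "Table of quarterly revenue by region for Q1 and Q2.")
retriever.add(raw_text, "Paragraph about new offices opened this year.")

results = retriever.retrieve("What was the revenue by region in Q2?")
print(results[0])  # the raw table, not its summary
```

Keeping the summary and the raw element linked by an ID means the table never has to be embedded directly, which sidesteps the similarity-search problem noted earlier.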
Colab notebook:

Rithesh Sreenivasan

Recommended on this topic

Comments

Hi sir, can I do the same in Amazon SageMaker or Amazon Bedrock?

techthunder

Can you go into detail on what the extracted text and tables look like? Especially the tables after extraction and before the table summaries are made.

Thanks

sagarchadha

Sir, is this done on paid Colab? How can I do this on free Colab with a CPU? Is it even possible?

rnronie

Besides tables and text, can we add image data here too?

devanshgupta

Semi-structured RAG with LangChain and OpenAI GPT-4 RAG on tabular data , semi structured documents

Benchmarking Methods for Semi-Structured RAG

Multi-Vector Retriever for RAG on Tables + Texts Using LANGCHAIN & UNSTRUCTURED

RAG from scratch: Part 12 (Multi-Representation Indexing)

Multi-modal RAG With LANGCHAIN 🦜🔗 & GPT-4V

Multimodal RAG with GPT-4-Vision and LangChain | Retrieval with Images, Tables and Text

RAG for long context LLMs

Fine-Tuning Enterprise RAG Knowledge Bases with Label Studio, ChatGPT, and Ragas

Realtime Multimodal RAG Usecase Part 1 | Extract Image,Table,Text from Documents #rag #multimodal

5-Langchain Series-Advanced RAG Q&A Chatbot With Chain And Retrievers Using Langchain

Building a Multimodal RAG App for Medical Applications

ADVANCED Python AI Agent Tutorial - Using RAG

Extract Tables + Texts from .htm pages for RAG Using LLAMA-INDEX & UNSTRUCTURED

Building Production-Ready RAG Applications: Jerry Liu

Loading PDF Data Into Langchain : To Use Or Not To Use Unstructured Library

LangChain Crash Course for Beginners

LangChain v/s Llama-Index | Detailed Differences | Which one you should use?

OpenAI Embeddings and Vector Databases Crash Course

Advanced RAG 02 - Parent Document Retriever

Chunk large complex PDFs to summarize using LLM

Advanced RAG with Knowledge Graphs (Neo4J demo)

Building adaptive RAG from scratch with Command-R

LangChain is AMAZING | Quick Python Tutorial