You should use LangChain's Caching!

LangChain provides a caching mechanism for LLMs (large language models). The benefits of caching in your LLM development are:

1. It saves you money by reducing the number of API calls you make to the LLM provider (OpenAI, Cohere, etc.) when you request the same completion multiple times.
2. It speeds up your application, again by cutting out repeated round trips to the provider.

We look at how LangChain implements this caching mechanism and how we can use it in our own LLM development; a quick sketch of the setup is shown below.
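A minimal sketch of turning the cache on, assuming the langchain and openai packages with an OPENAI_API_KEY set in the environment; import paths follow the langchain 0.0.x releases current at the time of this video (newer versions move this to set_llm_cache in langchain.globals):

```python
# Global LLM caching in LangChain: an identical prompt is answered from the
# cache instead of triggering a second API call.
import langchain
from langchain.cache import InMemoryCache, SQLiteCache
from langchain.llms import OpenAI

# Option 1: in-memory cache (lives only for the current process)
langchain.llm_cache = InMemoryCache()

# Option 2: SQLite-backed cache (persists across runs in .langchain.db)
# langchain.llm_cache = SQLiteCache(database_path=".langchain.db")

llm = OpenAI(model_name="text-davinci-003")

llm("Tell me a joke")  # first call: hits the OpenAI API
llm("Tell me a joke")  # identical second call: served from the cache
```

The second call returns almost immediately and incurs no API cost, which is exactly the money and latency benefit listed above.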

- Watch PART 1 of the LangChain / LLM series:
Build a GPT Q&A on your own data

- Watch PART 2 of the LangChain / LLM series:
LangChain + OpenAI to chat w/ (query) own Database / CSV!

- Watch PART 3 of the LangChain / LLM series
LangChain + HuggingFace's Inference API (no OpenAI credits required!)

- Watch PART 4 of the LangChain / LLM series
Understanding Embeddings in LLMs (ft. LlamaIndex + Chroma DB)

- Watch PART 5 of the LangChain / LLM series
Query any website with GPT3 and LlamaIndex

- Watch PART 6 of the LangChain / LLM series
Locally-hosted, offline LLM w/ LlamaIndex + OPT (open-source, instruction-tuned LLM)

- Watch PART 7 of the LangChain / LLM series
Building an AI language tutor: Pinecone + LlamaIndex + GPT-3 + BeautifulSoup

- Watch PART 8 of the LangChain / LLM series
Building a queryable journal 💬 w/ OpenAI, markdown & LlamaIndex 🦙

- Watch PART 9 of the LLM series

- Watch PART 10 of the LLM series
GPT builds entire app from prompt (ft. SMOL Developer)

- Watch Part 11 (Prompt Engineering / Prompt Design)
A language for LLM Prompt Design: Guidance

All the code for the LLM (large language model) series, featuring GPT-3, ChatGPT, LangChain, LlamaIndex and more, is on my GitHub repository, so go and ⭐ star or 🍴 fork it. Happy coding!
Comments

I've learned a lot from your series.

Could you make a video mapping out the different LlamaIndex index types, how they work, and their benefits for use cases like:

- Knowledge graphs
- Data filtering
- Search optimization
- Recommendations

Practical examples and tips would be very helpful for maximizing LlamaIndex's performance across a range of solutions.

Please continue the great work! I look forward to learning more from your future videos.

vigneshpadmanabhan

Very well done!

How would you implement semantic caching?

I am working on an LLM app where various users interact with one PDF. The users may ask questions that are similar, but will never be exactly the same in terms of wording.

Still, the intention of the question is the same. This intention could be proxied by semantic similarity between the questions: if the similarity score is more than 90%, we could assume the questions are equivalent and return the previously cached response.

It’s not perfect. Would you agree?

cag
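One way to prototype the semantic cache described above is to embed each incoming question and reuse a stored answer when the cosine similarity to a previously answered question crosses the 90% threshold the commenter suggests. This is a rough sketch, not a built-in LangChain API; it assumes the openai (pre-1.0 API) and numpy packages, and get_answer() is a hypothetical stand-in for whatever chain or LLM call actually produces the response:

```python
import numpy as np
import openai

_cache = []  # list of (question_embedding, answer) pairs

def embed(text: str) -> np.ndarray:
    # OpenAI's ada-002 embeddings; any sentence-embedding model would do
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cached_answer(question: str, threshold: float = 0.9) -> str:
    q = embed(question)
    for emb, ans in _cache:
        if cosine(q, emb) >= threshold:
            return ans                  # similar enough: reuse the cached response
    ans = get_answer(question)          # hypothetical LLM / QA-chain call
    _cache.append((q, ans))
    return ans
```

As the commenter notes, this is not perfect: a high similarity score does not guarantee identical intent, so the threshold is a trade-off between cache hit rate and answer accuracy.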

Great video! Is it possible to apply caching to LangChain agents?

JohnDoe-ns

This LangChain series is great. Thank you!

arthurperini

I am trying to reduce the time it takes to convert large documents to vectors using OpenAI embeddings. Can you post a video related to that?

vigneshnagaraj

Is there a way to delete only a specific cache entry from the database?

princessapellido

How will this work for a multi-user scenario?

GaneshPatil-gk

LOVE this, very informative. Your mastery is clear. Do you have any information on map_reduce with transformers (non-OpenAI, local models)?

carlbroker

Hey, how can we get token usage with streaming = True?

pranavsharma

Great! I have two questions.
1. I have a use case where I need to query a large dataset, around 100,000 records of structured data, in a SQL database or maybe a CSV file. I want to use a SQL agent or make an API call to query the data. If I need a summary of the data and I am not able to stay under the token limit, am I supposed to use the SQLite cache on both the mapper side and the reducer side? What do you suggest? How can I do that?

BatBallBites

Can you make a tutorial on source code analysis using LLMs?

SalemBenabdallah-rzew