Retrieval Augmented Generation in the Wild: Anton Troynikov

In the last few months, we've seen an explosion of the use of retrieval in the context of AI. Document question answering, autonomous agents, and more use embeddings-based retrieval systems in a variety of ways. This talk will cover what we've learned building for these applications, the challenges developers face, and the future of retrieval in the context of AI.

About Anton Troynikov
Anton is the co-founder of Chroma. He does not believe AI will kill us all. Chroma builds an open-source embeddings store, built specifically for AI-native applications.
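
To make the embeddings-based retrieval mentioned above concrete, here is a minimal sketch using Chroma's Python client. The collection name and documents are illustrative, not taken from the talk; Chroma embeds the documents with its default embedding function and answers queries by vector similarity.

```python
# Minimal embeddings-based retrieval with Chroma (illustrative data).
import chromadb

client = chromadb.Client()  # in-memory client; use a persistent client for disk storage
collection = client.create_collection(name="docs")

# Add documents; Chroma embeds them with its default embedding function.
collection.add(
    documents=[
        "Chroma is an open-source embeddings store for AI-native applications.",
        "Retrieval-augmented generation grounds LLM answers in external documents.",
    ],
    ids=["doc1", "doc2"],
)

# Query by text; the query is embedded and matched against stored vectors.
results = collection.query(query_texts=["What is Chroma?"], n_results=1)
print(results["documents"])
```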
Comments

FYI, there is a failure of direct retrieval with GPT-4 using the new OpenAI Assistants API. GPT tokenizes text and creates its own vector embeddings based on its specific training data. New terms and sequences may not connect well to the pretrained knowledge in GPT's weight tensors.
There was no semantic similarity between the new API terms and GPT's existing vector space. This is a fundamental issue with retrieval-augmented generation (RAG) systems: external knowledge is not truly integrated into the model's learned weights. Adding more vector stores cannot solve this core problem.
The solution is to have multiple learned "knowledge planes" with trained weight tensors for specific tasks that can be switched in. This is better than just retrieving separate vector representations.

Pure_Science_and_Technology
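
The comment above hinges on whether new terminology lands near related, established concepts in an embedding space. A hypothetical sketch of how one might probe that empirically with cosine similarity over sentence embeddings; the model choice and example phrases are assumptions for illustration, not from the talk or the comment.

```python
# Hypothetical check: cosine similarity between a new term and an established one.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # embedding model chosen for illustration

new_term = "OpenAI Assistants API thread and run objects"
known_term = "chat completion API for conversational agents"

emb = model.encode([new_term, known_term])
cosine = np.dot(emb[0], emb[1]) / (np.linalg.norm(emb[0]) * np.linalg.norm(emb[1]))
print(f"cosine similarity: {cosine:.3f}")  # low values suggest weak semantic overlap
```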

Excellent presentation. I have found vanilla embeddings insufficient for "level-2" tasks, which require multiple pieces of context that may vary from ultra-specific to rolled up across the entire document. If anyone can link research on how to embed temporal meaning within chronological text, I would love to take a look!

Jaybearno
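
On the question of temporal meaning in chronological text: one common workaround, not from the talk, is to store a timestamp or sequence index as chunk metadata and combine vector similarity with a metadata filter at query time. A minimal sketch using Chroma's `where` filters; the collection name, chunk text, and metadata fields are made up for illustration.

```python
# Illustrative sketch: attach temporal metadata to chunks, filter at query time.
import chromadb

client = chromadb.Client()
collection = client.create_collection(name="chronological_docs")

collection.add(
    documents=[
        "Q1 report: revenue grew 10%.",
        "Q2 report: revenue grew 4%.",
        "Q3 report: revenue declined 2%.",
    ],
    ids=["q1", "q2", "q3"],
    metadatas=[{"quarter": 1}, {"quarter": 2}, {"quarter": 3}],
)

# Vector similarity plus a temporal constraint: only chunks from Q2 onward.
results = collection.query(
    query_texts=["How did revenue change recently?"],
    n_results=2,
    where={"quarter": {"$gte": 2}},
)
print(results["documents"])
```

This does not embed time into the vectors themselves, but it lets retrieval respect chronology without changing the embedding model.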