Boosting Databases and Language Models

Sam discusses the capabilities of Redis and Redis Search, emphasizing the power of secondary indexing on JSON documents and the various available integrations. They also highlight the Relevance AI integration, which provides a user interface for vector databases. The most exciting development mentioned is the upcoming DPU index, built in collaboration with NVIDIA. Sam also addresses the role of databases alongside large language models, emphasizing the limitations of these models when it comes to personalized, confidential, and rapidly changing information.
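
As a rough illustration of the secondary indexing on JSON documents mentioned above, here is a minimal sketch using redis-py with the RediSearch and RedisJSON modules. The index name (docs_idx), key prefix (doc:), field names, and embedding dimension are placeholder assumptions for illustration, not details taken from the talk.

```python
import redis
from redis.commands.search.field import TagField, TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = redis.Redis(host="localhost", port=6379)

# Hypothetical schema: index a text field and a vector field stored in JSON docs.
schema = (
    TextField("$.text", as_name="text"),
    TagField("$.team", as_name="team"),
    VectorField(
        "$.embedding",
        "HNSW",  # approximate nearest-neighbor index; "FLAT" is the exact alternative
        {"TYPE": "FLOAT32", "DIM": 384, "DISTANCE_METRIC": "COSINE"},
        as_name="embedding",
    ),
)

# Secondary index over all JSON documents whose keys start with "doc:".
r.ft("docs_idx").create_index(
    schema,
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.JSON),
)

# Writing a document makes it searchable immediately -- no batch re-indexing step.
r.json().set("doc:1", "$", {
    "text": "PTO requests are filed through the HR portal.",
    "team": "hr",
    "embedding": [0.0] * 384,  # placeholder; in practice, an embedding model's output
})
```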

This LLMs in Production Conference section is proudly sponsored by Redis.

// Abstract
Generative models such as ChatGPT have changed many product roadmaps. Interfaces and user experiences can now be re-imagined and often drastically simplified to something resembling a Google search bar where the input is natural language. However, some models remain behind APIs without the ability to re-train on contextually appropriate data. Even when the model weights are publicly available, re-training or fine-tuning is often expensive, requires expertise, and is ill-suited to problem domains with constant updates. How, then, can such APIs be used when the data needed to generate an accurate output was not present in the training set because it is constantly changing?

Vector embeddings represent the impression a model has of some, likely unstructured, data. When combined with a vector database or search algorithm, embeddings can be used to retrieve information that provides context for a generative model. Such embeddings, linked to specific information, can be updated in real time, providing generative models with a continually up-to-date, external body of knowledge.

Suppose you wanted to build a product that could answer questions about internal company documentation as an onboarding tool for new employees. For large enterprises especially, re-training a model on this ever-changing body of knowledge would be untenable in terms of cost to benefit. Instead, using a vector database to retrieve context for prompts allows for point-in-time correctness of generated output. This also mitigates model "hallucinations," as models can be instructed to provide no answer when the vector search returns results below some confidence threshold.

In this talk we will demonstrate the validity of this approach through examples. We will provide instructions, code, and other assets that are open source and available on GitHub.
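
To make the retrieval step concrete, below is a minimal sketch of the pattern the abstract describes: embed the user's question, run a KNN query against the vector index, and return nothing when no stored document is similar enough. The embed function, the docs_idx index, the field names, and the 0.3 distance threshold are illustrative assumptions; the talk's actual code is in the open-source GitHub assets mentioned above.

```python
import numpy as np
import redis
from redis.commands.search.query import Query

r = redis.Redis(host="localhost", port=6379)

def embed(text: str) -> np.ndarray:
    """Placeholder for any embedding model (e.g., a sentence-transformer)."""
    raise NotImplementedError

def retrieve_context(question: str, k: int = 3, max_distance: float = 0.3):
    """KNN search over the vector index; returns matching passages or None."""
    vec = embed(question).astype(np.float32).tobytes()
    q = (
        Query(f"*=>[KNN {k} @embedding $vec AS score]")
        .sort_by("score")               # cosine distance: smaller is more similar
        .return_fields("text", "score")
        .dialect(2)                     # query dialect 2 is required for KNN params
    )
    results = r.ft("docs_idx").search(q, query_params={"vec": vec})
    hits = [doc for doc in results.docs if float(doc.score) <= max_distance]
    # If nothing clears the threshold, signal the caller to say "I don't know"
    # rather than letting the generative model hallucinate an answer.
    return [doc.text for doc in hits] or None
```

The retrieved passages would then be prepended to the LLM prompt as context; when retrieve_context returns None, the application can respond that it has no answer instead of generating one.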

// Bio
A Principal Applied AI Engineer at Redis, Sam helps guide the development and direction of Redis as an online feature store and vector database.

Sam's background is in high-performance computing, including ML-related topics such as distributed training, hyperparameter optimization, and scalable inference.