Boosting Databases and Language Models

Sam discusses the capabilities of Redis and Redis Search, emphasizing the power of secondary indexing on JSON documents and the various available integrations. They also highlight the Relevance AI integration, which provides a user interface for vector databases. The most exciting development mentioned is the upcoming DPU index, built in collaboration with NVIDIA. Sam also addresses the role of databases alongside large language models, emphasizing the limitations of these models when it comes to personalized, confidential, and rapidly changing information.
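
As a rough illustration of the secondary indexing on JSON documents mentioned above, here is a minimal sketch using redis-py with the RediSearch and RedisJSON modules. The index name (docs_idx), key prefix (doc:), field names, and embedding dimension are placeholder assumptions for illustration, not details taken from the talk.

```python
import redis
from redis.commands.search.field import TagField, TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = redis.Redis(host="localhost", port=6379)

# Hypothetical schema: index a text field and a vector field stored in JSON docs.
schema = (
    TextField("$.text", as_name="text"),
    TagField("$.team", as_name="team"),
    VectorField(
        "$.embedding",
        "HNSW",  # approximate nearest-neighbor index; "FLAT" is the exact alternative
        {"TYPE": "FLOAT32", "DIM": 384, "DISTANCE_METRIC": "COSINE"},
        as_name="embedding",
    ),
)

# Secondary index over all JSON documents whose keys start with "doc:".
r.ft("docs_idx").create_index(
    schema,
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.JSON),
)

# Writing a document makes it searchable immediately -- no batch re-indexing step.
r.json().set("doc:1", "$", {
    "text": "PTO requests are filed through the HR portal.",
    "team": "hr",
    "embedding": [0.0] * 384,  # placeholder; in practice, an embedding model's output
})
```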

This LLMs in Production Conference section is proudly sponsored by Redis.

// Abstract
Generative models such as ChatGPT have changed many product roadmaps. Interfaces and user experiences can now be re-imagined and often drastically simplified to something resembling a Google search bar where the input is natural language. However, some models remain behind APIs without the ability to re-train on contextually appropriate data. Even when the model weights are publicly available, re-training or fine-tuning is often expensive, requires expertise, and is ill-suited to problem domains with constant updates. How, then, can such APIs be used when the data needed to generate an accurate output was not present in the training set because it is constantly changing?

Vector embeddings represent the impression a model has of some, likely unstructured, data. When combined with a vector database or search algorithm, embeddings can be used to retrieve information that provides context for a generative model. Such embeddings, linked to specific information, can be updated in real time, providing generative models with a continually up-to-date, external body of knowledge.

Suppose you wanted to build a product that could answer questions about internal company documentation as an onboarding tool for new employees. For large enterprises especially, re-training a model on this ever-changing body of knowledge would be untenable in terms of cost to benefit. Instead, using a vector database to retrieve context for prompts allows for point-in-time correctness of generated output. This also mitigates model "hallucinations," as models can be instructed to provide no answer when the vector search returns results below some confidence threshold.

In this talk we will demonstrate the validity of this approach through examples. We will provide instructions, code, and other assets that are open source and available on GitHub.
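
To make the retrieval step concrete, below is a minimal sketch of the pattern the abstract describes: embed the user's question, run a KNN query against the vector index, and return nothing when no stored document is similar enough. The embed function, the docs_idx index, the field names, and the 0.3 distance threshold are illustrative assumptions; the talk's actual code is in the open-source GitHub assets mentioned above.

```python
import numpy as np
import redis
from redis.commands.search.query import Query

r = redis.Redis(host="localhost", port=6379)

def embed(text: str) -> np.ndarray:
    """Placeholder for any embedding model (e.g., a sentence-transformer)."""
    raise NotImplementedError

def retrieve_context(question: str, k: int = 3, max_distance: float = 0.3):
    """KNN search over the vector index; returns matching passages or None."""
    vec = embed(question).astype(np.float32).tobytes()
    q = (
        Query(f"*=>[KNN {k} @embedding $vec AS score]")
        .sort_by("score")               # cosine distance: smaller is more similar
        .return_fields("text", "score")
        .dialect(2)                     # query dialect 2 is required for KNN params
    )
    results = r.ft("docs_idx").search(q, query_params={"vec": vec})
    hits = [doc for doc in results.docs if float(doc.score) <= max_distance]
    # If nothing clears the threshold, signal the caller to say "I don't know"
    # rather than letting the generative model hallucinate an answer.
    return [doc.text for doc in hits] or None
```

The retrieved passages would then be prepended to the LLM prompt as context; when retrieve_context returns None, the application can respond that it has no answer instead of generating one.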

// Bio
A Principal Applied AI Engineer at Redis, Sam helps guide the development and direction of Redis as an online feature store and vector database.

Sam's background is in high-performance computing, including ML-related topics such as distributed training, hyperparameter optimization, and scalable inference.