Hallucination-Free LLMs: Strategies for Monitoring and Mitigation

The talk will cover why and how to monitor LLMs deployed to production. We will focus on the state-of-the-art solutions for detecting hallucinations, split into two types:
1. Uncertainty Quantification
2. LLM self-evaluation

In the Uncertainty Quantification part, we will discuss algorithms that leverage token probabilities to estimate the quality of model responses. This includes simple accuracy estimation as well as more advanced methods for estimating Semantic Uncertainty or arbitrary classification metrics.
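As a minimal sketch of the token-probability idea: most LLM APIs can return per-token log-probabilities alongside the generated text, and a length-normalized sequence probability (the exponential of the mean token log-prob) is a simple confidence score. The example log-prob values below are hypothetical, not taken from the talk.

```python
import math

def sequence_confidence(token_logprobs):
    """Length-normalized sequence probability: exp of the mean
    per-token log-probability. Values lie in (0, 1]; a low score
    suggests the model was uncertain while generating the answer.

    token_logprobs: per-token log-probs (all <= 0), as exposed by
    many LLM APIs via a `logprobs`-style option (assumed here).
    """
    if not token_logprobs:
        raise ValueError("empty response")
    mean_lp = sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_lp)

# Hypothetical log-probs for a 4-token answer; one token (-2.30)
# was a low-probability choice and drags the confidence down.
score = sequence_confidence([-0.05, -0.10, -2.30, -0.20])
```

Averaging in log space before exponentiating keeps the score comparable across answers of different lengths, which raw sequence probability is not.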

In the LLM self-evaluation part, we will cover using (potentially the same) LLM to quantify the quality of the answer. We will also cover state-of-the-art algorithms such as SelfCheckGPT and LLM-eval.

You will build an intuitive understanding of the LLM monitoring methods, their strengths and weaknesses, and learn how to easily set up an LLM monitoring system.
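A monitoring setup can be as simple as a hook that logs every response together with its confidence score and flags low-confidence answers for review. The threshold and score source below are assumptions for illustration, not something prescribed by the talk.

```python
import logging

def monitor(prompt: str, response: str, confidence: float,
            threshold: float = 0.5) -> bool:
    """Log every response; flag low-confidence ones for review.

    `confidence` is any score in (0, 1], e.g. from uncertainty
    quantification or an LLM self-evaluation step (assumed here).
    Returns False when the answer is flagged, so the caller can
    retry, escalate, or attach a disclaimer.
    """
    logging.info("prompt=%r confidence=%.2f", prompt, confidence)
    if confidence < threshold:
        logging.warning("possible hallucination: %r", response)
        return False
    return True

ok = monitor("Capital of France?", "Paris", confidence=0.92)
flagged = monitor("Capital of Atlantis?", "Poseidonis", confidence=0.18)
```

In production the logging call would typically ship to a metrics backend so that the rate of flagged answers can be tracked over time.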

Table of Contents:
00:00 Introduction
02:33 What is LLM Monitoring
08:10 LLM-Based Hallucination Detection: Consistency
12:43 LLM-Based Hallucination Detection: Answer Evaluation
17:12 Output Uncertainty Quantification
23:00 Semantic Uncertainty Quantification
29:10 Experiment Results
----------

👉 Learn more about Data Science Dojo here:

👉 Watch the latest video tutorials here:

👉 See what our past attendees are saying here:
--
At Data Science Dojo, we believe data science is for everyone. Our in-person data science training has been attended by more than 8,000 employees from over 2,000 companies globally, including many leaders in tech like Microsoft, Apple, and Facebook.
--

-----
#ArtificialIntelligence #AI #MachineLearning #DataScience #LargeLanguageModels #llm #hallucinations
Comments:

An obvious approach is to run a second AI agent as a supervisor that evaluates the answers.
Running a team of AI agents gives significantly better results on every metric, and where it makes sense, the more expensive models can instruct less powerful, cheaper models to make the setup more cost-effective.

yarpenzigrin