Deploying and Monitoring LLM Inference Endpoints

In this session we will dive into deploying LLMs to production inference endpoints, then put in place automated monitoring metrics and alerts to track model performance and catch potential output issues such as toxicity.
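The monitoring step above can be sketched as a simple check-and-alert function. This is a minimal illustration using a hypothetical keyword denylist as the scorer; a production system would call a trained toxicity classifier instead, and the terms, threshold, and function names here are assumptions, not part of the session material.

```python
# Minimal sketch of an automated output-quality check.
# TOXIC_TERMS is a hypothetical denylist; real deployments would use a
# trained toxicity classifier rather than keyword matching.

TOXIC_TERMS = {"hateful", "slur", "threat"}

def toxicity_score(text: str) -> float:
    """Fraction of tokens matching the denylist (placeholder metric)."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in TOXIC_TERMS)
    return hits / len(tokens)

def check_output(text: str, threshold: float = 0.1) -> dict:
    """Score one model output and decide whether an alert should fire."""
    score = toxicity_score(text)
    return {"score": score, "alert": score > threshold}

if __name__ == "__main__":
    print(check_output("a perfectly ordinary answer"))
```

In practice a check like this would run on a sample of live endpoint traffic, with the alert wired into the team's paging or dashboard system.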

We will also cover optimizing LLMs with retrieval-augmented generation (RAG) to produce relevant, accurate, and useful outputs.
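The core RAG loop can be illustrated in a few lines: retrieve the documents most related to the query, then ground the prompt in that context. This sketch uses a toy token-overlap retriever; real deployments would use embeddings and a vector store, and the example documents and function names are assumptions for illustration only.

```python
# Minimal sketch of a retrieval-augmented generation (RAG) flow.
# The retriever ranks documents by shared tokens with the query; a real
# system would rank by embedding similarity from a vector store.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most tokens with the query."""
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble a prompt that grounds the model's answer in context."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    docs = [
        "Endpoints autoscale based on request load.",
        "Toxicity filters run after generation.",
        "RAG grounds answers in retrieved documents.",
    ]
    print(build_prompt("How does RAG ground answers", docs))
```

Grounding the prompt this way is what lets monitoring also measure relevance and accuracy: the answer can be compared against the retrieved context.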

You will leave this session with a comprehensive understanding of deploying LLMs to production and monitoring the models for issues such as toxicity, relevance, and accuracy.


[eventID:23187]