How to evaluate AI applications

Vertex AI Evaluation Service Tutorial Notebooks →

How do developers know if their AI applications are working effectively? How can developers measure AI performance? In this episode of Real Terms for AI, Googlers Aja Hammerly and Jason Davenport delve into creating golden datasets, defining essential metrics, and utilizing tools to measure any AI application's performance.
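
As a rough illustration of the golden dataset and metrics ideas named above, here is a minimal sketch in Python. It is not the Vertex AI Evaluation Service API: `GoldenExample`, `exact_match`, `evaluate`, and `toy_app` are hypothetical names invented for this example, and exact match is only one simple metric that an application might use.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class GoldenExample:
    """One curated prompt with the answer we expect the app to give."""
    prompt: str
    expected: str


def exact_match(expected: str, actual: str) -> float:
    """Return 1.0 when the answers match, ignoring case and outer whitespace."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def evaluate(
    app: Callable[[str], str],
    golden: Sequence[GoldenExample],
    metric: Callable[[str, str], float] = exact_match,
) -> float:
    """Run the app on every golden example and return the mean metric score."""
    if not golden:
        raise ValueError("golden dataset is empty")
    scores = [metric(ex.expected, app(ex.prompt)) for ex in golden]
    return sum(scores) / len(scores)


def toy_app(prompt: str) -> str:
    """Hypothetical stand-in for a real AI application."""
    answers = {"capital of France?": "Paris", "2 + 2?": "4"}
    return answers.get(prompt, "I don't know")


GOLDEN = [
    GoldenExample("capital of France?", "paris"),
    GoldenExample("2 + 2?", "4"),
    GoldenExample("largest planet?", "Jupiter"),
]

# Mean score across the golden dataset for the toy application.
score = evaluate(toy_app, GOLDEN)
print(f"exact-match score: {score:.2f}")
```

Re-running the same golden dataset after each prompt or data-source change gives a comparable score from one iteration to the next.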

Chapters:
0:00 - Welcome
0:34 - Evaluating models versus evaluating apps
1:31 - Grounding
2:17 - Sources of evaluation data
3:47 - Define metrics and evaluation
5:07 - Analyzing and understanding metrics
6:19 - Ongoing evaluation
7:48 - Summary

#GoogleCloud #GenerativeAI

Speakers: Aja Hammerly, Jason Davenport
Products Mentioned: Gemini, Cloud General, Vertex AI
Comments

I found the documentation titled "Gen AI evaluation service overview" to be more relevant to this video than the documentation linked in the video description.

Thanks for these videos.

ElliotK-IB

Very good methods for maintaining stability between iterations of an AI application. These techniques could also be applied in production: when the prompt or data source changes, route a certain percentage of traffic to the new settings through a gateway and observe how they behave. Then we can increase that share of traffic as we refine the new changes, until we reach 💯

davcoding
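
The gradual rollout described in the comment above could be sketched roughly as follows. This is an assumption-laden illustration, not any particular gateway's API: `bucket` and `choose_variant` are hypothetical helpers, and hashing a request or user ID into 100 buckets is just one common way to split traffic deterministically.

```python
import hashlib


def bucket(request_id: str) -> int:
    """Map an ID to a stable bucket in the range 0..99 using SHA-256."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_variant(request_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of IDs to the new settings ("canary")."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "canary" if bucket(request_id) < canary_percent else "stable"


# Count how many of 1,000 hypothetical users land on the canary at 10%.
ids = [f"user-{i}" for i in range(1000)]
canary_count = sum(choose_variant(i, 10) == "canary" for i in ids)
print(f"{canary_count} of {len(ids)} requests routed to the canary")
```

Because the bucket depends only on the ID, raising `canary_percent` keeps every ID that was already on the canary there and only adds new ones, which makes before-and-after comparisons easier.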

From what I understand, Vertex AI has a built-in evaluation service. How does it compare to off-the-shelf vendors like TruLens and Ragas?

dheer

Sir, please post some videos on implementing cloud security using cryptography and ML algorithms. Just an example of how to implement it would help.

ANIPOOSBEATZ