Testing an LLM | LLM Evaluating LLMs
Welcome to AI Testing Quest! 🚀
During the exploration of DeepEval and Evidently, it became obvious to me that it is common to use an LLM (usually the most recent OpenAI model) as a judge for test cases. But how trustworthy is our newly explored oracle? 🤖 💡👀
Can we measure that?
***SPOILER ALERT!***
Here are the resource links mentioned in the video:
📢 Stay Connected: We're building a community! Connect with me on social media to stay updated on the learning process as we dig deeper together!
🔔 Don't Forget to Subscribe: Hit the bell icon to get notified whenever we go live or upload new content!
#aitestingquest #learning #testing #llm #ai #llmasajudge #oracle