LLM Evaluation Basics: Datasets & Metrics

An introduction to evaluating large language models (LLMs): what a dataset is, how performance is measured, and how automatic and human evaluation are carried out.

Generative AI at MIT

Comments

I agree with the other commenter. Also, the flashing toolbar up top was very distracting.

jonnymiller

If there had been a code demo, it would have helped.

vigneshnagaraj
