Towards Reliable Use of Large Language Models: Better Detection, Consistency, and Instruction-Tuning

Christopher D. Manning (Stanford University)
Large Language Models and Transformers

While large pre-trained language models (LLMs) have enabled impressive results on a wide variety of tasks, even the largest existing models will answer inconsistently or head off in weird directions. For companies to be able to gain the benefits of these models in production use, it is now necessary to build an extensive tool ecosystem around the LLM engine, just as cars have seat belts, dash warning lights, and anti-lock brakes. In this talk, I will show recent work on three such tools. (1) ConCoRD: a lightweight method for improving LLM consistency through the use of off-the-shelf Natural Language Inference models. (2) DetectGPT: a method to better detect LLM-generated text by looking at the curvature of the model's probability function. (3) Direct Preference Optimization (DPO): a new way of learning to steer LLMs from human preference data without needing to learn a reward model. Joint work with Eric Mitchell, Chelsea Finn, and many other Stanford coauthors.
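
To make the first idea concrete: ConCoRD asks an off-the-shelf NLI model whether pairs of a model's candidate answers contradict each other, then selects the jointly most consistent set (the paper formulates this as weighted MaxSAT). Below is a minimal Python sketch of that idea, not the paper's implementation; the checkpoint name, the penalty weight, and the brute-force search are illustrative assumptions.

# Toy sketch of ConCoRD-style consistency re-ranking (the real method solves
# a weighted MaxSAT problem; this brute-force version only conveys the idea).
from itertools import product
from transformers import pipeline

# Off-the-shelf NLI model; this particular checkpoint is an illustrative choice.
nli = pipeline("text-classification", model="roberta-large-mnli")

def contradiction_prob(a: str, b: str) -> float:
    """Probability that statement b contradicts statement a, per the NLI model."""
    scores = nli({"text": a, "text_pair": b}, top_k=None)
    return next(s["score"] for s in scores if s["label"] == "CONTRADICTION")

def most_consistent(candidates, base_scores, penalty=1.0):
    """Pick one answer per question, trading the base model's confidence
    against NLI-detected contradictions among the chosen answers.
    candidates: list of per-question answer lists.
    base_scores: base_scores[q][answer] = base model's confidence in answer.
    Exhaustive search over combinations, so suitable only for tiny examples."""
    best, best_score = None, float("-inf")
    for picks in product(*candidates):
        score = sum(base_scores[q][a] for q, a in enumerate(picks))
        for i in range(len(picks)):
            for j in range(i + 1, len(picks)):
                score -= penalty * contradiction_prob(picks[i], picks[j])
        if score > best_score:
            best, best_score = picks, score
    return best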
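
DetectGPT's curvature test can also be stated compactly: text sampled from a model tends to lie near a local maximum of that model's log-probability, so perturbed rewrites of it score noticeably lower, while human-written text shows no such consistent drop. A hedged sketch, assuming GPT-2 as the scoring model and leaving the perturbation function abstract (the paper uses T5 mask filling):

# Sketch of the DetectGPT curvature score: the source text's log-probability
# minus the mean log-probability of perturbed rewrites. Model-written text
# tends to yield a clearly positive score; human text hovers near zero.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # illustrative scoring model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def avg_log_likelihood(text: str) -> float:
    """Mean per-token log-probability of text under the scoring model."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return -out.loss.item()  # loss is mean negative log-likelihood, so negate

def detectgpt_score(text: str, perturb, n: int = 20) -> float:
    """perturb is any function returning a semantically similar rewrite;
    higher scores suggest model-written text."""
    perturbed = [avg_log_likelihood(perturb(text)) for _ in range(n)]
    return avg_log_likelihood(text) - sum(perturbed) / n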
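
Finally, DPO replaces the usual RLHF pipeline (fit a reward model, then run PPO) with a single classification-style loss on preference pairs, using a frozen reference model in place of an explicit reward model. A minimal PyTorch sketch of the published loss; the function name and beta value are illustrative:

# Minimal sketch of the DPO objective. Each argument is the summed
# log-probability of a whole completion under the policy or the frozen
# reference model; y_w is the preferred completion, y_l the dispreferred one.
import torch.nn.functional as F

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """-log sigmoid(beta * [log pi(y_w)/ref(y_w) - log pi(y_l)/ref(y_l)])."""
    chosen = beta * (policy_logp_w - ref_logp_w)
    rejected = beta * (policy_logp_l - ref_logp_l)
    return -F.logsigmoid(chosen - rejected).mean()

Because the reference log-probabilities are constants, minimizing this loss directly increases the relative likelihood of preferred completions, which is the work a learned reward model plus PPO would otherwise do.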
Comments

stuffzoom:
Still need to learn more about the current state of the field, but hearing Chris Manning talk is just impressive: everything he says seems so obvious that it makes me think folks were just hacking around without really thinking. What a brilliant guy (and a brilliant team)!

But then again... it's one community, and everyone builds on what the folks before them found out.
smnt:
Love that Scott Aaronson is in the crowd asking questions.
AM-qxbq:
Great talk, even though the content is surprising given the title.
SantoshGupta-jnwn:
I wonder what was wrong with the HF TRL PPO implementation.