FAR Seminar: Ethan Perez – Sleeper Agents
Ethan Perez presented "Sleeper Agents: Deceptive LLMs that Persist Through Safety Training" on February 15, 2024 at FAR Labs, as part of the FAR Seminar series.
FAR․AI