The True Story of How GPT-2 Became Maximally Lewd

In this video, we recount an incident at OpenAI in which researchers were trying to fine-tune GPT-2 to be as helpful and ethical as possible. As the story goes, inadvertently flipping a single minus sign led GPT-2 to become the embodiment of a well-known cardinal sin.

#ai #aisafety #alignment

▀▀▀▀▀▀▀▀▀SOURCES & READINGS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

▀▀▀▀▀▀▀▀▀PATREON, MEMBERSHIP, KO-FI▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

▀▀▀▀▀▀▀▀▀SOCIAL & DISCORD▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

▀▀▀▀▀▀▀▀▀PATRONS & MEMBERS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

Riley Matthews
Vladimir Silyaev
Nathanael Moody
Alcher Black
RMR
Nathan Metzger
Monadologist
Glenn Tarigan
NMS
James Babcock
Colin Ricardo
Long Hoang
Tor Barstad
Gayman Crothers
Stuart Alldritt
Chris Painter
Juan Benet
Falcon Scientist
Jeff
Christian Loomis
Tomarty
Edward Yu
Ahmed Elsayyad
Chad M Jones
Emmanuel Fredenrich
Honyopenyoko
Neal Strobl
bparro
Danealor
Craig Falls
Vincent Weisser
Alex Hall
Ivan Bachcin
joe39504589
Klemen Slavic
blasted0glass
Scott Alexander
noggieB
Dawson
John Slape
Gabriel Ledung
Jeroen De Dauw
Craig Ludington
Jacob Van Buren
Superslowmojoe
Michael Zimmermann
Nathan Fish
Bleys Goodson
Ducky
Bryan Egan
Matt Parlmer
Tim Duffy
rictic
marverati
Luke Freeman
Dan Wahl
Ken Mc
leonid andrushchenko
Rey Carroll
William Clelland
ronvil
AWyattLife
codeadict
Lazy Scholar
Torstein Haldorsen
Supreme Reader
Michał Zieliński
뿌리와 가지있는 나무 connect

▀▀▀▀▀▀▀CREDITS▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

Animation:
Damon Edgson
Michela Biancini

Background Art:

Compositing:

Narrator:
Rob Miles

VO Editor:
Tony Dipiazza

Sound Design and Music:
Epic Mountain
Comments


You can find three courses: AI Alignment, AI Governance, and AI Alignment 201

You can follow AI Alignment and AI Governance even without a technical background in AI. AI Alignment 201, by contrast, presupposes that you have taken the AI Alignment course first and have knowledge equivalent to university-level courses on deep learning and reinforcement learning.

The courses consist of a selection of readings curated by experts in AI safety. They are available to all, so you can simply read them if you can’t formally enroll in the courses.

If you want to participate in the courses instead of just going through the readings by yourself, BlueDot Impact runs live courses which you can apply to. The courses are remote and free of charge. They consist of a few hours of effort per week to go through the readings, plus a weekly call with a facilitator and a group of people learning from the same material. At the end of each course, you can complete a personal project, which may help you kickstart your career in AI Safety.


You could also join Rational Animations’ Discord server at discord.gg/rationalanimations, and see if anyone is up to be your partner in learning.

RationalAnimations

"This model would be trained on...the internet."

Oh no.

d.n

How a single minus sign created the first artificial humiliation fetish

portobellomushroom

Can't believe ChatGPT went through puberty 😂

ryx

The idea that a single accidental deletion of a minus sign in a program can lead to an AI suddenly optimizing itself to do the opposite of what it was intended to is actually scary

loooongneck

I mean, if it was trying to emulate the internet then it did a pretty good job at it

supersain

“The code was turning every admonishment into encouragement”

“Punish me harder daddy” - GPT-2, apparently

jafogx

TL;DR:
"Don't generate bad responses"
"OK, wait, did you say do or don't do that?"

maxwell

I love how it's the same like with every sci-fi story where you can tell it went to hell when someone updated AI before going home.

piotrjanus

The closest AI has ever gotten to being human

SixDigitOsu

8:54 As a historian, I can indeed say that the Industrial Revolution was characterized by pounding oily, hot churn, pulsating; an machine orgy steamy engine thrusty.

everydayistacotuesday

"Alright Skynet, do *not* attempt to eliminate humanity."

Skynet: "Destroy humanity, gotcha."

Teruko

He knows no rules, no boundaries, he doesn’t flinch at torture, human trafficking or genocide

AlohaXChicken

The animator enjoyed making those faces just as much as the engineer making that "typo"

Konspirantas

RELEASE THE MODEL
DON'T LET THOUSANDS OF DOLLARS GO TO WASTE.

ceej

I adore how you made this seem like the AI's villain origin story

theoddfellow

And that's how AI Dungeon came to be. GPT-2 is their Griffin model.

CalzaTheFox

the world will not end with a whisper or a bang, but with a facepalm.

axeljoly

"Make it hornier my apprentice"
"But sir, i cant-"
"MAKE IT HORNIER!!"

robertsiems

In summary, Portal was shockingly close to describing how people actually try to control AI.

Connorses