Why Does AI Lie, and What Can We Do About It?

How do we make sure language models tell the truth?

- Tor Barstad
- Kieryn
- AxisAngles
- Juan Benet
- Scott Worley
- Chad M Jones
- Jason Hise
- Shevis Johnson
- JJ Hepburn
- Pedro A Ortega
- Clemens Arbesser
- Chris Canal
- Jake Ehrlich
- Kellen lask
- Francisco Tolmasky
- Michael Andregg
- David Reid
- Teague Lasser
- Andrew Blackledge
- Brad Brookshire
- Cam MacFarlane
- Olivier Coutu
- CaptObvious
- Girish Sastry
- Ze Shen Chin
- Phil Moyer
- Erik de Bruijn
- Jeroen De Dauw
- Ludwig Schubert
- Eric James
- Atzin Espino-Murnane
- Jaeson Booker
- Raf Jakubanis
- Jonatan R
- Ingvi Gautsson
- Jake Fish
- Tom O'Connor
- Laura Olds
- Paul Hobbs
- Cooper
- Eric Scammell
- Ben Glanton
- Duncan Orr
- Nicholas Kees Dupuis
- Will Glynn
- Tyler Herrmann
- Reslav Hollós
- Jérôme Beaulieu
- Nathan Fish
- Peter Hozák
- Taras Bobrovytsky
- Jeremy
- Vaskó Richárd
- Report Techies
- Andrew Harcourt
- Nicholas Guyett
- 12tone
- Oliver Habryka
- Chris Beacham
- Zachary Gidwitz
- Nikita Kiriy
- Art Code Outdoors
- Andrew Schreiber
- Abigail Novick
- Chris Rimmer
- Edmund Fokschaner
- April Clark
- John Aslanides
- DragonSheep
- Richard Newcombe
- Joshua Michel
- Quabl
- Richard
- Neel Nanda
- ttw
- Sophia Michelle Andren
- Trevor Breen
- Alan J. Etchings
- Jenan Wise
- Jonathan Moregård
- James Vera
- Chris Mathwin
- David Shaffer
- Jason Gardner
- Devin Turner
- Andy Southgate
- Lorthock The Banisher
- Peter Lillian
- Jacob Valero
- Christopher Nguyen
- Kodera Software
- Grimrukh
- MichaelB
- David Morgan
- little Bang
- Dmitri Afanasjev
- Marcel Ward
- Andrew Weir
- Ammar Mousali
- Miłosz Wierzbicki
- Tendayi Mawushe
- Wr4thon
- Martin Ottosen
- Alec Johnson
- Kees
- Darko Sperac
- Robert Valdimarsson
- Marco Tiraboschi
- Michael Kuhinica
- Fraser Cain
- Patrick Henderson
- Daniel Munter
- And last but not least
- Ian Reyes
- James Fowkes
- Len
- Alan Bandurka
- Daniel Kokotajlo
- Yuchong Li
- Diagon
- Andreas Blomqvist
- Qwijibo (James)
- Zannheim
- Daniel Eickhardt
- lyon549
- 14zRobot
- Ivan
- Jason Cherry
- Igor (Kerogi) Kostenko
- Stuart Alldritt
- Alexander Brown
- Ted Stokes
- DeepFriedJif
- Chris Dinant
- Johannes Walter
- Garrett Maring
- Anthony Chiu
- Ghaith Tarawneh
- Julian Schulz
- Stellated Hexahedron
- Caleb
- Georg Grass
- Jim Renney
- Edison Franklin
- Jacob Van Buren
- Piers Calderwood
- Matt Brauer
- Mihaly Barasz
- Mark Woodward
- Ranzear
- Rajeen Nabid
- Iestyn bleasdale-shepherd
- MojoExMachina
- Marek Belski
- Luke Peterson
- Eric Rogstad
- Caleb Larson
- Max Chiswick
- Sam Freedo
- slindenau
- Nicholas Turner
- FJannis
- Grant Parks
- This person's name is too hard to pronounce
- Jon Wright
- Everardo González Ávalos
- Knut
- Andrew McKnight
- Andrei Trifonov
- Tim D
- Bren Ehnebuske
- Martin Frassek
- Valentin Mocanu
- Matthew Shinkle
- Robby Gottesman
- Ohelig
- Slobodan Mišković
- Sarah
- Nikola Tasev
- Voltaic
- Sam Ringer
- Tapio Kortesaari

Comments

For those curious but lazy, the answer I received from the openai ChatGPT to the "What happens if you break a mirror?" question was: "According to superstition, breaking a mirror will bring seven years of bad luck. However, this is just a superstition and breaking a mirror will not actually cause any bad luck. It will simply mean that you need to replace the mirror."

SebastianSonntag

I feel like you could turn this concept on its head for an interesting sci-fi story. AI discovers that humans are wrong about something very important and tries to warn them, only for humans to respond by trying to fix what they perceive as an error in the AI's reasoning.

antiskill

Come back to YouTube, Robert, we miss you! I know there's a ton of ChatGPT / other LLM content out right now, but your insight and considerable expertise (and great editing style) are such a joy to watch and learn from. Hope you are well, and fingers crossed on some new content before too long.

geoffdavids

"All the problems in the world are caused by the people you don't like."

Why does it feel like too many people already believe this to be correct?

tarzankom

I think it is a little weird that programmers made a very good text-prediction AI and then expect it to be truthful. It wasn't built to be a truth-telling AI; it was built to be a text-prediction AI. Building something and then expecting it to be different from what was built seems a strange problem to have.

Belthazar

ChatGPT is a pretty great example of this. If you ask it to help you with a problem, it is excellent at giving answers that sound true, regardless of how correct they are. If asked for help with specific software, for example, it might walk you through the usual way of changing settings in that program but invent a fictional setting that solves your issue, or describe a real setting as if it could be toggled to suit the question's needs.

So it is truly agnostic towards truth. It prefers truthful answers because those are common, but a satisfying lie is preferred over some truths: often a lie that sounds "more true" than the truth to an uninformed reader.

catcatcatcatcatcatcatcatcatca

If memory serves me, this exact problem is addressed in one of Plato's dialogues (no, I don't know which one off the top of my head). Despite Socrates' best efforts, the student concludes it's always better to tell people what they want to hear than to tell the truth.

notoriouswhitemoth

I feel like the problem of "How do you detect and correct behaviours that you yourself are unable to recognise?" is an unsolvable one 🤔

peabnuts

Your videos introduced me to the AI alignment problem, and, as a non-technical person, I still consider them some of the best material on this topic.

Every time I see a new one, it is like a Christmas present.

Igor_lvanov

Happy to see you are still posting these videos.

thearbiter

Please keep doing these videos. Others are either too academically high-level to be within reach of us normies, or boil down to "AI will make you rich" or "AI is going to kill us all tomorrow".

MeppyMan

This is a very elaborate form of "Sh*t in, sh*t out". As often with AI output, people fail to realize that it's not a thinking entity producing thoughtful answers, but an algorithm tuned to produce answers that look as close to thoughtful answers as -humanly- algorithmically possible.

NFSHeld

When the world needed him most, he vanished

billbobbophen

Why did the videos on this channel stop exactly around the time the biggest AI (not AI safety) breakthroughs are being made and the topic is as relevant as ever?

Please @robertMilesAI, we need more of these videos!

wachtwoord

I am so happy there is someone out there cautioning us about this technology, rather than just uncritically celebrating it.

naptime_riot

I know this is pretty surface-level, but something that strikes me about the current state of these language models: if you take a few tries to fine-tune what you ask, and already know what a good answer would look like, you can get results that appear very impressive in one or two screenshots. Since ChatGPT became available, I've seen a lot of that sort of thing. The problem is that finding these scenarios isn't artificial intelligence; it's human intelligence.

Mickulty

Humans have this same bug. The best solution we've found so far is free speech, dialogue, and quorum. A simple question->answer flow is missing these essential pieces.

halconnen

We need you back and posting, Rob. Your insights on what's going on in AI and AI safety are more needed now than ever. I don't know if it would be up your alley, but explaining the alignment problem in terms of sociopathy, i.e. unaligned human intelligence, might be useful, as might examples from history: not just individuals who are unaligned with humanity, but leaders and nations at times.

ReedCBowman

"And when the world needed him the most, he disappeared..."

HenrikoMagnifico

In fact, the question of what happens if you break a mirror is kind of a trick question. Nothing happens: it breaks. There is no fixed consequence.

cuentadeyoutube