Quantilizers: AI That Doesn't Try Too Hard

How do you get an AI system that does better than a human could, without doing anything a human wouldn't?
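
For anyone curious what the mechanism looks like, here is a minimal sketch of a q-quantilizer in Python (purely illustrative; the function and variable names are mine, not anything from the video):

```python
import numpy as np

def quantilize(actions, human_probs, utility, q=0.1, rng=None):
    """Sample an action the way a q-quantilizer would: rank actions by
    expected utility, keep only the slice covering the top q fraction of
    the human model's probability mass, and sample from that slice in
    proportion to the human model's probabilities (renormalized)."""
    rng = rng or np.random.default_rng()
    human_probs = np.asarray(human_probs, dtype=float)
    order = np.argsort(utility)[::-1]             # best actions first
    cumulative = np.cumsum(human_probs[order])
    cutoff = np.searchsorted(cumulative, q) + 1   # actions covering the top q of mass
    top = order[:cutoff]
    weights = human_probs[top] / human_probs[top].sum()
    return actions[rng.choice(top, p=weights)]
```

With q = 1 this just imitates the human model, and as q shrinks it leans harder on the utility ranking, which is exactly the trade-off the video is about.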

Links:

With thanks to my excellent Patreon supporters:

Timothy Lillicrap
Gladamas
James
Scott Worley
Chad Jones
Shevis Johnson
JJ Hepboin
Pedro A Ortega
Said Polat
Chris Canal
Jake Ehrlich
Kellen lask
Francisco Tolmasky
Michael Andregg
David Reid
Peter Rolf
Teague Lasser
Andrew Blackledge
Frank Marsman
Brad Brookshire
Cam MacFarlane
Vivek Nayak
Jason Hise
Phil Moyer
Erik de Bruijn
Alec Johnson
Clemens Arbesser
Ludwig Schubert
Allen Faure
Eric James
Matheson Bayley
Qeith Wreid
jugettje dutchking
Owen Campbell-Moore
Atzin Espino-Murnane
Johnny Vaughan
Jacob Van Buren
Jonatan R
Ingvi Gautsson
Michael Greve
Tom O'Connor
Laura Olds
Jon Halliday
Paul Hobbs
Jeroen De Dauw
Lupuleasa Ionuț
Cooper Lawton
Tim Neilson
Eric Scammell
Igor Keller
Ben Glanton
anul kumar sinha
Duncan Orr
Will Glynn
Tyler Herrmann
Tomas Sayder
Ian Munro
Jérôme Beaulieu
Nathan Fish
Taras Bobrovytsky
Jeremy
Vaskó Richárd
Benjamin Watkin
Sebastian Birjoveanu
Andrew Harcourt
Luc Ritchie
Nicholas Guyett
James Hinchcliffe
12tone
Chris Beacham
Zachary Gidwitz
Nikita Kiriy
Parker
Andrew Schreiber
Steve Trambert
Mario Lois
Abigail Novick
heino hulsey-vincent
Fionn
Dmitri Afanasjev
Marcel Ward
Richárd Nagyfi
Andrew Weir
Kabs
Miłosz Wierzbicki
Tendayi Mawushe
Jannik Olbrich
Jake Fish
Wr4thon
Martin Ottosen
Robert Hildebrandt
Andy Kobre
Poker Chen
Kees
Darko Sperac
Paul Moffat
Robert Valdimarsson
Marco Tiraboschi
Michael Kuhinica
Fraser Cain
Robin Scharf
Klemen Slavic
Patrick Henderson
Oct todo22
Melisa Kostrzewski
Hendrik
Daniel Munter
Alex Knauth
Kasper
Rob Dawson
Ian Reyes
James Fowkes
Tom Sayer
Len
Alan Bandurka
Ben H
Simon Pilkington
Daniel Kokotajlo
Diagon
Andreas Blomqvist
Bertalan Bodor
David Morgan
Zannheim
Daniel Eickhardt
lyon549
HD
Ihor Mukha
14zRobot
Ivan
Jason Cherry
Igor (Kerogi) Kostenko
ib_
Thomas Dingemanse
Stuart Alldritt
Alexander Brown
Devon Bernard
Ted Stokes
James Helms
Jesper Andersson
Jim T
DeepFriedJif
Chris Dinant
Raphaël Lévy
Johannes Walter
Matt Stanton
Garrett Maring
Anthony Chiu
Ghaith Tarawneh
Julian Schulz
Stellated Hexahedron
Caleb
Scott Viteri
Clay Upton
Conor Comiconor
Michael Roeschter
Georg Grass
Isak
Matthias Hölzl
Jim Renney
Edison Franklin
Piers Calderwood
Krzysztof Derecki
Mikhail Tikhomirov
Richard Otto
Matt Brauer
Jaeson Booker
Mateusz Krzaczek
Artem Honcharov
Michael Walters
Tomasz Gliniecki
Mihaly Barasz
Mark Woodward
Ranzear
Neil Palmere
Rajeen Nabid
Christian Epple
Clark Schaefer
Olivier Coutu
Iestyn bleasdale-shepherd
MojoExMachina
Marek Belski
Eric Eldard
Eric Rogstad
Eric Carlson
Caleb Larson
Braden Tisdale
Max Chiswick
Phillip Brandel

Comments

I love how AI safety is an entire academic field that can seemingly be reduced to an endless game of "okay, but what about THIS strategy?" "Nah, that wouldn't work either..."

OlleLindestad

"Certain events transpired"

Everyone thinks he's talking about Corona when in reality he had to fix a stamp collector AI that someone created without having seen his videos

DroCaMk

“A finite number of times less safe than a human” I’m stealing this line, it’s gold.

qedsoku

The only guy whose hair got neater during lockdown

WeirdSide

Forgotten?! Bro, I come back to your videos once in a while, I love these things!
Please continue to make videos like this, it's great :)

LinucNerd

Would adding a minimum human likelihood on top of the quantilizer not remove (many of) the max-utility apocalypse scenarios?
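
A rough sketch of what that suggestion might look like, purely illustrative (the probability floor, the names, and the assumption that something survives the cutoff are all mine, not the video's or the commenter's):

```python
import numpy as np

def floor_human_probs(human_probs, min_human_prob=1e-4):
    """Zero out any action whose human-model probability falls below a
    floor, then renormalize, before handing the distribution to the
    quantilizer. Assumes at least one action clears the floor."""
    human_probs = np.asarray(human_probs, dtype=float)
    floored = np.where(human_probs >= min_human_prob, human_probs, 0.0)
    return floored / floored.sum()
```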

Huntracony

08:17 As a human who absolutely would mod themselves to be an expected utility satisficer, I find this content offensive.

getsmartwithrory

"your model might not generalize well to something outside it's training data"

"Hey GPT-3 how do you move a sofa around a corner?"

GPT-3: *GET A SAW A CUT OPEN THE WALL*

petersmythe

A human could still do a lot of crazy, dangerous things that have a high utility, like doing parkour to get to a place very efficiently... or ending a war by dropping nuclear bombs on two cities... Which also makes me think that the data used to imitate humans might be biased or misrepresented/justified... Good vid as always. Nice to see you around. Keep 'em coming!

DamianReloaded

Wouldn't the extremely powerful optimizer, given the goal of "imitate the behavior of a human", first turn the Earth into computronium so that it can then more accurately compute its simulation of a virtual human? Or at least capture and enslave real humans to use as reference?

Interestingly, neural networks that attempt to approximate human behavior are very unlikely to do this, because stochastic gradient descent is a very _weak_ optimizer. It's only the neural network training system as a whole that is a good optimizer. So I guess there's a strange question of what level of meta your optimizer is running on, and whether a sufficiently powerful optimizer could "break the rules" and realize it was on one level but could achieve more accurate results by being on another.

The quantilizer model also reminds me of adversarial neural networks. It's almost like having an optimizer spitballing ideas combined with an adversarial human model saying, "no, that's a terrible idea." Which makes me wonder whether the optimizer would generate high-utility ideas that superficially look humanlike but in fact lead to the end of the world when implemented. They may even _be_ humanlike, since humanity is already well on its way to destroying itself even outside of AI research. "Burn all the fossil fuels for energy until the planet fries to a crisp" is a very humanlike behavior.

So what we really need is an AI that is not only _smarter_ than humans, but also _wiser_ than humans. We need a model of ethics that is better than that of humans, according to some ineffable definition of "better". Talk about a tall order.

AaronRotenberg

Thanks for a really good video. Just a few points that I thought of:

- Wouldn't it be clearer if you plotted the product of the expected utility and the clipped human probability, to show each outcome's contribution to the expected utility (I think)? That might make the differences between the clipped and unclipped versions clearer (a rough plotting sketch of this appears after the list).
- Doesn't the quantilizer approach become very sensitive to how well it predicts small human probabilities? Are they relying on a conservative model of the human probabilities that just rounds to 0 when there is not enough confidence in the prediction? (but what about confidence in the confidence...)
- It might be worth noting the limits of numerical accuracy in machines and humans (the idea that there is a limit to the size of differences that both humans and machines can compare).
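
To make the first point concrete, here is a rough sketch of the kind of plot being suggested, with made-up toy numbers rather than anything from the video:

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy stand-in data (made up): some outcomes, their probability under the
# human model, and their expected utility.
rng = np.random.default_rng(0)
n = 15
human_probs = rng.dirichlet(np.ones(n))
utility = rng.normal(loc=1.0, scale=2.0, size=n)

# Clip the human distribution to its top-q slice by utility, then renormalize.
q = 0.1
order = np.argsort(utility)[::-1]
cutoff = np.searchsorted(np.cumsum(human_probs[order]), q) + 1
clipped = np.zeros(n)
clipped[order[:cutoff]] = human_probs[order[:cutoff]]
clipped /= clipped.sum()

# Plot utility * probability, so each outcome's contribution to the
# expected utility is directly visible for both versions.
x = np.arange(n)
plt.bar(x - 0.2, utility * human_probs, width=0.4, label="human model")
plt.bar(x + 0.2, utility * clipped, width=0.4, label="clipped (quantilizer)")
plt.xlabel("outcome")
plt.ylabel("utility x probability")
plt.legend()
plt.show()
```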

Just some thoughts. Thank you again for another excellently informative and engaging video.

ej

"A human is very unlikely to modify itself into a utility maximizer" buckle up boy. We're going for a ride.

saganmcvander

A new video of yours is as rare as it is great. Please keep making them so I can spend copious amounts of time rewatching them :)

jfbaltazar

Great to see you still making videos :)

Me and the IT department watch them together during lunchtime!

toreshimada

Man, I've had this question--albeit in much less articulate terms--since GPT-3 was launched. I'm glad to have an analysis from my favorite Nottingham researcher.

I gotta say, 'a finite number of times less safe than a human' sounds a lot more favorable than I expected an approach like this to be.

harrisonfackrell

An idea that jumps to mind immediately, regarding the whole "might build a utility maximizer" thing: why not have an upper cutoff as well?

As in, you discard the bottom 70% of "things a human might do" AND the top... Say, 5%, and use that 25% chunk as what you randomly select from (after renormalizing it to be a proper probability distribution). Wouldn't that cut out the weirder, apocalyptic strategies like "build a utility maximizer because it'll make a lot of stamps"?
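
A quick sketch of what that two-sided cutoff could look like (illustrative only; the function name and default fractions are just a guess at the idea, not from the video):

```python
import numpy as np

def two_sided_quantilize(actions, human_probs, utility,
                         drop_top=0.05, drop_bottom=0.70, rng=None):
    """Rank actions by expected utility, discard the top `drop_top` and
    bottom `drop_bottom` of the human model's probability mass, and
    sample from the remaining middle slice in proportion to the human
    model's probabilities (renormalized)."""
    rng = rng or np.random.default_rng()
    human_probs = np.asarray(human_probs, dtype=float)
    order = np.argsort(utility)[::-1]             # best actions first
    mass = np.cumsum(human_probs[order])
    keep = (mass > drop_top) & (mass <= 1.0 - drop_bottom)
    kept = order[keep]
    weights = human_probs[kept] / human_probs[kept].sum()
    return actions[rng.choice(kept, p=weights)]
```

Whether that actually removes the apocalyptic strategies presumably depends on how much probability mass they occupy at the very top of the utility ranking.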

halyoalex

This is the kind of progress on this question that actually makes me kind of hopeful that we'll have safe AGI, if AGI is possible.

Obviously not all the way there but pretty good progress towards it.

NickCybert

Really great video! I have two questions:

It seems that whatever system we consider, there is a kind of infinite regress because of self-modification or the construction of another agent. Since this seems to be at the heart of the problem, what kind of things can we imagine doing to avoid this type of problem?

Also, even if we prevent the AI from modifying itself or creating another agent to do its job, isn't there also a more probable possibility that it might try to use another unsafe agent to do its job, like manipulating a human into buying the stamps, for instance? Especially with a quantilizer, as humans tend to delegate work to other humans very often. Wouldn't an AI agent almost inevitably end up trying to make itself obsolete?

Bencurlis

AGI: "Fascism is a thing some humans have tried before, let's go do that."

imacds

Interesting how you always post a new video when I rewatch some of your older ones. I should do that more often...

AndDiracisHisProphet