Paul Christiano - Preventing an AI Takeover

Talked with Paul Christiano (world’s leading AI safety researcher) about:

- Does he regret inventing RLHF?
- What do we want post-AGI world to look like (do we want to keep gods enslaved forever)?
- Why he has relatively modest timelines (40% by 2040, 15% by 2030),
- Why he’s leading the push to get labs to develop responsible scaling policies, & what it would take to prevent an AI coup or bioweapon,
- His current research into a new proof system, and how this could solve alignment by explaining a model's behavior,
- and much more.

Open Philanthropy

Open Philanthropy is currently hiring for twenty-two different roles to reduce catastrophic risks from fast-moving advances in AI and biotechnology, including grantmaking, research, and operations.
The deadline to apply is November 9th; make sure to check out those roles before they close:

Timestamps
(00:00:00) - What do we want post-AGI world to look like?
(00:24:25) - Timelines
(00:45:28) - Evolution vs gradient descent
(00:54:53) - Misalignment and takeover
(01:17:23) - Is alignment dual-use?
(01:31:38) - Responsible scaling policies
(01:58:25) - Paul’s alignment research
(02:35:01) - Will this revolutionize theoretical CS and math?
(02:46:11) - How Paul invented RLHF
(02:55:10) - Disagreements with Carl Shulman
(03:01:53) - Long TSMC but not NVIDIA
Comments

It's super, super weird hearing extremely smart people confidently make such radical predictions about the near future.

david-fmgv

Geoffrey Hinton, who is one of the inventors of gradient descent and who also studied the human brain, is on record recently saying that gradient descent / transformers are more capable than the human brain. He did not use to believe that. He has been very surprised at how well they have performed and scaled, and it changed his opinion. If I remember correctly, he gave as an example how the human brain, with more resources than an LLM, is very limited in its knowledge compared to the relatively smaller LLM, which effectively manages to encode and store almost all of human knowledge.

Me__Myself__and__I
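
For readers unfamiliar with the term: gradient descent, mentioned in the comment above, is just the rule of repeatedly stepping a parameter against the derivative of a loss. A minimal illustrative sketch in Python (the quadratic loss, starting point, and learning rate are arbitrary choices for demonstration, not anything from the episode):

```python
# Minimal gradient descent: minimize f(w) = (w - 3)^2.
def grad(w):
    return 2 * (w - 3)   # derivative of (w - 3)^2

w = 0.0                  # arbitrary starting point
lr = 0.1                 # learning rate (step size)
for _ in range(50):
    w -= lr * grad(w)    # step against the gradient

print(w)                 # converges toward 3, the minimizer
```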

It’s so mind-blowing to see a guy who talks so constructively giving a prediction that there is a 40% chance of a Dyson sphere being constructed by 2040. This is just so insane.

The quick response from most people would probably be, "yeah, right, in your pipe dreams."

But we have to look at this objectively. These are really smart people who are given so much money and power and are probably really knowledgeable about what they talk about.

aalluubbaa

I often speed up videos to 1.25x. I slowed this one down to 0.75x.

kimholder

Dwarkesh going crazy with the content schedule 🔥👊😁

jameswin

Loved the Dyson Sphere question. Also, this must be the world record for the number of times the word "schlep" is used in a podcast episode, or anywhere!

lucabertinetto

Thanks Dwarkesh for drawing attention to some of the most important topics of our time

axelhjmark

honey, get the kids-- new dwarkesh just dropped!

ribeyes

You are documenting a discussion that is absolutely important for the future. No matter whether the future is dystopian or utopian, if there are still intelligent creatures living in 2325 that originated on planet Earth, they will be thankful for these records.

diga

Thanks for the good questions Dwarkesh

Crytoma

I was thinking I swear I recognize this guy from something. Turns out to be a docu I watched called "Hard Problems: The Road to the World's Toughest Math Contest". Very intriguing to see this is where he's at today.

kfkaesqu

Great guests man, love it as always, keep it coming!

DentoxRaindrops

How do you align something smarter than you that can instantly learn, evolve, and rewrite its own code? It's the humans that will be getting aligned, not the machines.

elderbob

surprising how honest and open he is about the fact that we are in uncharted territory and turbulent times are coming fast

mrpicky

the AI worrying about being in a human-made alignment simulation sounds a lot like how humans handle religion

jeffspaulding

The trick here is to imagine a monkey trying to align a human (the current superintelligence), staying in the loop and in control of what the human can or cannot do, to avoid a monkey apocalypse scenario!

Basically this is what we are talking about here: aligning a superintelligence superior in intelligence to all humans combined, able to decode AES-192 encrypted content in seconds, or more, far more than we could even imagine!

jeanchindeko
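
On the AES point above: a 192-bit key space is so large that "decoding in seconds" would mean breaking the cipher itself, not searching keys. A rough back-of-envelope sketch in Python (the 10^18 trials/second figure is a hypothetical, generously optimistic assumption, not a real benchmark):

```python
# Back-of-envelope: expected time to brute-force a 192-bit AES key.
# The trial rate below is a hypothetical, very generous assumption.
KEY_BITS = 192
TRIALS_PER_SECOND = 1e18        # assumed aggregate rate across all hardware
SECONDS_PER_YEAR = 3.156e7

keyspace = 2 ** KEY_BITS        # ~6.3e57 possible keys
expected_trials = keyspace / 2  # on average, half the space is searched
years = expected_trials / TRIALS_PER_SECOND / SECONDS_PER_YEAR

print(f"key space: {keyspace:.2e} keys")
print(f"expected time: {years:.2e} years")  # ~1e32 years, far beyond the age of the universe
```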

I'm more and more seeing the parallel between those on the “inside” who said Hillary was 99% a sure thing in 2016 and some of the AI experts who dismiss people like Eliezer Yudkowsky. I hope I'm wrong.

shirtstealer

I only understood about 45% of all that...but I think I went up 1 IQ point after. Thank you.

GlowboxD

I found the part where Dwarkesh brought up the moral dilemma of AI mistreatment disturbing, especially the part about reading minds. What about, Dwarkesh, the existing mind-reading capabilities of AI systems being developed to do that to humans? Does that make a blip on your morality radar?

I find most of the AI revolution sheer madness being thrust upon humanity by a very tiny fraction of humans. The hubris is off the charts.

The part about AIs fighting wars for us, as if that somehow is a freeing aspect for humanity, is just infuriatingly stupid, no? What, no human infrastructure would be destroyed, no humans killed, just AIs doing their own thing in their own AI war bubble? Get a grip.

I'm completely fine with the label "doomer" compared to this insanity.

flickwtchr

Host: “No, no, no, for the third time, I’m only asking about YOU. When would YOU PERSONALLY be happy handing off the baton to AI?”

Guest: “Well, I think what you need is humanity coming together, being involved, and deciding what we want that future to look like - so it’s not really about when I’m ready but more about collectively deciding what a meaningful future looks like…”

Me and host: 🤦🏽‍♂️

gregw