DeepMind AlphaZero - Mastering Games Without Human Knowledge

2017 NIPS Keynote by DeepMind's David Silver. Dr. David Silver leads the reinforcement learning research group at DeepMind and is lead researcher on AlphaGo. He graduated from Cambridge University in 1997 with the Addison-Wesley award.

Recorded: December 6th, 2017
Comments

The best exposition I've seen to date on what promises to be an AGI

palfers

Thank you for an excellent explanation. I'm looking forward to seeing where this leads.

SafeTrucking

Amazing talk, thanks to speaker and uploader

drancisdrake

❤ Thank you very much to the publisher. A beautiful lesson and demonstration.

petergreen

I'd love to see this in more complex and open-ended computer games. If you tell AlphaZero to play Cities: Skylines and maximize the population, with secondary constraints like environmental quality and RCI balance, I wonder what it would come up with.

kayrosis

I wonder if you could use this to analyze where a kid is going wrong in his math understanding for example, as a tool to teach kids math. It could pinpoint the area of confusion and help the kid bridge that and gain insight by providing simpler examples.

peters

It's a wonderful achievement.
I think that it has the potential to change the world.

alph

Just wait to see what will happen when we achieve "reinforcement learning learning": when reinforcement learning can improve the reinforcement learning algorithm itself.

kephalopod

If you look at the three graphs at 30:25, you'll notice "jumps" in all three curves. At a jump, from left to right, the curve starts to level off, and then abruptly shoots up nearly vertically again, the slope changing quite suddenly. There must be some significance to these jumps. Perhaps the algorithm has suddenly discovered a particularly effective heuristic for evaluating board positions, or the algorithm actually is developing something like human "insight" or "intuition" at these jumps.

forestpepper

And at this point Stockfish resigned the game

richiester

It's often described as RL without search, but there's always a search tree.

vegahimsa

Astonishing games from AlphaZero! Stockfish calculates 80 million positions per second. AlphaZero, 70,000. Human champion Carlsen can probably do 7. Human intuition is 10,000x better than AI, but the amazing part is that AI intuition is 1000x better than a brute-force approach. It seems that AI is about halfway there. BTW, the 3 players are not on equal footing: Stockfish would probably need a 10x or 100x increase in speed if it weren't equipped with tablebases, heuristics, openings, etc. How many years will it take for the 2nd half of the road to AGI? For starters, how long did it take for the first half?

peterpetrov

Thought he said he had automated several talks. I thought, man that would be super impressive.

Wemdiculous

I'm waiting for the day when AI will be able to design new games from scratch instead of just learning how to play already existing ones.

robostain_

To tell you the truth, my friends, I'm more afraid of this technology than I'm fascinated by it. Greetings! ;)

basteqss

I was high af watching this and I could only focus on this guy saying “uuuum”

asink

Just a thought. Has anyone considered what would happen if we could teach AlphaGo Zero to teach humans to play Go? How would that develop, and what kind of players (who had never played Go before) would that produce? And what would happen the day you put a traditional human player up against a player taught by AlphaGo Zero? I find it very interesting.

PaytonTroy

I hope they release the remaining 90 games between AlphaZero and Stockfish 8.

bunnygummybear

Yea, but can it perform on a cold wet night in Stoke....

duskie

Delusions (8:20) Reinforcement: That's the magic part. Pay attention.
