AlphaZero: DeepMind’s AI Works Smarter, not Harder

Errata: regarding the comment on the rules, the AI has no built-in domain knowledge other than the basic rules of the game.

📝 The paper "AlphaZero: Shedding new light on the grand games of chess, shogi and Go" is available here:

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, Jason Rollins, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Lorin Atzberger, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Morten Punnerud Engelstad, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Richard Reis, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga, Zach Doty.

Károly Zsolnai-Fehér's links:

#DeepMind #AlphaZero
Comments
Author

I had to reupload this video because the previous one contained an incorrect statement. This is the fixed version.

TwoMinutePapers
Author

It has to be added that "only trained for 4 hours" is kind of a misleading statement, as several thousand tensor processing units were used for the training (think of the best GPUs on the market, but optimized for machine learning calculations). This means that trying to train the network with more standard equipment like GTX 1080s, or even Tesla V100s, will take much longer than 4 hours.

dkkoala
Author

It's worth noting that yes, AlphaZero trains very quickly in absolute terms, but this is using 4 very powerful TPUs and 44 CPU cores. It then had the same compute power while playing the game, whereas Stockfish did not have any TPU or even GPU. Not to detract from their victory, but I think that's an important detail, especially when these competitions are done under time control and more compute can obviously provide an advantage.

MobyMotion
Author

As a chess enthusiast, I came across these papers last week. What a read. And THANK YOU for not making these awesome videos only 2 minutes long :D

karim
Author

"Give it a go" I see what you did there

danielemessina
Author

Really impressive. AI research is moving at incredible speed.

Macieks
Author

“The match results by themselves are not particularly meaningful because of the rather strange choice of time controls and Stockfish parameter settings: The games were played at a fixed time of 1 minute/move, which means that Stockfish has no use of its time management heuristics (lot of effort has been put into making Stockfish identify critical points in the game and decide when to spend some extra time on a move; at a fixed time per move, the strength will suffer significantly). The version of Stockfish used is one year old, was playing with far more search threads than has ever received any significant amount of testing, and had way too small hash tables for the number of threads. I believe the percentage of draws would have been much higher in a match with more normal conditions.”



Tord Romstad, one of the leading developers of Stockfish

karls
Author

It so happens that I am preparing this amazing paper for a seminar!! I have always been a fan of your channel, sir.

amrohendawi
Author

Do you feel that maybe the table at 2:49 is ever so slightly misleading? I think it would have been fairer to add the comparison in operations per evaluation to the competing algorithms. There's nothing wrong with AlphaZero being computationally more expensive, especially if the end result is better, but it should clearly be stated as such...

Also the training times skew public perception of just how much work goes into training and developing these systems.

I only say this because, as someone who people rely on to gain knowledge, I think it's always important to remove linguistic/statistical bias from any publications... easier said than done, but one can only suggest and hope!

donkisiko
Author

It wouldn't be a Two Minute Papers video without an "Oh my goodness, what a time to be alive!" 3:41

EduardCaliman
Author

Year 2050: AI has started ruling Earth.
Host: "Oh my god, what a time to be alive."

JohnCena-hujq
Author

AlphaZero's achievement was impressive, although no one seemed to mention the importance of MCTS here. AlphaZero's tree searching was far more effective than standard A/B engines. See the recent Leela 16-4 victory over Stockfish in the TCEC blitz match played at a 12 3 time control. Especially in faster play, a well-built tree is most effective. The results are very good evidence of that. It was impressive that Stockfish beat Leela 10-9 in the TCEC SuperFinal for season 14 which ended a few weeks ago.

ErikKislikChessSuccess
Author

Comparing the training time of AlphaZero (on a pretty powerful machine) to the development of Stockfish is a bit weird. AlphaZero also had some development time. Also, the number of hours is pretty uninformative compared to the number of test runs, or the number of hours combined with the exact machine specifications.

nilsp
Author

Very interesting in the chess example. AlphaZero demonstrates the first move advantage by white in its competition with Stockfish.

alert_xss
Author

AlphaZero's learning time isn't as impressive in one sense: it required 5000 first-gen TPUs generating games and 64 second-gen TPUs learning from those games. A measure of the computing power is 23 teraFlops per first-gen TPU and 45 teraFlops per second-gen TPU; that's a total of 117,880 teraFlops (is that called about 118 petaFlops?).

Now, traditional chess programs aren't using floating-point calculations to evaluate chess, so the speed of that isn't so important, but a measure of a desktop machine's ability to calculate in the CPU part only might be 200 gigaflops, and a measure of instructions per second might be 30,000 MIPS (the first is higher because of vector processing instructions).

That means Google's machines were between 100,000 and a million times faster than an average computer.

So 4 days' training time on Google's cluster might have been equivalent to up to 100 centuries of training time on a home computer that's just using the CPU and no GPU or TPU to calculate.
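
The arithmetic in this comment can be checked with a short back-of-the-envelope script. This is only a sketch: every number in it is the commenter's own estimate (TPU counts, teraFlops per chip, the 200-gigaflop desktop CPU, the 4 days of training), and the function names are made up for illustration. With those inputs, 5000 × 23 + 64 × 45 comes to 117,880 teraFlops, which stays inside the quoted "100,000 to a million times faster" range.

```python
# Back-of-the-envelope check using only the figures quoted in the comment.
# All throughput numbers are the commenter's estimates, not measured values.

FIRST_GEN_TPUS = 5000      # TPUs generating self-play games (as quoted)
FIRST_GEN_TFLOPS = 23      # teraFlops per first-gen TPU (as quoted)
SECOND_GEN_TPUS = 64       # TPUs training the network (as quoted)
SECOND_GEN_TFLOPS = 45     # teraFlops per second-gen TPU (as quoted)
DESKTOP_CPU_GFLOPS = 200   # CPU-only desktop estimate (as quoted)
DAYS_PER_YEAR = 365.25


def cluster_tflops() -> int:
    """Total quoted throughput of the cluster, in teraFlops."""
    return (FIRST_GEN_TPUS * FIRST_GEN_TFLOPS
            + SECOND_GEN_TPUS * SECOND_GEN_TFLOPS)


def speedup_over_desktop() -> float:
    """How many times faster the cluster is than the quoted desktop CPU."""
    return cluster_tflops() * 1000 / DESKTOP_CPU_GFLOPS


def desktop_years(cluster_days: float, speedup: float) -> float:
    """Years a desktop would need to match `cluster_days` on the cluster."""
    return cluster_days * speedup / DAYS_PER_YEAR


if __name__ == "__main__":
    speedup = speedup_over_desktop()
    print(f"cluster: {cluster_tflops():,} teraFlops")
    print(f"speedup over desktop CPU: {speedup:,.0f}x")
    print(f"4 cluster days at that speedup: {desktop_years(4, speedup):,.0f} years")
    print(f"4 cluster days at 1,000,000x: {desktop_years(4, 1_000_000):,.0f} years")
```

Under these assumptions the flops-based estimate lands at roughly 65 centuries of desktop time, and the million-fold upper bound at roughly 110 centuries, so "up to 100 centuries" is the right order of magnitude. Raw flops ignore memory bandwidth and how well the workload parallelizes, so this is a rough comparison at best.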

joshuascholar
Author

I saw the comment about Lc0, and I want to point that out too. According to their web pages, it's an AlphaZero-based network with a different topology. However, it does something extraordinary in its winning endgames: it seems to toy with its opponent. Instead of playing the directly winning moves to finish, it intentionally prolongs the game by using sidelines. It also trains itself by playing against human players; I assume that is the reason. It's worth checking a couple of its games from TCEC.

merkwur
Author

The comment that Carlsen thinks about what AlphaZero would do during his games is taken completely out of context. He doesn't soul-search for AlphaZero-like moves. He's known to dislike computer influence in his play, as it's unreasonable to expect the accuracy and depth of calculation needed to make such a style feasible. I'm also afraid that, as mentioned by others, the time of 4 hours to learn does not mean much if it takes years on my computer, whereas Stockfish can play at 3200 right away. It's great progress; I'm strictly speaking of the message in the video.

ahilanpalarajah
Author

It's really easy when the problem takes place in a well-defined constrained setting without any fuzzy variables.


Fortunately there are many such problems that exist=)

forgotaboutbre
Author

I think that too many people are focusing on the game, which I also follow, as if this were an ordinary player. Since I have significant knowledge, and since I believe that Hawking and Musk were right, I am really anxious about the self-taught nature of this AI.

This particular AI is not the worrisome thing, although it has obvious potential applications in military logistics, military strategy, etc. The really scary part is how fast this was developed after AlphaGo debuted.

We are not creeping up on the goal of human-level intelligence. We are likely to shoot past that goal amazingly soon without even realizing it, if things continue progressing as they have.

The first AIs will also be narrow and not very competent or threatening, even if they become "superhuman" in intelligence. They will also be harmless, idiot savants at first.

Upcoming Threat to Humanity.
The scary thing is the fact that computer speed (and thereby, probably eventually AI intelligence) doubles about every year, and will likely double faster when super-intelligent AIs start designing chips, working with quantum computers as co-processors, etc. How fast will our AIs progress to such levels that they become indispensable -- while their utility makes hopeless any attempts to regulate them or retroactively impose restrictions on beings that are smarter than their designers?

At first, they may have only base functions, like the reptilian portion of our brain. However, when will they act like Nile crocodiles and react to any threat with aggression? Ever gone skinny dipping with Nile crocodiles?

I fear that very soon, before we realize it, we will all be doing the equivalent of skinny dipping with Nile crocodiles, because of how fast AIs will develop by the time that the children born today reach their teens or middle age. Like crocodiles that are raised by humans, AIs may like us for a while. I sure hope that lasts. As the announcer in Jeopardy said about a program that was probably not really an advanced AI long ago, I, for one, welcome our future, AI overlords.

mim
Author

In the future, how will we handle updating a program that a neural network has studied? I imagine it would be difficult to add new features and have the AI utilise them without some form of prompting. Would the network adapt to fully utilise the new skills it picks up, or would the existing dominant strategy remain in effect? Would retraining from scratch or from a 'save' point in its training be more efficient?

mrsuper