CICERO: An AI agent that negotiates, persuades, and cooperates with people

#ai #cicero #diplomacy

A team from Meta AI has developed Cicero, an agent that can play the game Diplomacy, in which players have to communicate via chat messages to coordinate and plan into the future.

Paper Title: Human-level play in the game of Diplomacy by combining language models with strategic reasoning

OUTLINE:
0:00 - Introduction
9:50 - AI in cooperation games
13:50 - Cicero agent overview
25:00 - A controllable dialogue model
36:50 - Dialogue-conditional strategic planning
49:00 - Message filtering
53:45 - Cicero's play against humans
55:15 - More examples & discussion

Abstract:
Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge. We introduce Cicero, the first AI agent to achieve human-level performance in Diplomacy, a strategy game involving both cooperation and competition that emphasizes natural language negotiation and tactical coordination between seven players. Cicero integrates a language model with planning and reinforcement learning algorithms by inferring players' beliefs and intentions from its conversations and generating dialogue in pursuit of its plans. Across 40 games of an anonymous online Diplomacy league, Cicero achieved more than double the average score of the human players and ranked in the top 10% of participants who played more than one game.

Authors: Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, Athul Paul Jacob, Mojtaba Komeili, Karthik Konath, Minae Kwon, Adam Lerer, Mike Lewis, Alexander H. Miller, Sasha Mitts, Adithya Renduchintala, Stephen Roller, Dirk Rowe, Weiyan Shi, Joe Spisak, Alexander Wei, David Wu, Hugh Zhang, Markus Zijlstra

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
Comments
Author

Diplomacy is a super cool game. You talk about 'optimal strategies' like in chess, but you gotta keep in mind that in this game, if you grow too fast, the other players gang up on you. 'Optimal' often means something very different in games where multiple agents can stab each other. It's like four-player chess, where grandmasters often find themselves losing while people who are worse at calculation consistently win: just finding the 'best move' and expecting everyone else to go along with it can be pretty devastating from a risk-management perspective, since other players might value different things more, like appearing non-threatening or eliminating the strongest-looking player instead of maximizing their return.

jamesantonisenior
Author

Hit the nail on the head with all those takes, and great paper explain! Thanks again Yannic 👌

oncedidactic
Author

"Trusting your opponent" actually does happen in chess. Particularly if you're the higher-rated player, people will often assume your attack is sound and focus on how to defend rather than trying to refute it outright, even when outright refutation may actually be the best choice.

MrJaggy
Author

In October of 1983, I chose as an epigraph some words of Gottfried Wilhelm von Leibniz (he loved Plato): "Let us calculate!" In Leibniz's Latin this exhortation was actually just one word: "Calculemus!" Leibniz was an optimist (he was the model of Voltaire's Dr. Pangloss), and he saw a bright future for what we would now call algorithmic thinking. Calculation would be the key to settling all human conflicts and disagreements, he believed.

Aristocle
Author

What you call "tilt" sounds to me like fairly rational moves in an iterated prisoner's dilemma. The human professional who everyone knows "goes on tilt" is safer because of it!

charliesteiner
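
The point in the comment above, that a known tendency to "tilt" can deter betrayal in an iterated prisoner's dilemma, can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the payoff values are the usual textbook ones, "tilt" is modelled as a grim-trigger strategy, and all function names are hypothetical. It has nothing to do with how Cicero itself plans.

```python
# Textbook prisoner's dilemma payoffs for the first player (illustrative values).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def grim_trigger(opponent_history):
    """Cooperate until the opponent defects once, then defect forever ("tilt")."""
    return "D" if "D" in opponent_history else "C"

def always_cooperate(opponent_history):
    """Never retaliate, whatever the opponent has done."""
    return "C"

def stab_at(round_index):
    """Build a strategy that cooperates, then defects from round_index onward."""
    def strategy(opponent_history):
        return "D" if len(opponent_history) >= round_index else "C"
    return strategy

def play(strategy_a, strategy_b, rounds):
    """Return the total payoffs (a, b) after the given number of rounds."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

# Payoff of staying loyal to a player who tilts.
loyal_score, _ = play(always_cooperate, grim_trigger, rounds=10)
# Payoff of stabbing a player who tilts.
stab_vs_tilter, _ = play(stab_at(4), grim_trigger, rounds=10)
# Payoff of stabbing a player who never retaliates.
stab_vs_pushover, _ = play(stab_at(4), always_cooperate, rounds=10)

print(loyal_score, stab_vs_tilter, stab_vs_pushover)
```

In this toy setting, stabbing the player who never retaliates pays best, while stabbing the player known to tilt pays less than simply staying loyal, which is the sense in which the tilting player is "safer".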
Author

in the future diplomacy will be algorithmically generated

YUTPIA
Author

I grab myself some food and come back to this drawing 1:22.
Excellent!

markusfassbinder
Author

Very interesting research project. I even paused the video to read the original paper first. I wonder when a lightweight version of a similar system will be deployed in an actual game (server-side or locally).

ec
Author

Tilt isn't an emotional breakdown, tilt is a metagame. The AI doesn't care about the meta game. Humans love to leverage the meta game.

For example, if you know by my play history, that I am going to spend the rest of the game trying to make sure you lose, if you ever make a move against me, you might consider not making a move against me. From that position of reputation, I have gained an advantage.

Now of course, you might also decide to not invite me to game night anymore.

dialecticalmonist
Author

I'm a little confused by your "human element" take.
It isn't possible to win diplomacy by yourself. You need cooperation and coordination.
So the idea that establishing trust wouldn't be a major factor for a bot seems off to me.
The likelihood of established trust should be a larger predictor of moves than a purely game theoretical aspect of the game.
Maybe I'm misinterpreting your take though
Great video regardless though :)

ThatFroKid
Author

Disappointing under the hood, but I wonder whether we are missing something, because it's unclear how a dialogue like the negotiation at 55:17 could take place with such an architecture.

PasseScience
Author

If you're trying to say trust doesn't make sense in Diplomacy because it doesn't make sense in chess, I feel like you haven't played Diplomacy. Reputation and trust are a huge deal. A strong alliance can easily crush the other players unless they put aside their differences to stop it. Knowing when to stop playing for yourself and cooperate to stop the board leader from soloing is important. Making sure other players know a stab will be punished is important. Etc.

solsystem
Author

we just gonna... great intro btw really evocative

andrewferguson
Author

I believe in most games the difference between an average player and a professional is quite large. So being in the top 10% is like halfway there, and you will still be destroyed by a pro every time.
E.g. chess ratings: 50th percentile ~1000, top 10% ~1500, pro 2500+; Dota ratings: 50th percentile ~2300, top 10% ~4200, pro 10k+. So the difference between the top 10% and a professional is 2-3 times bigger than the difference between average and the top 10%.
Also, the more difficult/complex a game is, the bigger the difference in skill will be, and the more it takes to become the best at it.

Rizhiy
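
The "2-3 times bigger" ratio in the comment above can be checked with a quick calculation. This is a minimal sketch that takes the commenter's approximate rating figures at face value (they are not verified here); the dictionary layout and the `gap_ratio` helper are hypothetical names introduced for illustration.

```python
# Approximate ratings quoted in the comment (commenter's figures, not verified).
ratings = {
    "chess": {"median": 1000, "top10": 1500, "pro": 2500},
    "dota": {"median": 2300, "top10": 4200, "pro": 10000},
}

def gap_ratio(r):
    """Return (pro - top10) / (top10 - median) for one game's ratings."""
    return (r["pro"] - r["top10"]) / (r["top10"] - r["median"])

ratios = {game: gap_ratio(r) for game, r in ratings.items()}
for game, ratio in ratios.items():
    print(f"{game}: {ratio:.2f}x")
```

With these inputs the gap from the top 10% to a professional comes out at about 2 times (chess) and about 3 times (Dota) the gap from the median to the top 10%, which matches the range given in the comment.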
Author

In natural language processing, a good challenge could be to build Magic: The Gathering decks and play them successfully.

PasseScience
Author

I think that the next time you cover new models you should also try to answer the question: "what does it want?" :D

ArcaLuiNeo
Author

0:00 to 1:30
Johnson: [Noticing Dr. Evil's spaceship on radar] Colonel, you better have a look at this radar.
Colonel: What is it, son?
Johnson: I don't know, sir, but it looks like a giant--
Jet Pilot: Dick.
Dick: Yeah?
Jet Pilot: Take a look out of starboard.
Dick: Oh my God, it looks like a huge--
Bird-Watching Woman: Pecker.
Bird-Watching Man: [raising binoculars] Ooh, Where?
Bird-Watching Woman: Wait, that's not a woodpecker, it looks like someone's--
Army Sergeant: Privates! We have reports of an unidentified flying object. It has a long, smooth shaft, complete with--
Baseball Umpire: Two balls.
[looking up from game]
Baseball Umpire: What is that. It looks just like an enormous--
Chinese Teacher: Wang, pay attention!
Wang: I was distracted by that giant flying--
Musician: Willie.
Willie Nelson: Yeah?
Musician: What's that?
Willie Nelson: [squints] Well, that looks like a giant--
Colonel: Johnson?!
Johnson: Yes, sir?
Colonel: Get on the horn to British Intelligence and let them know about this.

impromptu_ninja
Author

Is there already a video of yours on the recent AI from DeepMind, which also plays Diplomacy and was published in Nature Communications in December?

nicohambauer
Author

You know what would be cool? There is a social, text-based deduction game called "Untrusted" with a small community. Could you try to implement an AI that goes unnoticed? It would have to try to win as its faction while making people believe it's "netsec" (the "good" side that's trying to "hack" a server).

Edit: every round is also logged: every private message and chat/agent chat message from every round is on their site. PERFECT for fine-tuning.

zero.Identity