On Tuesday, Meta AI announced the development of Cicero, which it says is the first AI to achieve human-level performance in the strategic board game Diplomacy. This is a notable achievement because the game requires deep interpersonal negotiation skills, which implies that Cicero has acquired some command of the language needed to win.
Even before Deep Blue beat Garry Kasparov at chess in 1997, board games were a useful measure of AI success. In 2016, another barrier fell when AlphaGo defeated Go master Lee Sedol. Both of these games follow a relatively clear set of analytic rules (though Go’s rules are generally simplified for computer AI).
But with Diplomacy, much of the gameplay involves social skills. Players must show empathy, use natural language, and build relationships to win, a difficult task for a computer player. With that in mind, Meta asked, “Can we create more efficient and flexible agents who can use language to negotiate, persuade, and work with people to achieve strategic goals similar to how humans do?”
According to Meta, the answer is yes. Cicero learned its skills by playing an online version of Diplomacy on webDiplomacy.net. Over time, it became a master of the game, reportedly achieving “more than double the average score” of human players and ranking in the top 10% of people who played more than one game.
To create Cicero, Meta brought together AI models for strategic reasoning (similar to AlphaGo) and natural language processing (similar to GPT-3) and combined them into a single agent. During each game, Cicero looks at the state of the game board and the conversation history and predicts how the other players will act. It then formulates a plan and executes it through a language model that generates human-like dialogue, allowing it to coordinate with other players.
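To make that loop concrete, here is a toy Python sketch of the predict, plan, and talk cycle described above. It is not Meta's code: `GameState`, `predict_actions`, `choose_plan`, `generate_message`, and `take_turn` are hypothetical names, and a keyword check and a text template stand in for the strategic reasoning model and the dialogue model.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    # Hypothetical, heavily simplified view of a Diplomacy game.
    board: dict          # territory name -> power that controls it
    chat_history: list   # (speaker, message) tuples, oldest first


def predict_actions(state, me):
    """Stand-in for the strategic model: guess what each other power will do."""
    predictions = {}
    for power in sorted(set(state.board.values()) - {me}):
        said = " ".join(
            msg.lower() for speaker, msg in state.chat_history if speaker == power
        )
        # Assumption: a power that has offered support is likely to cooperate.
        predictions[power] = "cooperate" if "support" in said else "attack"
    return predictions


def choose_plan(predictions):
    """Pick an intent: work with the first likely cooperator, otherwise hold."""
    allies = [p for p, action in predictions.items() if action == "cooperate"]
    if allies:
        return {"action": "advance", "ally": allies[0]}
    return {"action": "hold", "ally": None}


def generate_message(plan):
    """Stand-in for the dialogue model: a message conditioned on the plan."""
    if plan["ally"] is None:
        return "I am holding my position this turn."
    return f"{plan['ally']}, I plan to advance this turn. Can you support me?"


def take_turn(state, me):
    """One turn: read board and chat, predict others, plan, then speak."""
    predictions = predict_actions(state, me)
    plan = choose_plan(predictions)
    return plan, generate_message(plan)


state = GameState(
    board={"Paris": "France", "London": "England", "Berlin": "Germany"},
    chat_history=[
        ("England", "I can support you into Belgium."),
        ("Germany", "Stay out of my way."),
    ],
)
plan, message = take_turn(state, "France")
print(message)
```

The point of the sketch is the ordering: the message is generated from the plan, so what the agent says stays tied to what it intends to do.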
Meta calls Cicero’s natural language skills a “controllable dialogue model,” which is central to Cicero’s personality. Like GPT-3, Cicero draws on a large corpus of Internet text scraped from the web. “To build a controllable dialog model, we started with a 2.7 billion parameter BART-like language model pre-trained on text from the Internet and refined on over 40,000 human games on webDiplomacy.net,” writes Meta.
The resulting model mastered the intricacies of a complex game. “Cicero can deduce, for example, that later in the game he will need the support of a particular player,” Meta explains, “and then strategize how to curry favor with that person and even recognize the risks and the opportunities that player sees from their particular perspective.”
Meta’s research on Cicero has been published in the journal Science under the title “Human-level play in the game of Diplomacy by combining language models with strategic reasoning.”
As for broader applications, Meta suggests its Cicero research could “loosen communication barriers” between humans and AI, such as maintaining a long-term conversation to teach someone a new skill. Or it could power a video game where NPCs can talk like humans, understand player motivations, and adapt along the way.
At the same time, this technology could be used to manipulate humans by impersonating people and deceiving them in potentially dangerous ways, depending on the context. With that in mind, Meta hopes that other researchers can build on its code “responsibly,” and says it has taken measures to detect and remove “toxic messages in this new domain,” which probably refers to dialogue that Cicero learned from the Internet text it ingested, always a risk for large language models.
Meta provided a detailed site to explain how Cicero works and also open-sourced Cicero’s code on GitHub. Online Diplomacy fans, and maybe even the rest of us, may have to be careful.