In a study published in the journal Science, researchers from Carnegie Mellon University in the US detailed how their AI achieved superhuman performance by breaking the game into computationally manageable parts and, based on its opponents' game play, fixing potential weaknesses in its strategy during the competition.
AI programs have defeated top humans in checkers, chess and Go - all challenging games, but ones in which both players know the exact state of the game at all times.
In a 20-day competition involving 120,000 hands at Rivers Casino in Pittsburgh in January, Libratus became the first AI to defeat top human players at heads-up no-limit Texas Hold'em poker - the primary benchmark and long-standing challenge problem for imperfect-information game-solving by AIs.
Libratus beat each of the players individually in the two-player game and collectively amassed more than USD 1.8 million in chips.
"The techniques in Libratus do not use expert domain knowledge or human data and are not specific to poker. Thus they apply to a host of imperfect-information games," researchers said.
Libratus includes three main modules. The first computes an abstraction of the game that is smaller and easier to solve than the full game, which has about 10^161 (a 1 followed by 161 zeros) possible decision points.
It then creates its own detailed strategy for the early rounds of Texas Hold'em and a coarse strategy for the later rounds. This strategy is called the blueprint strategy.
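To make the idea of abstraction concrete, here is a minimal sketch in Python. It is not the researchers' algorithm: the bet sizes, the function names and the figure of 200 possible bets are assumptions chosen only for illustration. It shows how treating similar bets as one abstract bet shrinks the number of betting sequences a solver has to consider.

```python
# Hypothetical abstract bet sizes, expressed as fractions of the pot.
# These values are illustrative assumptions, not taken from Libratus.
ABSTRACT_BETS = [0.5, 1.0, 2.0]


def nearest_abstract_bet(bet_fraction, abstract_bets=ABSTRACT_BETS):
    """Map a real bet (as a fraction of the pot) to the closest abstract size."""
    return min(abstract_bets, key=lambda size: abs(size - bet_fraction))


def count_sequences(num_actions, depth):
    """Number of distinct action sequences when each of `depth` decisions
    offers `num_actions` choices."""
    return num_actions ** depth


if __name__ == "__main__":
    # A bet of 1.4 times the pot is treated as a pot-sized bet.
    print(nearest_abstract_bet(1.4))
    # Assumed toy numbers: 200 distinct bet sizes versus 3 abstract ones,
    # over four consecutive decisions.
    print(count_sequences(200, 4), "sequences in the toy full game")
    print(count_sequences(3, 4), "sequences in the toy abstraction")
```

The smaller tree is cheaper to solve, at the cost of losing detail about bets that fall between the abstract sizes.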
In the final rounds of the game, a second module constructs a new, finer-grained abstraction based on the state of play.
The third module is designed to improve the blueprint strategy as competition proceeds. Typically, AIs use machine learning to find mistakes in the opponent's strategy and exploit them.
However, that also opens the AI to exploitation if the opponent shifts strategy, said Tuomas Sandholm, a professor of computer science at Carnegie Mellon and a co-author of the study.
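The risk can be illustrated with a toy example far simpler than poker. The sketch below uses rock-paper-scissors, with made-up opponent frequencies, and is an illustration of the general point rather than anything taken from Libratus. A player who tunes its play to an opponent's observed habits wins while those habits last, but loses heavily once the opponent shifts, whereas the balanced strategy cannot be beaten on average.

```python
MOVES = ["rock", "paper", "scissors"]

# PAYOFF[mine][theirs] is the payoff to "me": +1 win, 0 draw, -1 loss.
PAYOFF = {
    "rock": {"rock": 0, "paper": -1, "scissors": 1},
    "paper": {"rock": 1, "paper": 0, "scissors": -1},
    "scissors": {"rock": -1, "paper": 1, "scissors": 0},
}


def expected_payoff(my_strategy, their_strategy):
    """Average payoff to me when both sides play the given mixed strategies."""
    return sum(
        my_strategy[m] * their_strategy[t] * PAYOFF[m][t]
        for m in MOVES
        for t in MOVES
    )


def best_response(their_strategy):
    """Pure strategy that maximises my payoff against a fixed opponent."""
    best = max(
        MOVES,
        key=lambda m: sum(their_strategy[t] * PAYOFF[m][t] for t in MOVES),
    )
    return {m: 1.0 if m == best else 0.0 for m in MOVES}


if __name__ == "__main__":
    # Assumed observation: the opponent throws rock half the time.
    observed = {"rock": 0.5, "paper": 0.25, "scissors": 0.25}
    exploiter = best_response(observed)           # always plays paper
    shifted = {"rock": 0.0, "paper": 0.0, "scissors": 1.0}
    balanced = {m: 1 / 3 for m in MOVES}

    print(expected_payoff(exploiter, observed))   # gain while habits last
    print(expected_payoff(exploiter, shifted))    # loss once opponent adapts
    print(expected_payoff(balanced, shifted))     # balanced play breaks even
```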
Instead, Libratus' self-improver module analyses opponents' bet sizes to detect potential holes in Libratus' blueprint strategy.
Libratus then computes strategies for these missing decision branches and adds them to the blueprint.
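A rough sketch of that self-improvement loop, under stated assumptions, might look like the following Python. The function names, the tolerance and the selection rule (pick the most frequently seen bet sizes that lie far from any size already covered) are hypothetical; the article does not give the researchers' actual criteria. The expensive step of computing a strategy for each new branch is left out.

```python
from collections import Counter


def find_gaps(observed_bets, abstract_bets, tolerance=0.25):
    """Return observed bet sizes (as fractions of the pot) that are farther
    than `tolerance` from every abstract size, most frequent first."""
    gaps = Counter()
    for bet in observed_bets:
        distance = min(abs(bet - size) for size in abstract_bets)
        if distance > tolerance:
            gaps[round(bet, 2)] += 1
    return [size for size, _ in gaps.most_common()]


def extend_abstraction(abstract_bets, observed_bets, max_new=1, tolerance=0.25):
    """Add up to `max_new` of the most common uncovered bet sizes."""
    new_sizes = find_gaps(observed_bets, abstract_bets, tolerance)[:max_new]
    return sorted(set(abstract_bets) | set(new_sizes))


if __name__ == "__main__":
    # Illustrative data only: sizes the blueprint covers, and bets seen today.
    blueprint_bets = [0.5, 1.0, 2.0]
    seen_today = [0.5, 1.5, 1.5, 0.6, 3.0, 1.5]
    print(find_gaps(seen_today, blueprint_bets))
    print(extend_abstraction(blueprint_bets, seen_today))
```

Note that this looks only at where the program's own coverage is thin, not at mistakes in the opponents' play, which is the distinction the researchers draw.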
"The techniques that we developed are largely domain independent and can thus be applied to other strategic imperfect-information interactions, including non-recreational applications," researchers said.
"Due to the ubiquity of hidden information in real-world strategic interactions, we believe the paradigm introduced in Libratus will be critical to the future growth and widespread application of AI," they said.
Disclaimer: No Business Standard Journalist was involved in creation of this content
