The poker-playing AI
Researchers recently announced that they'd developed a poker bot that had weakly solved heads-up limit hold'em poker. That is, essentially the poker bot is unbeatable in a fair game, luck notwithstanding.
But although bluffing looks like a very human, psychological element of the game, it’s not. You can calculate how to bluff optimally. “Bluffing falls out of the mathematics of the game”, says Bowling. If you’re dealt a jack, say, it is possible to figure out how often you should ideally bluff with it. Some of the early pioneers of game theory, such as John von Neumann, aimed to develop mathematical strategies for bluffing.
The real challenge for a poker algorithm is dealing with the immense number of possible ways the game can be played. Bowling and colleagues have looked at one of the most popular forms, called Texas Hold’em, in which the dealer deals two cards to each player (face down) and also a set of face-up “community cards”. Players can bet, raise or fold after each deal. With just two players, the game becomes Heads-up, and it is a “limit” game when it has fixed bet sizes and a fixed number of raises. There are 3.16×10^17 states that HULHE can reach, and 3.19×10^14 possible points where a player must make a decision.
The new algorithm involves calculating all possible decisions in advance, so that they can just be looked up as a game proceeds. This is done in a learning process: the strategy begins by making decisions randomly, and is then updated by experience as the algorithm attaches a “regret” value to each decision depending on how poorly it fared. It takes a little more than 1500 training rounds to make the program essentially invincible.
What caught my eye is how they solved the problem. One of the keys to dealing with the problem was actually just a data-compression solution.
The other crucial innovation was the handling of the vast amounts of information needing to be stored to develop and use the strategy – of the order of 262 terabytes. This volume of data demands disk storage, which is slow to access. The researchers figured out a data-compression method that reduces the volume to a more manageable 11 TB, and which adds only 5% to the computation time from the use of disk storage.
You can try playing this new poker bot online.
It's an impressive advance, but I wonder how long until they can solve multi-player no-limit Texas Hold'em. It's a much more complex game than heads-up limit hold'em.