This week, Google subsidiary DeepMind released a paper describing how its AI, AlphaZero, thrashed the strong chess program Stockfish 8 in a 100-game match by 64-36 (28 wins, 72 draws, no losses). AlphaGo, a predecessor of AlphaZero, beat two human world champions at Go in 2016. “Zero” used a new learning method to beat AlphaGo in 2017.
Go is far more complex than chess. It is a territorial game in which stones are placed on the intersections of a 19x19 board. Chess has 20 possible opening moves; Go has 361. Games often last 150 moves. Brute-force number crunching is impractical; Go has to be played by rule-of-thumb (heuristic) strategies.
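To put rough numbers on that gap, here is a back-of-the-envelope sketch. It assumes only the opening counts quoted above, plus two facts not stated in the column: Black also has 20 legal replies in chess whatever White plays, and in Go the second stone can go on any of the 360 remaining points. The variable names are purely illustrative.

```python
# Count move sequences after one move by each side, from the empty starting position.
chess_first_moves = 20            # 16 pawn moves + 4 knight moves
go_first_moves = 19 * 19          # one stone on any intersection

# Assumption: Black's 20 replies in chess do not depend on White's first move.
chess_two_plies = chess_first_moves * 20
# Assumption: after one stone is placed, every other point is still playable.
go_two_plies = go_first_moves * (go_first_moves - 1)

print(chess_two_plies)                 # sequences after one move each in chess
print(go_two_plies)                    # sequences after one stone each in Go
print(go_two_plies / chess_two_plies)  # how many times larger the Go count is
```

The gap only widens with every further move, which is why counting everything out is hopeless in Go.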
AlphaGo learnt to play on an array of 48 special processors. It analysed millions of human games, and humans taught it known heuristics and patterns. AlphaZero was taught only the rules and told to play against itself on an array of just four processors. In a few days, it derived all known Go heuristics and found new ones.
Those methods adapted easily to chess. “Zero” was taught the rules. It taught itself heuristics, opening systems and combinative patterns over 24 hours. Stockfish is a much better number-cruncher: the Stockfish that played Zero evaluated 70 million positions per second, while Zero crunched “only” 80,000 per second. Sceptics will note that Stockfish’s opening book was removed and that an unusual time control of one minute per move was used.
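For a sense of scale, the quoted speeds can be turned into positions examined per move. This is only arithmetic on the figures above, assuming each engine used the full minute on every move, which is a simplification; the names are illustrative.

```python
# Figures quoted in the column.
stockfish_positions_per_second = 70_000_000
zero_positions_per_second = 80_000
seconds_per_move = 60  # assumption: the whole minute is spent searching

stockfish_per_move = stockfish_positions_per_second * seconds_per_move
zero_per_move = zero_positions_per_second * seconds_per_move
speed_ratio = stockfish_positions_per_second / zero_positions_per_second

print(stockfish_per_move)  # positions Stockfish could examine per move
print(zero_per_move)       # positions Zero could examine per move
print(speed_ratio)         # how many times faster Stockfish crunches
```

On these figures Stockfish looks at 875 positions for every one that Zero considers, and still lost the match.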
Chess programs use number-crunching combined with search-evaluation (“alpha-beta”) to select moves. They search through the possible moves and replies, evaluate the resulting positions, and pick the moves with the best evaluations. Given a threat of checkmate that can be parried by only one defence, an alpha-beta program still examines all moves. Zero is more “human”: it looks at far fewer positions, but it has transcended humanity in its understanding.
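For readers curious about what “alpha-beta” means in practice, here is a minimal sketch on a toy tree. It is not Stockfish’s code: the tree, its scores and the function name are made up for illustration, and a real engine adds move generation, move ordering and a far more elaborate evaluation. The search still visits every move at the top of the tree, but it stops examining the replies to a move as soon as one reply shows that move to be worse than an option already found.

```python
import math

def alphabeta(node, alpha, beta, maximizing, stats):
    """Return the minimax value of a toy tree, skipping branches that cannot matter."""
    # A leaf is a plain number: a static evaluation from the first player's point of view.
    if not isinstance(node, list):
        stats["leaves"] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, stats))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent already has a better option elsewhere
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, stats))
        beta = min(beta, best)
        if alpha >= beta:      # this move is already refuted; skip the other replies
            break
    return best

# Hypothetical position: three candidate moves, each with three possible replies.
tree = [[3, 5, 10], [2, 12, 7], [1, 8, 4]]
stats = {"leaves": 0}
value = alphabeta(tree, -math.inf, math.inf, True, stats)
print(value, stats["leaves"])  # best achievable score, and leaves actually evaluated
```

In this toy case the search settles on the first move while evaluating only five of the nine final positions, because a single bad reply is enough to dismiss each of the other two moves.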
The Go breakthrough resulted in two interesting developments. Other Go programs (notably Crazy Stone) using similar methods became stronger. Humans also learnt better heuristics. The same will surely happen in chess.
The diagram, WHITE TO PLAY (White: AlphaZero vs Black: Stockfish, Game 9 of 10 released), is an example of “Deep Understanding”. White played 30. Bxg6!! Bxg5 31. Qxg5 fxg6 32. f5 Rg8 [32...gxf5 33. Qg7+ and 32...exf5 33. Qf6 Qf8 34. Qxb6 are killers]. My Stockfish thinks this is equal at a depth of 16 moves. It’s not. The bishop is a crippled target. White controls the c-file, there are mate threats, and the rook hangs in some lines.