AlphaZero, the game-playing AI created by Google sibling DeepMind, has beaten the world’s best chess-playing computer program, having taught itself how to play in under four hours.
The repurposed AI, which has repeatedly beaten the world’s best Go players as AlphaGo, has been generalised so that it can now learn other games. Given only the rules of chess, it took just four hours to teach itself to play before beating the world-champion chess program, Stockfish 8, in a 100-game match.
AlphaZero won or drew all 100 games, according to a non-peer-reviewed research paper published on Cornell University Library’s arXiv.
“Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi [a similar Japanese board game] as well as Go, and convincingly defeated a world-champion program in each case,” said the paper’s authors, who include DeepMind founder Demis Hassabis, a former child chess prodigy who reached master standard at the age of 13.
“It’s a remarkable achievement, even if we should have expected it after AlphaGo,” former world chess champion Garry Kasparov told Chess.com. “We have always assumed that chess required too much empirical knowledge for a machine to play so well from scratch, with no human knowledge added at all.”
Computer programs have been able to beat the best human chess players ever since IBM’s Deep Blue supercomputer defeated Kasparov on 11 May 1997.
DeepMind said the difference between AlphaZero and its competitors was that its machine-learning approach is given no human input apart from the basic rules of chess. It works out the rest by playing against itself over and over, reinforcing what it learns from each game. The result, according to DeepMind, is that AlphaZero took an “arguably more human-like approach” to the search for moves, processing around 80,000 positions per second in chess compared with Stockfish 8’s 70 million.
AlphaZero won 25 games of chess against Stockfish 8 playing as white, which has the first-move advantage, won a further three playing as black and drew the remaining 72. It also learned shogi in two hours before beating the leading program Elmo in a 100-game match, winning 90 games, losing eight and drawing two.
The new generalised AlphaZero was also able to beat its “superhuman” predecessor, AlphaGo, at the Chinese game of Go after only eight hours of self-training, winning 60 games and losing 40.
While experts said the results were impressive and had potential across a wide range of applications to complement human knowledge, Professor Joanna Bryson, a computer scientist and AI researcher at the University of Bath, warned that it was “still a discrete task”.