Your high score is about to be trounced. Google has developed artificial intelligence software capable of learning to play video games just by watching them.
Google DeepMind, a London-based subsidiary, has trained an AI gamer to play 49 different video games for the Atari 2600, beating a professional human player’s top score in 23 of them. The software isn’t told the rules of each game – instead it uses an algorithm called a deep neural network to examine the state of the game and work out which actions produce the highest total score.
“It really is the first algorithm that can match human performance across a wide range of challenging tasks,” says DeepMind co-founder Demis Hassabis.
Deep neural networks are often used for image recognition problems, but DeepMind combined theirs with another technique called reinforcement learning, which rewards the system for taking certain actions, just as a human player is rewarded with a higher score when playing a video game correctly.
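To make the idea of learning from rewards concrete, here is a minimal sketch of tabular Q-learning, a classic reinforcement learning method, on a toy problem. It is not DeepMind's system, which pairs this style of learning with a deep neural network reading screen pixels. The five-cell corridor, the function names and the parameter values are all assumptions made up for illustration.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Toy environment: move along the corridor; reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values purely from the rewards received."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # occasionally explore at random
            else:
                best = max(q[state])  # otherwise act greedily, ties broken at random
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate towards
            # reward + discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def greedy_policy(q):
    """Best learned action for each non-goal cell."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]

q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

The agent is never told that the goal lies to the right. It discovers that only because moving right eventually leads to a reward, in the same way that the game-playing software is guided only by the score.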
The software performed best on simple pinball and boxing games, but also scored highly on the arcade classic Breakout, which involves bouncing a ball to clear rows of blocks. It even managed to learn to tunnel through one column of bricks and bounce the ball off the back wall, a trick that seasoned Breakout players use.
“That was a big surprise for us,” says Hassabis. “The strategy completely emerged from the underlying system.”
Neural networks have been playing games like backgammon for decades, says Jürgen Schmidhuber of the Dalle Molle Institute for Artificial Intelligence Research in Manno, Switzerland. The difference here is that advances in computing power mean AI systems can now learn from much larger data sets. Watching an Atari game is the equivalent of processing about 2 million pixels of data a second.
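The figure of about 2 million pixels a second can be checked with some quick arithmetic. The sketch below assumes an Atari 2600 frame of 160 by 210 pixels displayed 60 times a second. Those numbers are not given in this article and are used here only as an illustration.

```python
# Assumed display parameters (not stated in the article).
FRAME_WIDTH = 160        # pixels
FRAME_HEIGHT = 210       # pixels
FRAMES_PER_SECOND = 60

pixels_per_frame = FRAME_WIDTH * FRAME_HEIGHT
pixels_per_second = pixels_per_frame * FRAMES_PER_SECOND

print(pixels_per_frame)   # 33600
print(pixels_per_second)  # 2016000
```

Under those assumptions the total comes to just over 2 million pixels a second, in line with the estimate above.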
That suggests that Google is interested in using its AI to analyse its own large data sets. “We can’t say anything publicly about this but the system is useful for any sequential decision making task,” says Hassabis. “You can imagine there are all sorts of things that fit into that description.”
Google’s core business of serving up ads is easily translatable into this framework, says Schmidhuber. The pixels of the game are analogous to the vast amounts of data Google has on each user, and the score becomes their ad revenue. “You can use reinforcement learning methods to improve ad quality,” he says. “You learn to place ads that are more likely to be clicked on, which means higher rewards. This is presumably one of their motivations.”
Michael Cook of Goldsmiths, University of London, says that Google is already using DeepMind technology in seven of its products, according to a recent talk by one of the team. “It’s anyone’s guess what they are, but DeepMind’s focus on learning through actually watching the screen, rather than being fed data from the game code, suggests to me that they’re interested in image and video data,” he says.
That could be useful for Google’s autonomous cars, or maybe even for longer-term projects such as teaching AIs to understand concepts like red, rather than hard facts. “There’s a part of me that hopes that Google views DeepMind as an opportunity to simply research something for the fun of it, monetisation be damned,” says Cook.
Journal reference: Nature, DOI: 10.1038/nature14236