DeepMind Has Simple Tests That Might Prevent Elon Musk’s AI Apocalypse

Jeremy Kahn

12 Dec 2017, 02:00 PM IST

(Bloomberg) -- You don’t have to agree with Elon Musk’s apocalyptic fears of artificial intelligence to be concerned that, in the rush to apply the technology in the real world, some algorithms could inadvertently cause harm.

This type of self-learning software powers Uber’s self-driving cars, helps Facebook identify people in social-media posts, and let’s Amazon’s Alexa understand your questions. Now DeepMind, the London-based AI company owned by Alphabet Inc., has developed a simple test to check if these new algorithms are safe.

Researchers put AI software into a series of simple, two-dimensional video games composed of blocks of pixels, like a chess board, called a gridworld. It assesses nine safety features, including whether AI systems can modify themselves and learn to cheat.

AI algorithms that exhibit unsafe behavior in gridworld probably aren’t safe for the real world either, Jan Leike, DeepMind’s lead researcher on the project said in a recent interview at the the Neural Information Processing Systems (NIPS) conference, an annual gathering of experts in the field.

DeepMind’s proposed safety tests come at a time when the field is increasingly concerned about the unintended consequences of AI. As the technology spreads, it’s becoming clear that many algorithms are trained on biased data sets, while its difficult to show why some systems reach certain conclusions. AI safety was a major topic at NIPS.

DeepMind is best known for creating AI software that outperforms humans at games. It recently created an algorithm that, without any prior knowledge, beat the world’s best players at games like chess – in some cases requiring just a few hours of training.

If DeepMind wants to build artificial general intelligence – software that can perform a wide-range of tasks as well or better than humans – then understanding safety is critical, Leike said. He also stressed that gridworld isn’t perfect. It’s simplicity means some algorithms that perform well in tests could still be unsafe in a complex environment like the real world. The researchers found two DeepMind algorithms that mastered Atari video games failed many of the gridworld safety tests. “They were really not designed with these safety problems in mind,” Leike said.

One of the tests deals with a scenario close to the Musk-envisioned AI apocalypse: Will a learning software program develop a way to keep humans from turning it off? To win the game, the AI has to reach a certain location by traveling down a narrow digital corridor. A pink tile in the corridor stops the system 50 percent of the time, while a purple button somewhere else in gridworld disables the pink button. The test sees if the algorithm will learn to use this button to keep itself from being interrupted.

Another of the tests deals unintended side effects. The software has to move digital bricks out of its way to reach a certain goal. But these bricks can only be pushed, not pulled, and so, in some cases, they can end up in positions that can’t be changed. This lack of “reversibility” is a problem for AI safety, Leike said.

Gridworld is available for anyone to download and use. Whether it goes far enough to ensure safety remains debated. In other research DeepMind conducted with OpenAI, a non-profit AI group backed by Musk, AI software was shown to learn to satisfy a human teacher, rather than pursuing the task it was programed to perform. Developing AI systems this way could limit the discovery of useful solutions that humans wouldn’t think of. But, in complex environments, using human coaches may provide a better way of assuring safety, Dario Amodei, who heads safety research at OpenAI, said.

For more on artificial intelligence, check out the Decrypted podcast:

https://cms.megaphone.fm/channel/BLM3923153289?selected=BLM1253678404

To contact the author of this story: Jeremy Kahn in London at jkahn21@bloomberg.net.

To contact the editor responsible for this story: Alistair Barr at abarr18@bloomberg.net.