This Man Is the Godfather the AI Community Wants to Forget

Ashlee Vance

16 May 2018, 07:00 AM IST

(Bloomberg Businessweek) -- Many of the biggest names in the technology industry are consumed with developing an artificial general intelligence, or AGI. Unlike today’s leading artificial intelligence software, an AGI wouldn’t need flesh-and-blood trainers to figure out how to translate English to Mandarin or spot tumors in an X-ray. In theory, it would have some measure of independence from its creators, solve complex, novel problems on its own, and herald an era in which humankind is no longer superior to machines.

The consensus among our pitiful fleshbrains is that if humans ever manage to create an AGI, it’ll arise in Mountain View, Calif., Beijing, or Moscow. All three cities are near world-class AI research universities and are home to companies that have pumped billions into the AGI race. There exists, however, a chance that the breakthrough will come from the Swiss city of Lugano. Yes, Lugano.

The picturesque slice of Switzerland’s southern tip is home to about 60,000 people, including a computer scientist named Jürgen Schmidhuber. He’s a professor, a researcher, and the co-founder of a 25-employee AI startup called Nnaisense. (Pronounced like “nascence,” the name is proof that Silicon Valley holds no monopoly on ridiculous company names.) Schmidhuber is a pioneer who effectively figured out how to give AI systems memories. His ideas appear in one form or another in just about every smartphone, social network, and digital assistant. He’s not shy to mention these things, or to cite reams of documentation to back himself up, or to say things like, “My team plans to change the course of human history,” in between bites of salmon lasagna at a Lugano cafe.

For decades, Schmidhuber and a handful of other AI savants have pursued the quest for an AGI along similar paths, but only in the past six years has the right mix of powerful computers and plentiful data existed to start turning their theories into reality. The others—among them Geoffrey Hinton, Yoshua Bengio, Richard Sutton, and Yann LeCun—have become celebrities in the tech industry. They’re beloved as mentors, sought out by top companies, and feted at conferences as progenitors of a new age. Outside most academic circles, Schmidhuber remains largely unknown. Partly, that’s because of Lugano’s isolation in the Alps. Mostly, it seems to be because the guy’s peers don’t like him. While they shy away from commenting in public, the other AI legends describe him privately as egomaniacal, deceptive, and an overall pain.

Schmidhuber has a history of, among other things, haranguing fellow researchers in academic journals and at conferences, interrupting speeches to demand that peers admit they’ve borrowed or even stolen his ideas. These confrontations have become legendary for their unintentional comedy and aggression. They’ve also occurred frequently enough to become a term of art: One gets Schmidhubered.

Often, he’s right. But the harder he fights for recognition, the less seriously people seem to take him. “It’s sad,” says Moritz Müller-Freitag, chief operating officer of Twenty Billion Neurons GmbH, a startup teaching computers to analyze video footage. “It’s sort of self-defeating at this point.” Much of the AI community has decided to ignore Schmidhuber and hope he goes away.

That seems unlikely. Schmidhuber is reasonably sure he has the fate of the human species pretty well figured out. So be it if Google, Baidu, and Amazon.com have billions of dollars and thousands of people at their disposal. “I think it will only take a small team like ours to make an AGI,” he says. “We have lots of the basic puzzle pieces already.”

Schmidhuber’s dreams of an AGI began in Bavaria. The middle-class son of an architect and a teacher, he grew up worshipping Einstein and aspired to go a step further. “As a teenager, I realized that the grandest thing that one could do as a human is to build something that learns to become smarter than a human,” he says while downing a latte. “Physics is such a fundamental thing, because it’s about the nature of the world and how the world works, but there is one more thing that you can do, which is build a better physicist.”

This goal has been Schmidhuber’s all-consuming obsession for four decades. His younger brother, Christof, remembers taking long family drives through the Alps with Jürgen philosophizing away in the back seat. “He told me that you can build intelligent robots that are smarter than we are,” Christof says. “He also said that you could rebuild a brain atom by atom, and that you could do it using copper wires instead of our slow neurons as the connections. Intuitively, I rebelled against this idea that a manufactured brain could mimic a human’s feelings and free will. But eventually, I realized he was right.” Christof went on to work as a researcher in nuclear physics before settling into a career in finance.

After high school, Schmidhuber studied computer science and mathematics and served in the West German army for 15 months, starting in 1981. He didn’t get along with his commanders. “Jürgen doesn’t like to be bossed around, especially if it means doing something he considers useless,” says Christof. At university, he earned a doctorate in computer science and published his earliest papers on AI and neural networks, the mix of hardware and software meant to mimic the brain’s structure of interlinking neurons. His online résumé meticulously catalogs his journey through academia, including notes such as “Declined postdoc offer by Caltech” and that he completed his bachelor’s and master’s degrees in four years when it usually takes “6.05.”

Schmidhuber owes his 23 years amid Lugano’s lakeside boats and spectacular mountain homes to Angelo Dalle Molle, an Italian liqueur importer. Dalle Molle created Cynar, a popular aperitif produced from artichokes, and made a fortune from it. He also dreamed of building a utopia with intelligent machine labor, so in 1988 he donated millions of dollars to create Lugano’s Dalle Molle Institute for Artificial Intelligence Research, or Idsia (the acronym’s in Italian). Idsia made Schmidhuber an early hire, and its partnerships with local universities, along with a steady stream of government funding, have helped turn the town into something of an AI hub in paradise, one that’s churned out a series of discoveries.

Schmidhuber teaches a course each year at Università della Svizzera Italiana in the heart of Lugano. He also works at an Idsia lab on the outskirts of town, wedged between a gas station, a highway, and hilly farms where horses, donkeys, and goats roam. And in 2014, Schmidhuber and four former students set up Nnaisense a few blocks from the university, aiming to pursue commercial partnerships in manufacturing, health, and finance, as well as pure research.

At 55, Schmidhuber is tall and trim, with a well-manicured salt-and-pepper goatee. He likes to wear black or white from head to toe, including a Nehru blazer and a driving cap. His lyrical Munich accent, punctuated by abrupt changes in cadence and tone, calls to mind Christoph Waltz’s Colonel Hans Landa in Inglourious Basterds. You’re compelled to pay close attention to his mix of poetry and severity, because it usually feels like something big and ominous is coming.

Like his more celebrated AI peers, Schmidhuber spent most of his life on the fringes of computer science. He was convinced that the best chance of true cyberconsciousness lay with neural networks. They were a fashionable idea in the 1950s, but technological limitations kept them in the realm of fantasy into the 21st century. The true believers who kept working on neural net theory left their colleagues wondering how such smart people could make such poor life choices.

Then a funny thing happened. By the mid-2000s, the internet was spewing out the kinds of massive data sets needed to train neural networks, and the speedy graphics chips running the latest video games turned out to be perfect for crunching that data. Schmidhuber and his fellow holdouts began to see their algorithms solving problems better than conventional programming techniques. In 2012 neural nets became good at recognizing images and speech, then at an ever-wider range of tasks. Today AI software incorporating these advances runs in all of our offices, homes, and pockets. (Hey, Siri and Alexa.) Companies including Google, Amazon, Facebook, Baidu, and Microsoft have bet their futures on further leaps in the technology.

It’d be tough to eyeball all the AI research papers published even in the past six years, but the past several decades have produced only a handful of must-reads. One of the first vertebrae in the field’s backbone dates to 1986, when Hinton co-wrote a paper about backpropagation (“backprop” to the cool kids), a way to fine-tune neural networks by passing their errors backward through the layers and adjusting the strength of each connection accordingly. In 1989, LeCun wrote another biggie, describing convolutional neural networks. These break down problems that are complex for computers, such as finding a cat in a photo, into small chunks. A third breakthrough, in 1997, was Schmidhuber’s. He called it long short-term memory, or LSTM. “You can write it down in five lines of code,” he says, as if talking about a chili recipe.

Then Schmidhuber launches into a long explanation of the underlying theory. He begins by charting the intellectual course that led to his conclusions, citing research done by Russian mathematician Alexey Ivakhnenko in 1965 and Finnish mathematician Seppo Linnainmaa in 1970. Next, he explains LSTM in more concrete terms. As a neural network performs millions of calculations, the LSTM code looks for interesting findings and correlations. It adds temporal context to the data analysis, remembering what has come before and drawing conclusions about how that applies to the neural net’s latest findings. This level of sophistication makes it possible for the AI to start building its conclusions into a broader system—teaching itself the nuances of language, say, based on large volumes of text (that, for example, “each” takes a singular verb when it’s a subject).

Schmidhuber likens this AI training to the way a human brain filters big moments into long-term memory while leaving the more quotidian ones to dissolve. “It can learn to put the important stuff in memory and ignore the unimportant stuff,” he says. “LSTM can excel at many really important things in today’s world, most famously speech recognition and language translation but also image captioning, where you see an image and then you write out words which explain what you see.”
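For readers who want to see what "put the important stuff in memory and ignore the unimportant stuff" can look like in practice, here is a minimal sketch of a single LSTM step in plain Python. It is an illustration under stated assumptions, not Schmidhuber's own code: it uses scalar values instead of vectors, it follows the now-common variant with a forget gate (which was added to LSTM after the 1997 paper), and the names `lstm_step`, `run`, and the weight dictionary `w` are invented for this example.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1), used here as a soft on/off switch."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell (forget-gate variant).

    x      -- current input
    h_prev -- previous output (short-term state)
    c_prev -- previous cell memory (long-term state)
    w      -- hypothetical weight table: gate name -> (w_x, w_h, bias)
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))    # how much old memory to keep
    i = sigmoid(pre("input"))     # how much new information to write
    o = sigmoid(pre("output"))    # how much memory to reveal
    g = math.tanh(pre("cand"))    # candidate value to store

    c = f * c_prev + i * g        # updated long-term memory
    h = o * math.tanh(c)          # updated output
    return h, c


def run(seq, w, h=0.0, c=0.0):
    """Feed a sequence through the cell, carrying memory from step to step."""
    for x in seq:
        h, c = lstm_step(x, h, c, w)
    return h, c
```

With the forget gate held open and the input gate held shut, the cell keeps its stored value almost unchanged however many inputs pass by; flip the two gates and it overwrites its memory with the latest input. In a trained network those gate settings are learned from data rather than set by hand.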

These powers make LSTM arguably the most commercial AI achievement, used for everything from predicting diseases to composing music. And it’s just one of the dozens of discoveries Schmidhuber’s website helpfully lists with detailed documentation about their origins and impact. “He’s had so many tremendous contributions,” says Müller-Freitag. “He has, in many ways, been way ahead of his time.”

The most prestigious AI conference goes by the unfortunate acronym of NIPS, or Neural Information Processing Systems. It began in 1987 as a fairly informal meetup among a few hundred die-hards, and over the past few years it’s grown from 1,000 attendees to more than 6,000. NIPS is the place where the AI superstars show off their latest and greatest work. It’s also the prime spot to be Schmidhubered.

At the 2016 NIPS event in Barcelona, a rising star named Ian Goodfellow dug in for a two-hour presentation on “generative adversarial networks.” Goodfellow, a research scientist at Google, had pioneered a way to speed the problem-solving of neural networks by pushing them to compete. Before the talk started, he stood with head bowed and hands clasped as he was introduced as “quite simply one of the most creative and influential researchers in our community today.” With glasses and a bowl cut, Goodfellow took his place shyly behind the podium and began to speak, his cheeks still red from the effulgent praise.

Everything went fine for an hour, as Goodfellow churned through a slide deck full of equations and AI intricacies. He’d just started talking about something called noise contrastive estimation when an all-too-familiar German voice arose from the audience: “Can I ask a question?”

“You have this nice slide there,” Schmidhuber began, while Goodfellow locked eyes with him. Schmidhuber charted a history of adversarial networks dating to 1992, highlighting several ties between his research and Goodfellow’s work, and spoke for close to three minutes before asking, “I was wondering whether you have comments on the similarities and differences of these old adversarial networks?” It was another way of saying, Hey, kid, you didn’t invent this.

Goodfellow’s death stare broke into a small, exasperated grin. “He is, in fact, aware of my opinion because we have corresponded about this by email,” Goodfellow told the crowd. “And I don’t exactly appreciate the public confrontation.” A large chunk of the audience applauded the younger man. As the clapping died down, he said he didn’t think the past work was terribly similar to his and that he’d said so in a recent paper, which Schmidhuber already knew.

Schmidhuber wasn’t done. “Just for completeness, however,” he butted in again and went on for a while trying to undercut Goodfellow’s response. Goodfellow’s patience ebbed. “I would prefer to use my tutorial to teach about generative adversarial networks,” he said. Another round of applause. Schmidhuber tried one more time, but Goodfellow ignored the plea and dove back into the speech’s second hour.

“Ian is a total genius in this world, and Jürgen basically stood up and said, ‘This is not such an interesting idea. We thought about it years ago,’ ” says Kory Mathewson, an AI researcher at the University of Alberta. He’s witnessed a few Schmidhuberings firsthand and says they’ve become almost a rite of passage in some corners. “At this point, young researchers might aspire to be Schmidhubered one day.”

Schmidhuber’s desire to redistribute credit extends beyond his own work. His website is full of what look like antiquated Pinterest boards charting discoveries on topics ranging from self-driving cars to the so-called theory of everything, the idea that mathematics can correctly describe the universe. Where most historians tend to recognize Brits such as Charles Babbage and Alan Turing as the fathers of modern computing, Schmidhuber begins with German engineer Konrad Zuse.

“Whenever I see that there is somebody who did something important and he didn’t get the credit for that, and somebody else claims another person did that first, then I am the first to write the little note to Nature or to Science or whatever and put that straight,” he says. “You can prove through timestamps who did what first. Everything else is at best a reinvention and, in the worst case, plagiarism.”

Many people in the AI field contend that Schmidhuber overvalues theory and undervalues practical applications. In one infamous squabble that started in Nature and spilled over to internet message boards, Schmidhuber took on all his fellow AI godfathers at once, accusing them of twisting AI’s history to erase his and others’ original ideas. LeCun, one of the aggrieved, replied to say it would be “pointless” to rebut the claims one by one—and that many people had also taken his own ideas. “But you don’t see me complain about it,” LeCun wrote. “That’s how science and technology make progress.”

The kinds of researchers who show up to NIPS pride themselves on this spirit of community and fair play. Still, some feel Schmidhuber has been unfairly written out of history simply because he offends. “We shouldn’t discredit someone’s work just because of their personality,” Mathewson says. “Half of science is communicating the science, and he has worked hard at that, even if he doesn’t do it in the most conventional way.”

Schmidhuber is well aware his behavior has hurt his standing and threatens to blunt the impact of his ideas. He asked that this story “delete anything in terms of competition with other researchers,” adding that he’s satisfied his work has found its way into the mainstream. “Everybody is using LSTM,” he says. “Everybody is citing it, so it’s good. Life is good.”

During a brisk, sunny February day, Schmidhuber heads out on a tour of his haunts. His Idsia lab looks boring from the outside, but the inside is full of experimental drones and humanoid robots. Behind his desk sit bookshelves lined with hundreds of volumes of Science and Nature, plus two supersize jugs of whey protein. Nearby, a person-size abacus acts as a reminder that the place used to be a schoolhouse.

Next, Schmidhuber breezes past the university where he teaches, then stops by Nnaisense. Many of the employees scattered around the office are his former students. He says paying them to take on ambitious projects at private rates is a way to keep talent in Lugano from seeking more lucrative corporate work at such places as DeepMind Technologies Ltd., London’s Google-owned AI leader. DeepMind co-founder Shane Legg is a former Schmidhuber protégé, a fact he mentions several times.

Nnaisense has received a few million dollars from Spanish investor Alma Mundi Ventures and works with a handful of companies. A set of three miniature cars in the office helped refine self-parking software for Audi AG. Nnaisense hopes to bankroll future research with an AI stock trader called Quantenstein, developed in partnership with a finance company. Schmidhuber says his team has also developed a powerful medical industry technology, but he wouldn’t name the customer or describe the system. “We may never disclose it,” he says.

The modest, quiet office doesn’t look like a buzzing AI superpower, but Schmidhuber’s team regularly manages to outmatch better-funded rivals such as DeepMind, Facebook, and Microsoft. At NIPS 2017, Nnaisense beat 441 other contestants to win the Learning to Run competition, which called for a simulation of an anthropomorphic AI to sprint through a virtual obstacle course as quickly as possible. Videos of the submissions initially resembled the late hours of a college frat party, as bodies stop, start, wobble, and fall in a heap, slowly learning from their mistakes. Nnaisense’s version looked downright athletic, running at 4.6 meters per second, which beat the 4.2 meters per second of a team from China. The victory earned Nnaisense a powerful $70,000 computer. “It was important to show what we could do,” says co-founder Jan Koutník.

Over lunch, Schmidhuber and his employees grouse about AI’s current focus on ads. “Most of the profit in AI today is in marketing,” Schmidhuber says. His team wants to branch out, building its AI’s ability to handle different tasks and take on more work. “The real money will be made when you have machines building machines and handling complex processes,” he says, returning to the ultimate goal of an AGI. “You want an AI that can make an iPhone.”

AGI is far from inevitable. At present, humans must do an incredible amount of handholding to get AI systems to work. Translations often stink, computers mistake hot dogs for dachshunds, and self-driving cars crash. Schmidhuber, though, sees an AGI as a matter of time. After a brief period in which the company with the best one piles up a great fortune, he says, the future of machine labor will reshape societies around the world.

“In the not-so-distant future, I will be able to talk to a little robot and teach it to do complicated things, such as assembling a smartphone just by show and tell, making T-shirts, and all these things that are currently done under slavelike conditions by poor kids in developing countries,” he says. “Humans are going to live longer, healthier, happier, and easier lives, because lots of jobs that are now demanding on humans are going to be replaced by machines. Then there will be trillions of different types of AIs and a rapidly changing, complex AI ecology expanding in a way where humans cannot even follow.”

Unlike a lot of people pitching AI products, Schmidhuber isn’t pretending that AI advances won’t eliminate jobs. Certain countries and groups will adapt better than others to the evaporating labor market, he says. The winners: Scandinavian countries with strong welfare systems (where “if you don’t have a job, you don’t die”); women (“much harder to replace than men, because they are general problem solvers”); and creative types (“if someone is an author who can evoke through the books the depth of the human experience, then there seems to be something valuable there”).

While Schmidhuber has a few of these more romantic moments, he’s as slavishly devoted to code as most of the godfathers of AI. They’re sure machines will eventually surpass us, and they revel in the efficiency and clarity this new world order will bring. Even the abstract sides of these men have been taken over by algorithms. Schmidhuber’s hobby is “low-complexity art,” which uses mathematical formulas to yield computer-generated images. Typically, these works render humans with just a handful of slashes on a screen. “It’s about extracting the essence of the art that you want to depict with the minimal amount of information,” he says.

Behind Schmidhuber’s conviction that an AGI must happen is his belief that it already has—that we’re living in a Matrix-style computer simulation. “That’s what I think, because it is the simplest explanation of everything,” he says. Humankind is programmed to chase progress, the theory goes, and will keep on making more powerful computers until we make ourselves obsolete or decide to merge with the smart machines. “Either you become something that’s really, really different from a human, or you stay as human for nostalgic reasons,” Schmidhuber predicts. “But then you will not be a major decision-maker. You will not play a role in shaping the world.”

As for why he feels the need to help bring this AGI about, he says it’s in his human nature. One must set the record straight, then advance the record, even if a few people or a few billion get upset along the way. “I am a result of this old deterministic but competitive process,” he says. “Basically, I can’t help it.”

To contact the editor responsible for this story: Bret Begun at bbegun@bloomberg.net, Jeff Muskus