For people who are interested in artificial intelligence, the past decade has felt like another Renaissance: the boundaries between humans and machines have been repeatedly redefined by new technologies, from machines that can beat world champions in games to AI assistants that can talk like real humans.
This article shares some stories behind this incredible AI Renaissance. The stories draw on my first-hand observations in the field as well as Cade Metz’s recent book Genius Makers, which I highly recommend.
A Sputnik Moment
A key reason for the current revolution is the revival of deep learning, a technology that uses computers to simulate the human brain as a complex network architecture.
The idea is not new. Scientists have been searching for the truth of human intelligence for a long time, and a natural starting point is our brain, the only intelligent machinery built by Mother Nature. Artificial neural networks, the predecessors of deep learning, were very popular between the 1950s and 1980s but fell out of favor because there was not enough data and the computers of the time were too weak to solve any interesting problems.
It would take another two decades for the idea to revive. In the late 2000s, a group of young scientists started to connect the power of the booming Internet with artificial intelligence research. In 2009, Dr. Fei-fei Li [1], then an assistant professor at Princeton, compiled a vast database of Internet images known as the ImageNet dataset. ImageNet soon became the benchmark for computer vision, a subfield of artificial intelligence. In 2012, Geoffrey Hinton and his team improved the benchmark results by more than ten percentage points, a jaw-dropping achievement that was an order of magnitude larger than any previous improvement.
This was a “Sputnik moment” for the artificial intelligence research community. At the time, the mainstream approach was for scientists to work out a solution themselves and then program it into software. Hinton’s success in the ImageNet challenge showed that an alternative approach, letting neural networks learn a solution without prescription from humans, would work better.
The Godfather Heads to Industry
Huge amounts of data and computing resources were essential to this success. No one knew this better than Geoffrey Hinton himself, who is also known as the Godfather of deep learning. Hinton was one of the early researchers who popularized back-propagation, a fundamental algorithm used to train neural networks. When the field entered a winter between the 1990s and early 2000s, most researchers switched to other directions because funding was scarce. Hinton, however, remained a stubborn proponent of the idea and kept trying to revive it.
Hinton knew that his research would need resources from elsewhere, and only the big Internet companies had pockets deep enough and datasets large enough to make the idea work. In addition to his academic achievements, Hinton also had great business savvy. With two of his students, Hinton founded the company DNNResearch in 2012 and soon decided to sell it to one of the big Internet companies. The book Genius Makers gives a vivid description of how Hinton orchestrated the auction from a Lake Tahoe hotel and how tech companies from all over the world wooed him. DNNResearch was eventually acquired by Google for 44 million US dollars.
More important than the price tag is the precedent that Hinton set. To lure Hinton, Google allowed him to keep his university position alongside his role at the company, although he had to be an “intern” at Google to work around the company’s rules. Before Hinton, it was rare for eminent researchers to work for tech companies because they feared losing their tenured positions at universities. Soon after the purchase, many AI researchers followed Hinton’s example and joined technology companies, including Yann LeCun, another deep learning pioneer who later led Facebook’s AI lab, and Andrew Ng, who led the research lab at Baidu. What’s more, students from various research labs followed their advisors and flocked to the big technology companies.
Among the technology companies, Google (and its parent company Alphabet) stood out for its unparalleled role in this wave of the AI Renaissance. Its research divisions, Google Brain and DeepMind, have been the driving force behind many of the greatest breakthroughs. What’s more, the fact that it could use AI to create so many profitable applications has had a demonstration effect on all other companies.
One important person behind this is Jeff Dean, a legendary engineer who laid the foundation of Google’s infrastructure. In 2011, Andrew Ng introduced the concept of deep learning to Jeff, who was immediately intrigued. Jeff was looking for his next application, and deep learning was a perfect fit. Andrew, Jeff, and another researcher, Greg Corrado, founded the Google Brain team. As one of Google’s early engineers, Jeff has great influence on Google’s management team and also enjoys enormous popularity within its engineering and research organizations (so much so that people poke fun at him with the “Jeff Dean facts”). Jeff provided an umbrella under which the Google Brain team could operate without worrying about anything else.
At Google Brain, Andrew Ng and his colleagues helped create a system that could learn the concept of a “cat” from millions of YouTube videos, which drew a lot of media attention and publicized the field. Andrew Ng eventually left the Brain team to work on his own startup, but he recommended Geoffrey Hinton as his replacement, which triggered the DNNResearch acquisition. Under the leadership of Jeff and Hinton, Google Brain contributed significantly to the field, both by pushing the research frontier and by publishing the TensorFlow framework, which made the technology accessible to outside communities.
New World Champion
One limitation of the technique Hinton was using (a.k.a. supervised deep learning) was that it requires datasets labeled by humans. Demis Hassabis co-founded DeepMind to address this limitation and explore other applications. He wanted to build a system that does not depend on human supervision and could perform better than humans. A child prodigy in chess, Hassabis believed that games were the best starting point. Although games had been a proving ground for AI since the 1950s, no one has been more committed and successful in this direction than Hassabis.
Hassabis and his DeepMind team combined deep learning with reinforcement learning, a technology that allows computers to adapt their behavior through trial and error, much as humans learn. With this new technology, DeepMind built a system that could discover tactics never before found by humans in popular video games like Breakout, and published the results in Nature. This work drew the attention of Google’s management. In 2014, Google purchased DeepMind for more than $500M, and this time both Hinton and Jeff were on the buyer’s side. With resources from Google, DeepMind doubled down on its mission. In May 2017, DeepMind’s AlphaGo AI beat the world champion Ke Jie. Since then, it has kept beating humans in one field after another. Beyond DeepMind, Google’s other AI division released the BERT system, which significantly improved performance on natural language tasks.
There were also many breakthroughs outside Google and DeepMind. For example, OpenAI, which was co-founded by some of Hinton’s students and Silicon Valley elites like Elon Musk and YC CEO Sam Altman, tackled many other games and robotics applications through reinforcement learning and released language models that achieved amazing results. The successes of Google, DeepMind, OpenAI, and other AI research teams have brought public interest in AI to an unprecedented level.
A keen observer would find that the current AI Renaissance consists of many small cycles. Each cycle starts when a difficult yet well-defined benchmark problem is solved. Thanks to the huge public attention, the research team that solved the problem is able to obtain a large amount of resources to continue its research. The team then tackles the next, more challenging benchmark problem with a larger model. The cycles were started by academics and their students and were reinforced by big technology companies. People knew, either consciously or unconsciously, that this was the best way of attracting attention, funding, and talent.
Nevertheless, ImageNet and Go are still not real-world problems. In addition, there have been increasing concerns that this pattern of AI research has caused enormous resource consumption and has made AI models overly complex. For example, the GPT-3 language model released by OpenAI includes 175 billion parameters, and each training run is estimated to cost around 4.6 million dollars. Moreover, many AI systems that overfit man-made tasks turn out to perform poorly in real-world applications.
We should, and will, break such cycles. Building cost-effective AI and making it really work in real-world applications is crucial to keeping the movement going. In the next decade, there will be many more exciting stories ahead of us.
Disclaimer: All opinions are my own and are not endorsed by my current or previous employers.
[1] Fei-fei Li was an assistant professor at Princeton University at the time and later moved to Stanford. The original version mistakenly called Fei-fei Li a Stanford professor; thanks to Jike Chong for pointing it out.