After releasing what may well have been the most comprehensive report on the State of AI in 2019, Air Street Capital and RAAIS founder Nathan Benaich and AI angel investor and UCL IIPP Visiting Professor Ian Hogarth are back for more.
In the State of AI Report 2020 released today, Benaich and Hogarth outdid themselves. While the structure and themes of the report remain mostly intact, its size has grown by nearly 30 percent. This is a lot, especially considering their 2019 AI report was already a 136 slide long journey on all things AI.
The State of AI Report 2020 is 177 slides long, and it covers technology breakthroughs and their capabilities, supply, demand and concentration of talent working in the field, large platforms, financing and areas of application for AI-driven innovation today and tomorrow, special sections on the politics of AI, and predictions for AI.
ZDNet caught up with Benaich and Hogarth to discuss their findings.
We set out by discussing the rationale for such a substantial contribution, which Benaich and Hogarth admitted had taken up an extensive amount of their time. They feel that their combined industry, research, investment and policy backgrounds, along with their current positions, give them a unique vantage point. Producing this report is their way of connecting the dots and giving something of value back to the AI ecosystem at large.
Coincidentally, Gartner's 2020 Hype cycle for AI was also released a couple of days back. Gartner identifies what it calls 2 megatrends that dominate the AI landscape in 2020 -- democratization and industrialization. Some of Benaich and Hogarth's findings were about the massive cost of training AI models, and the limited availability of research. This seems to contradict Gartner's position, or at least imply a different definition of democratization.
Benaich noted that there are different ways to look at democratization. One of them is the degree to which AI research is open and reproducible. As the duo's findings show, it is not: only 15% of AI research papers publish their code, and that has not changed much since 2016.
Hogarth added that traditionally AI as an academic field has had an open ethos, but the ongoing industry adoption is changing that. Companies are recruiting more and more researchers (another theme the report covers), and there is a clash of cultures going on as companies want to retain their IP. Notable organizations criticized for not publishing code include OpenAI and DeepMind:
"There's only so closed you can get without a sort of major backlash. But at the same time, I think that data clearly indicates that they're certainly finding ways to be closed when it's convenient", said Hogarth.
Industrialization of AI is under way, as open source MLOps tools help bring models to production
As far as industrialization goes, Benaich and Hogarth pointed towards their findings in terms of MLOps. MLOps, short for machine learning operations, is the equivalent of DevOps for ML models: taking them from development to production, and managing their lifecycle in terms of improvements, fixes, redeployments and so on.
Some of the more popular and fastest growing GitHub projects in 2020 are related to MLOps, the duo pointed out. Hogarth also added that for startup founders, for example, it's probably easier to get started with AI today than it was a few years ago, in terms of tool availability and infrastructure maturity. But there is a difference when it comes to training models like GPT3:
"If you wanted to start a sort of AGI research company today, the bar is probably higher in terms of the compute requirements. Particularly if you believe in the scale hypothesis, the idea of taking approaches like GPT3 and continuing to scale them up. That's going to be more and more expensive and less and less accessible to new entrants without large amounts of capital.
The other thing that organizations with very large amounts of capital can do is run lots of experiments and iterate on large experiments without having to worry too much about the cost of training. So there's a degree to which you can be more experimental with these large models if you have more capital.
Obviously, that slightly biases you towards these almost brute force approaches of just applying more scale, capital and data to the problem. But I think that if you buy the scaling hypothesis, then that's a fertile area of progress that shouldn't be dismissed just because it doesn't have deep intellectual insights at the heart of it".
This is another key finding of the report: huge models, large companies and massive training costs dominate the hottest area of AI today: NLP - Natural Language Processing. Based on variables released by Google et al., research has estimated the cost of training NLP models at about $1 per 1,000 parameters.
That means that a model such as OpenAI's GPT3, which has been hailed as the latest and greatest achievement in AI, could have cost tens of millions to train. Experts suggest the likely budget was $10M. That clearly shows that not everyone can aspire to produce something like GPT3. The question is, is there another way? Benaich and Hogarth think so, and have an example to showcase.
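The rule of thumb above can be turned into a quick back-of-the-envelope calculation. The sketch below is only an illustration: the function name is hypothetical, the linear scaling is an assumption taken at face value from the "$1 per 1,000 parameters" figure, and the parameter count used for GPT3 is its publicly reported size of 175 billion.

```python
# Rule of thumb quoted in the article: roughly $1 per 1,000 parameters.
DOLLARS_PER_1000_PARAMS = 1.0


def estimated_training_cost(num_parameters: int) -> float:
    """Linear estimate of training cost in dollars (illustrative assumption)."""
    return num_parameters / 1000 * DOLLARS_PER_1000_PARAMS


# Publicly reported size of GPT3; not a figure taken from this article.
GPT3_PARAMS = 175_000_000_000

cost = estimated_training_cost(GPT3_PARAMS)
print(f"${cost:,.0f}")  # prints "$175,000,000"
```

Taken literally, the linear rule lands above the $10M expert estimate, so both figures are probably best read as rough order-of-magnitude guides rather than precise budgets.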
PolyAI is a London-based company active in voice assistants. They produced, and open sourced, a conversational AI model (technically, a pre-trained contextual re-ranker based on transformers) that outperforms Google's BERT model in conversational applications. PolyAI's model not only performs much better than Google's, but it required a fraction of the parameters to train, meaning also a fraction of the cost.
PolyAI managed to produce a machine learning language model that performs better than Google's in a specific domain, at a fraction of the complexity and cost.
The obvious question is how PolyAI did it, as this could be an inspiration for others, too. Benaich noted that the task of detecting intent and understanding what somebody on the phone is trying to accomplish by calling is solved in a much better way by treating this problem as what is called a contextual re-ranking problem:
"That is, given a kind of menu of potential options that a caller is trying to possibly accomplish based on our understanding of that domain, we can design a more appropriate model that can better learn customer intent from data than just trying to take a general purpose model -- in this case BERT.
BERT can do OK in various conversational applications, but just doesn't have kind of engineering guardrails or engineering nuances that can make it robust in a real world domain. To get models to work in production, you actually have to do more engineering than you have to do research. And almost by definition, engineering is not interesting to the majority of researchers".
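To make the idea of re-ranking a fixed menu of options more concrete, here is a minimal sketch. It is not PolyAI's model: the toy bag-of-words `embed` function stands in for a learned encoder, and the intent names, example phrasings and function names are all hypothetical. It only shows the shape of the problem: score every candidate intent against what the caller said, then sort.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'encoder'; a real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(utterance: str, candidates: dict[str, str]) -> list[tuple[str, float]]:
    """Score every candidate intent against the utterance, best match first."""
    query = embed(utterance)
    scored = [(intent, cosine(query, embed(desc))) for intent, desc in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical menu of intents for a restaurant phone line.
INTENTS = {
    "book_table": "i want to book a table for dinner",
    "opening_hours": "what time do you open and close",
    "cancel_booking": "i need to cancel my booking",
}

ranking = rerank("can i book a table for two tonight", INTENTS)
print(ranking[0][0])  # prints "book_table"
```

The domain knowledge lives in the menu itself: because the candidates are known in advance, the model only has to pick among them rather than generate or classify in an open-ended way.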
Long story short: you know your domain better than anyone else. If you can document and make use of this knowledge, and have the engineering rigor required, you can do more with less. This once more pointed to the topic of using domain knowledge in AI. This is what critics of the brute force approach, also known as the "scaling hypothesis", point to.
What the proponents of the scaling hypothesis seem to think, simplistically put, is that intelligence is an emergent phenomenon relating to scale. Therefore, by extension, if at some point models like GPT3 become large enough, complex enough, the holy grail of AI, and perhaps science and engineering at large, artificial general intelligence (AGI), can be achieved.
How to make progress in AI, and the topic of AGI, is at least as much about philosophy as it is about science and engineering. Benaich and Hogarth approach it in a holistic way, prompted by the critique of models such as GPT3. The most prominent critic of approaches such as GPT3 is Gary Marcus. Marcus has been consistent in his critique of models predating GPT3, as the "brute force" approach does not seem to change regardless of scale.
Benaich referred to Marcus' critique, summing it up. GPT3 is an amazing language model that can take a prompt and output a sequence of text that is legible and comprehensible and in many cases relevant to what the prompt was. What's more, we should add, GPT3 can even be applied to other domains, such as writing software code for example, which is a topic in and of itself.
However, there are numerous examples where GPT3 is off course, either in a way that expresses bias, or in that it just produces irrelevant results. An interesting point is how we are able to measure the performance of models like GPT3. Benaich and Hogarth note in their report that existing benchmarks for NLP, such as GLUE and SuperGLUE, are now being aced by language models.
These benchmarks are meant to compare the performance of AI language models against humans at a range of tasks spanning logic, common sense understanding, and lexical semantics. A year ago, the human baseline in GLUE was beaten by 1 point. Today, GLUE is reliably beaten, and its more challenging sibling SuperGLUE is almost beaten too.
AI language models are getting better, but does that mean we are approaching artificial general intelligence?
This can be interpreted in a number of ways. One way would be to say that AI language models are just as good as humans now. However, the kind of deficiencies that Marcus points out show this is not the case. Maybe then what this means is that we need a new benchmark. Researchers from Berkeley have published a new benchmark, which tries to capture some of these issues across various tasks.
Benaich noted that an interesting extension towards what GPT3 could do relates to the discussion around PolyAI. It's the aspect of injecting some kind of toggles into the model that allow it to have some guardrails, or at least tune what kind of outputs it can create from a given input. There are different ways that you might be able to do this, he went on to add.
Previously, the use of knowledge bases and knowledge graphs was discussed. Benaich also mentioned some kind of learned intent variable that could be used to inject this kind of control over this more general purpose sequence generator. Benaich thinks the critical view is certainly valid to some degree, and points to what models like GPT3 could use, with the goal of making them useful in production environments.
Hogarth on his part noted that Marcus is "almost a professional critic of organizations like DeepMind and OpenAI". While it's very healthy to have those critical perspectives when there is a reckless hype cycle around some of this work, he went on to add, OpenAI has one of the more thoughtful approaches to policy around this.
Hogarth emphasized the underlying difference in philosophy between proponents and critics of the scaling hypothesis. However, he went on to add, if the critics are wrong, then we might have a very smart but not very well-adjusted AGI on our hands, as evidenced by some of these early instances of bias as you scale these models:
"So I think it's incumbent on organizations like OpenAI if they are going to pursue this approach to tell us all how they're going to do it safely, because it's not obvious yet from their research agenda. How do you marry AI safety with this kind of throw more data and compute to the problem and AGI will emerge approach?"
This discussion touched on another part of the State of AI Report 2020. Some researchers, Benaich and Hogarth noted, feel that progress in mature areas of machine learning is stagnant. Others call for advancing causal reasoning, and claim that adding this element to machine learning approaches could overcome barriers.
Adding causality to machine learning could be the next breakthrough. The work of pioneers like Judea Pearl shows the way
Causality, Hogarth said, is arguably at the heart of much of human progress. From an epistemological perspective, causal reasoning has given us the scientific method, and it's at the heart of all of our best world models. So the work that people like Judea Pearl have pioneered to bring causality to machine learning is exciting. It feels like the biggest potential disruption to the general trend of larger and larger correlation driven models:
"I think if you can crack causality, you can start to build a pretty powerful scaffolding of knowledge upon knowledge and have machines start to really contribute to our own knowledge bases and scientific processes. So I think it's very exciting. There's a reason that some of the smartest people in machine learning are spending weekends and evenings working on it.
But I think it's still in its infancy as an area of attention for the commercial community. We really only found one or two examples of it being used in the wild, one by Faculty, a London-based machine learning company, and one by BenevolentAI in our report this year".
If you thought that's enough cutting edge AI research and applications for one report, you'd be wrong. The State of AI Report 2020 is a trove of references, and we'll revisit it soon, with more insights from Benaich and Hogarth.