The best artificial intelligence still has trouble visually recognizing many of Homer Simpson's favorite behaviors, such as drinking beer, eating chips, eating doughnuts, yawning, and the occasional face-plant. Those findings from DeepMind, the pioneering London-based AI lab, also suggest why DeepMind has created a huge new dataset of YouTube clips to help train AI to identify human actions in videos that go well beyond "Mmm, doughnuts" or "D'oh!"
The most popular AI used by Google, Facebook, Amazon, and other companies beyond Silicon Valley is based on deep learning algorithms that can learn to identify patterns in huge amounts of data. Over time, such algorithms can become much better at a wide variety of tasks, such as translating between English and Chinese for Google Translate or automatically recognizing the faces of friends in Facebook photos. But even the most finely tuned deep learning relies on having lots of quality data to learn from. To help improve AI's capability to recognize human actions in motion, DeepMind has unveiled its Kinetics dataset, consisting of 300,000 video clips and 400 human action classes.
"AI systems are now very good at recognizing objects in images, but still have trouble making sense of videos," says a DeepMind spokesperson. "One of the main reasons for this is that the research community has so far lacked a large, high-quality video dataset."
DeepMind enlisted online workers through Amazon's Mechanical Turk service to help correctly identify and label the actions in thousands of YouTube clips. Each of the 400 human action classes in the Kinetics dataset has at least 400 video clips, with each clip lasting around 10 seconds and taken from a separate YouTube video. More details can be found in a DeepMind paper on the arXiv preprint server.
The new Kinetics dataset seems likely to represent a new benchmark for training datasets intended to improve AI computer vision for video. It has far more video clips and action classes than the HMDB-51 and UCF-101 datasets that previously formed the benchmarks for the research community. DeepMind also made a point of ensuring it had a diverse dataset, one that did not include multiple clips from the same YouTube video.
Tech giants such as Google (a sister company to DeepMind under the umbrella Alphabet group) arguably have the best access to large amounts of video data that could prove helpful in training AI. Alphabet's ownership of YouTube, the incredibly popular online video-streaming service, does not hurt either. But other companies and independent research groups must rely on publicly available datasets to train their deep learning algorithms.
Early training and testing with the Kinetics dataset showed some intriguing results. For example, deep learning algorithms showed accuracies of 80 percent or greater in classifying actions such as "playing tennis," "crawling baby," "presenting weather forecast," "cutting watermelon," and "bowling." But the classification accuracy dropped to around 20 percent or less for the Homer Simpson actions, including slapping and headbutting, and an assortment of other actions such as "making a cake," "tossing coin," and "fixing hair."
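The per-class accuracies described above can be illustrated with a small sketch. This is not DeepMind's evaluation code: the function name `per_class_accuracy` and the toy labels are assumptions made up for illustration, and the numbers it produces are not Kinetics results. It only shows how a single overall score can hide the gap between easy and hard action classes.

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class_name: fraction of that class's clips predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {name: correct[name] / total[name] for name in total}


# Made-up toy labels for illustration only (not real Kinetics data).
truth = ["bowling", "bowling", "slapping", "slapping", "slapping"]
preds = ["bowling", "bowling", "headbutting", "slapping", "bowling"]

scores = per_class_accuracy(truth, preds)

# Sort class names from lowest to highest accuracy to surface the hard classes.
hardest_first = sorted(scores, key=scores.get)
```

Sorting classes by accuracy in this way is one simple route to a list like the one in the article, where some actions sit above 80 percent and others near 20 percent.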
AI faces special challenges with classifying actions such as eating because it may not be able to accurately identify the specific food being consumed, especially if the hot dog or burger is already partially consumed or appears very small within the overall video. Dancing classes and actions focused on a specific part of the body can also prove tricky. Some actions also occur fairly quickly and are only visible for a small number of frames within a video clip, according to a DeepMind spokesperson.
DeepMind also wanted to see if the new Kinetics dataset has enough gender balance to allow for accurate AI training. Past cases have shown how imbalanced training datasets can lead to deep learning algorithms performing worse at recognizing the faces of certain ethnic groups. Researchers have also shown how such algorithms can pick up gender and racial biases from language.
A preliminary study showed that the new Kinetics dataset seems to be fairly balanced. DeepMind researchers found that no single gender dominated within 340 out of the 400 action classes, or else it was not possible to determine gender in those actions. Those action classes that did end up gender imbalanced included YouTube clips of actions such as "shaving beard" or "dunking basketball" (mostly male) and "filling eyebrows" or "cheerleading" (mostly female).
But even action classes that had gender imbalance did not show much evidence of "classifier bias." This means that even the Kinetics action classes featuring mostly male participants, such as "playing poker" or "hammer throw," did not seem to bias AI to the point where the deep learning algorithms had trouble recognizing female participants performing the same actions.
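One way to picture such a check is to compare accuracy across groups within a single action class. The sketch below is an assumption about how that comparison could be done, not the method DeepMind used: the helpers `accuracy_by_group` and `accuracy_gap`, the record layout, and the toy data are all hypothetical, and it presumes a group annotation is available for each clip.

```python
def accuracy_by_group(records, action):
    """Accuracy per group for one action class.

    records: iterable of (true_action, predicted_action, group) tuples.
    Only clips whose true label equals `action` are counted.
    """
    hits, counts = {}, {}
    for truth, guess, group in records:
        if truth != action:
            continue
        counts[group] = counts.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + int(truth == guess)
    return {group: hits[group] / counts[group] for group in counts}


def accuracy_gap(records, action):
    """Largest difference in accuracy between any two groups (0.0 if no clips)."""
    by_group = accuracy_by_group(records, action)
    if not by_group:
        return 0.0
    return max(by_group.values()) - min(by_group.values())


# Made-up toy records for illustration only (not real Kinetics data).
records = [
    ("playing poker", "playing poker", "male"),
    ("playing poker", "playing poker", "male"),
    ("playing poker", "playing poker", "male"),
    ("playing poker", "bowling", "male"),
    ("playing poker", "playing poker", "female"),
    ("playing poker", "bowling", "female"),
    ("bowling", "bowling", "female"),
]

by_group = accuracy_by_group(records, "playing poker")
gap = accuracy_gap(records, "playing poker")
```

A small gap for an imbalanced class would be consistent with the finding reported here; a large gap would suggest that the classifier leans on the majority group.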
DeepMind hopes that outside researchers can help suggest new human action classes for the Kinetics dataset. Any improvements may enable AI trained on Kinetics to better recognize both the most elegant of actions and the clumsier moments in videos that lead people to say "d'oh!" In turn, that could lead to new generations of computer software and robots with the capacity to recognize what all those crazy humans are doing on YouTube or in other video clips.
"Video understanding represents a significant challenge for the research community, and we are in the very early stages with this," according to the DeepMind spokesperson. "Any real-world applications are still a really long way off, but you can see potential in areas such as medicine, for example, aiding the diagnosis of heart problems in echocardiograms."
View post:
DeepMind Shows AI Has Trouble Seeing Homer Simpson's Actions - IEEE Spectrum