CaptionBot, an AI program that applies captions to images, mistakenly described the members of the hip hop group the Wu-Tang Clan as a group of baseball players. This type of mistake often occurs because of the way certain demographics are represented in the data used to train an AI system. Credit: Walter Scheirer/CaptionBot.
In 2006, a trio of artificial intelligence (AI) researchers published a useful resource for their community, a massive dataset consisting of images representing over 50,000 different noun categories that had been automatically downloaded from the internet. The dataset, dubbed Tiny Images, was an early example of the big data strategy in AI research, whereby an algorithm is shown as many examples as possible of what it is trying to learn in order for it to better understand a given task, like recognizing objects in a photo. By uploading small 32-by-32 pixel images, the Tiny Images researchers were relying on the ability of computers to exhibit the same remarkable tolerance of the human visual system and recognize even degraded images. They also, however, may have unintentionally succeeded in recreating another human characteristic in AI systems: racial and gender bias.
A pre-print academic paper revealed that Tiny Images used several categories for images labeled with racial and misogynistic slurs. For instance, a derogatory term for sex workers was one category; a slur for women, another. There was also a category of images labeled with a racist term for Black people. Any AI system trained on the dataset might recreate the biased categories as it sorted and identified objects. Tiny Images was such a large dataset and its contents so small, that it would have been a herculean, perhaps impossible, task to perform quality control to remove the offensive category labels and images. The researchers, Antonio Torralba, Rob Fergus, and Bill Freeman, made waves in the artificial intelligence world when they announced earlier this summer that they would be pulling the whole thing from public use.
Biases, offensive and prejudicial images, and derogatory terminology alienates an important part of our community precisely those that we are making efforts to include, Torralba, Fergus, and Freeman wrote. It also contributes to harmful biases in AI systems trained on such data.
Developers used the Tiny Images data as raw material to train AI algorithms that computers use to solve visual recognition problems and recognize people, places, and things. Smartphone photo apps, for instance, use similar algorithms to automatically identify photos of skylines or beaches in your collection. The dataset was just one of many used in AI development.
While the pre-prints shocking findings about Tiny Images was troubling in its own right, the issues the paper highlighted are also indicative of a larger problem facing all AI researchers: the many ways in which bias can creep into the development cycle.
The big data era and bias in AI training data. The internet reached an inflection point in 2008 with the introduction of Apples iPhone3G. This was the first smartphone with viable processing power for general internet use, and, importantly, it included a digital camera that could be available to the user at a moments notice. Many manufacturers followed Apples lead with similar competing devices, bringing the entire world online for the first time. A software innovation that appeared with these smartphones was the ability of an app running on a phone to easily share photos from the camera to privately owned cloud storage.
This was the launch of the era of big data for AI development, and large technology companies had an additional motive beyond creating a good user experience: By concentrating a staggering amount of user-generated content within their own platforms, companies could exploit the vast repositories of data they collected to develop other products. The prevailing wisdom in AI product development has been that if enough data is collected, any problem involving intelligence can be solved. This idea has been extended to everything from face recognition to self-driving cars, and is the dominant strategy for attempting to replicate the competencies of the human brain in a computer.
But using big data for AI development has been problematic in practice.
The datasets used for AI product development now contain millions of images, and nobody knows what exactly is in them. They are too large to examine manually in an exhaustive manner. When it comes to the use of these sets, the data can be labeled or unlabeled. If the data is labeled (as was the case with Tiny Images), those labels can be tags that were taken from the original source, new labels assigned by volunteers or people who have been paid to provide them, or automatically generated by an algorithm trained to label data.
Dataset labels can be naturally bad, reflecting the biases and outright malice of the humans who annotated them, or artificially bad, if the mistakes are made by algorithms. A dataset could even be poisoned by malicious actors intending to create problems for any algorithms that make use of it. Additionally, some datasets contain unlabeled data, but these datasets, used in conjunction with algorithms that are designed to explore a problem on their own, arent the antidote for poorly labeled data. As is also the case for labeled datasets, the information contained within unlabeled datasets can be a mismatch with the real world, for instance when certain demographics are under- or over-represented in the data.
In 2016, Microsoft released a web app called CaptionBot that automatically added captions to images. The app was meant to be a successful demonstration of the companys computer vision API for third-party developers, but even under normal use, it could make some dubious mistakes. In one notable instance, it mislabeled a photo of the hip hop group Wu-Tang Clan as a group of baseball players. This type of mistake can occur when a bias exists in the dataset for a specific demographic. For example, Black athletes are often overrepresented in datasets assembled from public content on the internet. The prominent Labeled Faces in the Wild dataset for face recognition research contains this form of bias.
More problematic examples of this phenomenon have surfaced. Joy Buolamwini, a scholar at the MIT Media Lab, and Timnit Gebru, a researcher at Google, have shown that commonly used datasets for face recognition algorithm development are overwhelmingly composed of lighter skinned faces, causing face recognition algorithms to have noticeable disparities in matching accuracy between darker and lighter skin tones. Buolamwini, working with Deborah Raji, a student in her laboratory, has gone on to demonstrate this same problem in Amazons Rekognition face analysis platform. These studies have prompted IBM to announce it would exit the facial recognition and analysis market and Amazon and Microsoft to halt sales of facial recognition products to law enforcement agencies, which use the technology to identify suspects.
The manifestation of bias often begins with the choice of an application, and further crops up in the design and implementation of an algorithm well before any training data is provided to it. For instance, while its not possible to develop an algorithm to predict criminality based on a persons face, algorithms can seemingly produce accurate results on this task. This is because the datasets they are trained on and evaluated with have obvious biases, such as mugshot photos that contain easily identifiable technical artifacts specific to how booking photos are taken. If a development team believes the impossible to be possible, then the damage is already done before any tainted data makes its way into the system.
There are a lot of questions AI developers should consider before launching a project. Gebru and Emily Denton, also a researcher at Google, astutely point out in a recent tutorial they presented at the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition that interdisciplinary collaboration is a good path forward for the AI development cycle. This includes engagement with marginalized groups in a manner that fosters social justice, and dialogue with experts in fields outside of computer science. Are there unintended consequences of the work being undertaken? Will the proposed application cause harm? Does a non-data driven perspective reveal that what is being attempted is impossible? These are some of the questions that need to be asked by development teams working on AI technologies.
So, was it a good thing for the research team responsible for Tiny Images to issue a complete retraction of the dataset? This is a difficult question. At the time of this writing, the published paper on the dataset has collected over 1,700 citations according to Google Scholar and the dataset it describes is likely still being used by those who have it within their possession. The retraction of Tiny Images creates a secondary problem for researchers still working with algorithms developed with Tiny Images, many of which are not flawed and worthy of further study. The ability of researchers to replicate a study is a scientific necessity, and by removing the necessary data, this becomes impossible. And an outright ban on technologies like face recognition may not be a good idea. While AI algorithms can be anti-social in contextslikepredictive policing, they can be socially acceptable in others, such as human-computer interaction.
Perhaps one way to address this is to issue a revision of a dataset that removes any problematic information, notes what has been removed from the previous version, and provides an explanation for why the revision was necessary. This may not completely remove the bias within a dataset, especially if it is very large, but it is a way to address specific instances of bias as they are discovered.
Bias in AI is not an easy problem to eliminate, and developers need to react to it in a constructive way. With effective mitigation strategies, there is certainly still a role for datasets in this type of work.
See the original post:
How to make AI less racist - Bulletin of the Atomic Scientists
- Classic reasoning systems like Loom and PowerLoom vs. more modern systems based on probalistic networks - November 8th, 2009 [November 8th, 2009]
- Using Amazon's cloud service for computationally expensive calculations - November 8th, 2009 [November 8th, 2009]
- Software environments for working on AI projects - November 8th, 2009 [November 8th, 2009]
- New version of my NLP toolkit - November 8th, 2009 [November 8th, 2009]
- Semantic Web: through the back door with HTML and CSS - November 8th, 2009 [November 8th, 2009]
- Java FastTag part of speech tagger is now released under the LGPL - November 8th, 2009 [November 8th, 2009]
- Defining AI and Knowledge Engineering - November 8th, 2009 [November 8th, 2009]
- Great Overview of Knowledge Representation - November 8th, 2009 [November 8th, 2009]
- Something like Google page rank for semantic web URIs - November 8th, 2009 [November 8th, 2009]
- My experiences writing AI software for vehicle control in games and virtual reality systems - November 8th, 2009 [November 8th, 2009]
- The URL for this blog has changed - November 8th, 2009 [November 8th, 2009]
- I have a new page on Knowledge Management - November 8th, 2009 [November 8th, 2009]
- N-GRAM analysis using Ruby - November 8th, 2009 [November 8th, 2009]
- Good video: Knowledge Representation and the Semantic Web - November 8th, 2009 [November 8th, 2009]
- Using the PowerLoom reasoning system with JRuby - November 8th, 2009 [November 8th, 2009]
- Machines Like Us - November 8th, 2009 [November 8th, 2009]
- RapidMiner machine learning, data mining, and visualization tool - November 8th, 2009 [November 8th, 2009]
- texai.org - November 8th, 2009 [November 8th, 2009]
- NLTK: The Natural Language Toolkit - November 8th, 2009 [November 8th, 2009]
- My OpenCalais Ruby client library - November 8th, 2009 [November 8th, 2009]
- Ruby API for accessing Freebase/Metaweb structured data - November 8th, 2009 [November 8th, 2009]
- Protégé OWL Ontology Editor - November 8th, 2009 [November 8th, 2009]
- New version of Numenta software is available - November 8th, 2009 [November 8th, 2009]
- Very nice: Elsevier IJCAI AI Journal articles now available for free as PDFs - November 8th, 2009 [November 8th, 2009]
- Verison 2.0 of OpenCyc is available - November 8th, 2009 [November 8th, 2009]
- What’s Your Biggest Question about Artificial Intelligence? [Article] - November 8th, 2009 [November 8th, 2009]
- Minimax Search [Knowledge] - November 8th, 2009 [November 8th, 2009]
- Decision Tree [Knowledge] - November 8th, 2009 [November 8th, 2009]
- More AI Content & Format Preference Poll [Article] - November 8th, 2009 [November 8th, 2009]
- New Planners Solve Rescue Missions [News] - November 8th, 2009 [November 8th, 2009]
- Neural Network Learns to Bluff at Poker [News] - November 8th, 2009 [November 8th, 2009]
- Pushing the Limits of Game AI Technology [News] - November 8th, 2009 [November 8th, 2009]
- Mining Data for the Netflix Prize [News] - November 8th, 2009 [November 8th, 2009]
- Interview with Peter Denning on the Principles of Computing [News] - November 8th, 2009 [November 8th, 2009]
- Decision Making for Medical Support [News] - November 8th, 2009 [November 8th, 2009]
- Neural Network Creates Music CD [News] - November 8th, 2009 [November 8th, 2009]
- jKilavuz - a guide in the polygon soup [News] - November 8th, 2009 [November 8th, 2009]
- Artificial General Intelligence: Now Is the Time [News] - November 8th, 2009 [November 8th, 2009]
- Apply AI 2007 Roundtable Report [News] - November 8th, 2009 [November 8th, 2009]
- What Would You do With 80 Cores? [News] - November 8th, 2009 [November 8th, 2009]
- Software Finds Learning Language Child's Play [News] - November 8th, 2009 [November 8th, 2009]
- Artificial Intelligence in Games [Article] - November 8th, 2009 [November 8th, 2009]
- Artificial Intelligence Resources - November 8th, 2009 [November 8th, 2009]
- Alan Turing: Mathematical Biologist? - April 25th, 2012 [April 25th, 2012]
- BBC Horizon: The Hunt for AI ( Artificial Intelligence ) - Video - April 30th, 2012 [April 30th, 2012]
- Can computers have true artificial intelligence" Masonic handshake" 3rd-April-2012 - Video - April 30th, 2012 [April 30th, 2012]
- Kevin B. Korb - Interview - Artificial Intelligence and the Singularity p3 - Video - April 30th, 2012 [April 30th, 2012]
- Artificial Intelligence - 6 Month Anniversary - Video - April 30th, 2012 [April 30th, 2012]
- Science Breakthroughs - April 30th, 2012 [April 30th, 2012]
- Hitman: Blood Money - Part 49 - Stupid Artificial Intelligence! - Video - April 30th, 2012 [April 30th, 2012]
- Research Members Turned Off By HAARP Artificial Intelligence - Video - April 30th, 2012 [April 30th, 2012]
- Artificial Intelligence Lecture No. 5 - Video - April 30th, 2012 [April 30th, 2012]
- The Artificial Intelligence Laboratory, 2012 - Video - April 30th, 2012 [April 30th, 2012]
- Charlie Rose - Artificial Intelligence - Video - April 30th, 2012 [April 30th, 2012]
- Expert on artificial intelligence to speak at EPIIC Nights dinner - May 4th, 2012 [May 4th, 2012]
- Filipino software engineers complete and best thousands on Stanford’s Artificial Intelligence Course - May 4th, 2012 [May 4th, 2012]
- Vodafone xone™ Hackathon Challenges Developers and Entrepreneurs to Build a New Generation of Artificial Intelligence ... - May 4th, 2012 [May 4th, 2012]
- Rocket Fuel Packages Up CPG Booster - May 4th, 2012 [May 4th, 2012]
- 2 Filipinos finishes among top in Stanford’s Artificial Intelligence course - May 5th, 2012 [May 5th, 2012]
- Why Your Brain Isn't A Computer - May 5th, 2012 [May 5th, 2012]
- 2 Pinoy software engineers complete Stanford's AI course - May 7th, 2012 [May 7th, 2012]
- Percipio Media, LLC Proudly Accepts Partnership With MIT's Prestigious Computer Science And Artificial Intelligence ... - May 10th, 2012 [May 10th, 2012]
- Google Driverless Car Ok'd by Nevada - May 10th, 2012 [May 10th, 2012]
- Moving Beyond the Marketing Funnel: Rocket Fuel and Forrester Research Announce Free Webinar - May 10th, 2012 [May 10th, 2012]
- Rocket Fuel Wins 2012 San Francisco Business Times Tech & Innovation Award - May 13th, 2012 [May 13th, 2012]
- Internet Week 2012: Rocket Fuel to Speak at OMMA RTB - May 16th, 2012 [May 16th, 2012]
- How to Get the Most Out of Your Facebook Ads -- Rocket Fuel's VP of Products, Eshwar Belani, to Lead MarketingProfs ... - May 16th, 2012 [May 16th, 2012]
- The Digital Disruptor To Banking Has Just Gone International - May 16th, 2012 [May 16th, 2012]
- Moving Beyond the Marketing Funnel: Rocket Fuel Announce Free Webinar Featuring an Independent Research Firm - May 23rd, 2012 [May 23rd, 2012]
- MASA Showcases Latest Version of MASA SWORD for Homeland Security Markets - May 23rd, 2012 [May 23rd, 2012]
- Bluesky Launches Drones for Aerial Surveying - May 23rd, 2012 [May 23rd, 2012]
- Artificial Intelligence: What happened to the hunt for thinking machines? - May 25th, 2012 [May 25th, 2012]
- Bubble Robots Move Using Lasers [VIDEO] - May 25th, 2012 [May 25th, 2012]
- UHV assistant professors receive $10,000 summer research grants - May 27th, 2012 [May 27th, 2012]
- Artificial intelligence: science fiction or simply science? - May 28th, 2012 [May 28th, 2012]
- Exetel taps artificial intelligence - May 29th, 2012 [May 29th, 2012]
- Software offers brain on the rain - May 29th, 2012 [May 29th, 2012]
- New Dean of Science has high hopes for his faculty - May 30th, 2012 [May 30th, 2012]
- Cognitive Code Announces "Silvia For Android" App - May 31st, 2012 [May 31st, 2012]
- A Rat is Smarter Than Google - June 5th, 2012 [June 5th, 2012]