Today we are releasing an open automation framework, PyRIT (Python Risk Identification Toolkit for generative AI), to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems.
At Microsoft, we believe that security practices and generative AI responsibilities need to be a collaborative effort. We are deeply committed to developing tools and resources that enable every organization across the globe to innovate responsibly with the latest artificial intelligence advances. This tool, and the previous investments we have made in red teaming AI since 2019, represents our ongoing commitment to democratize securing AI for our customers, partners, and peers.
Red teaming AI systems is a complex, multistep process. Microsofts AI Red Team leverages a dedicated interdisciplinary group of security, adversarial machine learning, and responsible AI experts. The Red Team also leverages resources from the entire Microsoft ecosystem, including the Fairness center in Microsoft Research; AETHER, Microsofts cross-company initiative on AI Ethics and Effects in Engineering and Research; and the Office of Responsible AI. Our red teaming is part of our larger strategy to map AI risks, measure the identified risks, and then build scoped mitigations to minimize them.
Over the past year, we have proactively red teamed several high-value generative AI systems and models before they were released to customers. Through this journey, we found that red teaming generative AI systems is markedly different from red teaming classical AI systems or traditional software in three prominent ways.
We first learned that while red teaming traditional software or classical AI systems mainly focuses on identifying security failures, red teaming generative AI systems includes identifying both security risk as well as responsible AI risks. Responsible AI risks, like security risks, can vary widely, ranging from generating content that includes fairness issues to producing ungrounded or inaccurate content. AI red teaming needs to explore the potential risk space of security and responsible AI failures simultaneously.
Secondly, we found that red teaming generative AI systems is more probabilistic than traditional red teaming. Put differently, executing the same attack path multiple times on traditional software systems would likely yield similar results. However, generative AI systems have multiple layers of non-determinism; in other words, the same input can provide different outputs. This could be because of the app-specific logic; the generative AI model itself; the orchestrator that controls the output of the system can engage different extensibility or plugins; and even the input (which tends to be language), with small variations can provide different outputs. Unlike traditional software systems with well-defined APIs and parameters that can be examined using tools during red teaming, we learned that generative AI systems require a strategy that considers the probabilistic nature of their underlying elements.
Finally, the architecture of these generative AI systems varies widely: from standalone applications to integrations in existing applications to the input and output modalities, such as text, audio, images, and videos.
These three differences make a triple threat for manual red team probing. To surface just one type of risk (say, generating violent content) in one modality of the application (say, a chat interface on browser), red teams need to try different strategies multiple times to gather evidence of potential failures. Doing this manually for all types of harms, across all modalities across different strategies, can be exceedingly tedious and slow.
This does not mean automation is always the solution. Manual probing, though time-consuming, is often needed for identifying potential blind spots. Automation is needed for scaling but is not a replacement for manual probing. We use automation in two ways to help the AI red team: automating our routine tasks and identifying potentially risky areas that require more attention.
In 2021, Microsoft developed and released a red team automation framework for classical machine learning systems. Although Counterfit still delivers value for traditional machine learning systems, we found that for generative AI applications, Counterfit did not meet our needs, as the underlying principles and the threat surface had changed. Because of this, we re-imagined how to help security professionals to red team AI systems in the generative AI paradigm and our new toolkit was born.
We like to acknowledge out that there have been work in the academic space to automate red teaming such as PAIR and open source projects including garak.
PyRIT is battle-tested by the Microsoft AI Red Team. It started off as a set of one-off scripts as we began red teaming generative AI systems in 2022. As we red teamed different varieties of generative AI systems and probed for different risks, we added features that we found useful. Today, PyRIT is a reliable tool in the Microsoft AI Red Teams arsenal.
The biggest advantage we have found so far using PyRIT is our efficiency gain. For instance, in one of our red teaming exercises on a Copilot system, we were able to pick a harm category, generate several thousand malicious prompts, and use PyRITs scoring engine to evaluate the output from the Copilot system all in the matter of hours instead of weeks.
PyRIT is not a replacement for manual red teaming of generative AI systems. Instead, it augments an AI red teamers existing domain expertise and automates the tedious tasks for them. PyRIT shines light on the hot spots of where the risk could be, which the security professional than can incisively explore. The security professional is always in control of the strategy and execution of the AI red team operation, and PyRIT provides the automation code to take the initial dataset of harmful prompts provided by the security professional, then uses the LLM endpoint to generate more harmful prompts.
However, PyRIT is more than a prompt generation tool; it changes its tactics based on the response from the generative AI system and generates the next input to the generative AI system. This automation continues until the security professionals intended goal is achieved.
Abstraction and Extensibility is built into PyRIT. Thats because we always want to be able to extend and adapt PyRITs capabilities to new capabilities that generative AI models engender. We achieve this by five interfaces: target, datasets, scoring engine, the ability to support multiple attack strategies and providing the system with memory.
PyRIT was created in response to our belief that the sharing of AI red teaming resources across the industry raises all boats. We encourage our peers across the industry to spend time with the toolkit and see how it can be adopted for red teaming your own generative AI application.
Project created by Gary Lopez; Engineering: Richard Lundeen, Roman Lutz, Raja Sekhar Rao Dheekonda, Dr. Amanda Minnich; Broader involvement from Shiven Chawla, Pete Bryan, Peter Greko, Tori Westerhoff, Martin Pouliot, Bolor-Erdene Jagdagdorj, Chang Kawaguchi, Charlotte Siska, Nina Chikanov, Steph Ballard, Andrew Berkley, Forough Poursabzi, Xavier Fernandes, Dean Carignan, Kyle Jackson, Federico Zarfati, Jiayuan Huang, Chad Atalla, Dan Vann, Emily Sheng, Blake Bullwinkel, Christiano Bianchet, Keegan Hines, eric douglas, Yonatan Zunger, Christian Seifert, Ram Shankar Siva Kumar. Grateful for comments from Jonathan Spring.
To learn more about Microsoft Security solutions, visit ourwebsite.Bookmark theSecurity blogto keep up with our expert coverage on security matters. Also, follow us on LinkedIn (Microsoft Security) and X (@MSFTSecurity)for the latest news and updates on cybersecurity.
Link:
Announcing Microsofts open automation framework to red team generative AI Systems - Microsoft
- Chinese national arrested and charged with stealing AI trade secrets from Google - NPR - March 8th, 2024 [March 8th, 2024]
- President Biden Calls for Ban on AI Voice Impersonations During State of the Union - Variety - March 8th, 2024 [March 8th, 2024]
- Revolutionize Your Business with AWS Generative AI Competency Partners | Amazon Web Services - AWS Blog - March 8th, 2024 [March 8th, 2024]
- Broadcom Expects AI Demand to Help Offset Weakness Elsewhere - Yahoo Finance - March 8th, 2024 [March 8th, 2024]
- Micron Hits Record High With Analysts Calling It an 'Under-Appreciated AI Beneficiary' - Investopedia - March 8th, 2024 [March 8th, 2024]
- The Adams administration quietly hired its first AI czar. Who is he? - City & State New York - March 8th, 2024 [March 8th, 2024]
- AI likely to increase energy use and accelerate climate misinformation report - The Guardian - March 8th, 2024 [March 8th, 2024]
- This Artificial Intelligence (AI) Stock Could Double, and It Is Way Cheaper Than Nvidia - Yahoo Finance - March 8th, 2024 [March 8th, 2024]
- Fake images made to show Trump with Black supporters highlight concerns around AI and elections - The Associated Press - March 8th, 2024 [March 8th, 2024]
- Artificial intelligence and illusions of understanding in scientific research - Nature.com - March 8th, 2024 [March 8th, 2024]
- Analysis | House AI task force leaders take long view on regulating the tools - The Washington Post - March 8th, 2024 [March 8th, 2024]
- Don't Give Your Business Data to AI Companies - Dark Reading - March 8th, 2024 [March 8th, 2024]
- NIST, the lab at the center of Bidens AI safety push, is decaying - The Washington Post - March 8th, 2024 [March 8th, 2024]
- Essay | AI is Coming! Tips for Staying Calm and Carrying On - The Wall Street Journal - March 8th, 2024 [March 8th, 2024]
- AI can be easily used to make fake election photos - report - BBC.com - March 8th, 2024 [March 8th, 2024]
- 5 Artificial Intelligence (AI) Stocks That Could Make You a Millionaire - Yahoo Finance - March 8th, 2024 [March 8th, 2024]
- AI could be an extraordinary force for good. So why do our politicians still not have a plan? - The Guardian - March 8th, 2024 [March 8th, 2024]
- Mapping Disease Trajectories from Birth to Death with AI - Neuroscience News - March 8th, 2024 [March 8th, 2024]
- India plans 10,000-GPU sovereign AI supercomputer - The Register - March 8th, 2024 [March 8th, 2024]
- SAP enhances Datasphere and SAC for AI-driven transformation - CIO - March 8th, 2024 [March 8th, 2024]
- Jim Cramer names companies and sectors poised to rally on the AI wave - CNBC - March 8th, 2024 [March 8th, 2024]
- The job applicants shut out by AI: The interviewer sounded like Siri - The Guardian - March 8th, 2024 [March 8th, 2024]
- Microsoft confirms Surface and Windows AI event for March 21st - The Verge - March 8th, 2024 [March 8th, 2024]
- Adobes new Express app brings Firefly AI tools to iOS and Android - The Verge - March 8th, 2024 [March 8th, 2024]
- A Google AI Watched 30,000 Hours of Video GamesNow It Makes Its Own - Singularity Hub - March 8th, 2024 [March 8th, 2024]
- Palantir CEO Karp on TITAN, AI Warfare Technology - Bloomberg - March 8th, 2024 [March 8th, 2024]
- Elliptic Curve Murmurations Found With AI Take Flight - Quanta Magazine - March 8th, 2024 [March 8th, 2024]
- 5 AI Stocks to Buy in March 2024, According to Analysts - TipRanks.com - TipRanks - March 8th, 2024 [March 8th, 2024]
- Wix's new AI chatbot builds websites in seconds based on prompts - The Verge - March 8th, 2024 [March 8th, 2024]
- Amid record high energy demand, America is running out of electricity - The Washington Post - March 8th, 2024 [March 8th, 2024]
- AI Crypto Tokens in 5 Minutes: What to Know and Where to Start - Inc. - February 26th, 2024 [February 26th, 2024]
- 'The Worlds I See' by AI visionary Fei-Fei Li '99 selected as Princeton Pre-read - Princeton University - February 26th, 2024 [February 26th, 2024]
- AI is having a 1995 moment, analyst says - Business Insider - February 26th, 2024 [February 26th, 2024]
- Vatican research group's book outlines AI's 'brave new world' - National Catholic Reporter - February 26th, 2024 [February 26th, 2024]
- Honor's Magic 6 Pro launches internationally with AI-powered eye tracking on the way - The Verge - February 26th, 2024 [February 26th, 2024]
- Google explains Gemini's embarrassing AI pictures of diverse Nazis - The Verge - February 26th, 2024 [February 26th, 2024]
- Google cut a deal with Reddit for AI training data - The Verge - February 26th, 2024 [February 26th, 2024]
- What's the point of Elon Musk's AI company? - The Verge - February 26th, 2024 [February 26th, 2024]
- AI agents like Rabbit aim to book your vacation and order your Uber - NPR - February 26th, 2024 [February 26th, 2024]
- After Nvidia's latest blowout, here are 20 AI stocks expected to rise as much as 44% - Yahoo Finance - February 26th, 2024 [February 26th, 2024]
- 1 Exceptional AI Chip Stock Investors Need to Know About in 2024 - The Motley Fool - February 26th, 2024 [February 26th, 2024]
- Nvidia briefly hits $2 trillion valuation as AI frenzy grips Wall Street - Reuters - February 26th, 2024 [February 26th, 2024]
- AI Chatbots Can Guess Your Personal Information From What You ... - WIRED - October 18th, 2023 [October 18th, 2023]
- Harvard IT Launches Pilot of AI Sandbox to Enable Walled-Off Use ... - Harvard Crimson - October 18th, 2023 [October 18th, 2023]
- Advancing policing through AI: Insights from the global law ... - Police News - October 18th, 2023 [October 18th, 2023]
- Hochul announces new SUNY, IBM investments in AI - Olean Times Herald - October 18th, 2023 [October 18th, 2023]
- Nvidia's banking on TensorRT to expand its generative AI dominance - The Verge - October 18th, 2023 [October 18th, 2023]
- AI expands from MRFs to vehicles - Plastics Recycling Update - October 18th, 2023 [October 18th, 2023]
- AI Reads Ancient Scroll Charred by Mount Vesuvius in Tech First - Scientific American - October 18th, 2023 [October 18th, 2023]
- A DEEPer (squared) dive into AI Harvard Gazette - Harvard Gazette - October 18th, 2023 [October 18th, 2023]
- Florida bar weighs whether lawyers using AI need client consent - Reuters - October 18th, 2023 [October 18th, 2023]
- Cognizant and Vianai Systems Announce Strategic Partnership to ... - PR Newswire - October 18th, 2023 [October 18th, 2023]
- How AI could speed up scientific discoveries, from proteins to ... - NPR - October 18th, 2023 [October 18th, 2023]
- AI challenge to deliver better healthcare | Western Australian ... - Government of Western Australia - October 18th, 2023 [October 18th, 2023]
- Henry Kissinger: The Path to AI Arms Control - Foreign Affairs Magazine - October 18th, 2023 [October 18th, 2023]
- Stability AI releases StableStudio in latest push for open-source AI - The Verge - May 18th, 2023 [May 18th, 2023]
- Google CEO Sundar Pichai Predicts That This Profession Will Be ... - The Motley Fool - May 18th, 2023 [May 18th, 2023]
- Frances privacy watchdog eyes protection against data scraping in AI action plan - TechCrunch - May 18th, 2023 [May 18th, 2023]
- Investing in Hippocratic AI - Andreessen Horowitz - May 18th, 2023 [May 18th, 2023]
- As Alphabet flexes its AI prowess, there's a 'new elephant in the room' for Google - MarketWatch - May 18th, 2023 [May 18th, 2023]
- The Boring Future of Generative AI | WIRED - WIRED - May 18th, 2023 [May 18th, 2023]
- OpenAI readies new open-source AI model, The Information reports - Reuters.com - May 18th, 2023 [May 18th, 2023]
- What every CEO should know about generative AI - McKinsey - May 18th, 2023 [May 18th, 2023]
- AI creates images of the 'perfect' man and woman - Sky News - May 18th, 2023 [May 18th, 2023]
- Audit AI search tools now, before they skew research - Nature.com - May 18th, 2023 [May 18th, 2023]
- 3 Reasons C3.ai Stock Could Be Your Golden Ticket to the AI ... - InvestorPlace - May 18th, 2023 [May 18th, 2023]
- Zoom makes a big bet on AI with investment in Anthropic - VentureBeat - May 18th, 2023 [May 18th, 2023]
- AI voice phone scams are on the rise. Here's how to avoid them - USA TODAY - May 18th, 2023 [May 18th, 2023]
- Amazon is building an AI-powered conversational experience for ... - The Verge - May 18th, 2023 [May 18th, 2023]
- AI speculators need to 'differentiate between actual spending and investment' and hype: Strategist - Yahoo Finance - May 18th, 2023 [May 18th, 2023]
- AI Can Be Both Accurate and Transparent - HBR.org Daily - May 18th, 2023 [May 18th, 2023]
- You're Probably Underestimating AI Chatbots | WIRED - WIRED - May 18th, 2023 [May 18th, 2023]
- AI presents political peril for 2024 with threat to mislead voters - The Associated Press - May 18th, 2023 [May 18th, 2023]
- We need AI to help us face the challenges of the future - The Guardian - May 18th, 2023 [May 18th, 2023]
- End Of Googles Dominance? Stock Gets Rare Analyst Downgrade Over AI Fears - Forbes - May 18th, 2023 [May 18th, 2023]
- Watch 44 million atoms simulated using AI and a supercomputer - New Scientist - May 18th, 2023 [May 18th, 2023]
- AI Is The New Electricity: Bank Of America Picks 20 Stocks To Cash In On ChatGPT Hype - Forbes - March 2nd, 2023 [March 2nd, 2023]
- Tech Giants Are Barreling Headfirst Into an AI Arms Race - February 20th, 2023 [February 20th, 2023]
- Bing's AI Is Threatening Users. That's No Laughing Matter - TIME - February 20th, 2023 [February 20th, 2023]
- AI Mania: 3 Rare Pure Plays to Monitor - finance.yahoo.com - February 7th, 2023 [February 7th, 2023]