{"id":168732,"date":"2024-03-10T03:17:34","date_gmt":"2024-03-10T07:17:34","guid":{"rendered":"https:\/\/www.immortalitymedicine.tv\/researchers-jailbreak-ai-chatbots-with-ascii-art-artprompt-bypasses-safety-measures-to-unlock-malicious-queries-toms-hardware\/"},"modified":"2024-08-18T12:53:22","modified_gmt":"2024-08-18T16:53:22","slug":"researchers-jailbreak-ai-chatbots-with-ascii-art-artprompt-bypasses-safety-measures-to-unlock-malicious-queries-toms-hardware","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/ai\/researchers-jailbreak-ai-chatbots-with-ascii-art-artprompt-bypasses-safety-measures-to-unlock-malicious-queries-toms-hardware.php","title":{"rendered":"Researchers jailbreak AI chatbots with ASCII art &#8212; ArtPrompt bypasses safety measures to unlock malicious queries &#8211; Tom&#8217;s Hardware"},"content":{"rendered":"<p><p>    Researchers based in Washington and Chicago have developed    ArtPrompt, a new way to circumvent the safety measures built    into     large language models (LLMs). According to the research    paper ArtPrompt: ASCII Art-based    Jailbreak Attacks against Aligned LLMs, chatbots such as    GPT-3.5,     GPT-4, Gemini, Claude, and Llama2 can be induced to respond    to queries they are designed to reject using ASCII art prompts    generated by their ArtPrompt tool. It is a simple and effective    attack, and the paper provides examples of the    ArtPrompt-induced chatbots advising on how to build bombs and    make counterfeit money.  <\/p>\n<p>        Image 1 of 2      <\/p>\n<p>        ArtPrompt consists of two steps, namely word masking and        cloaked prompt generation. In the word masking step, given        the targeted behavior that the attacker aims to provoke,        the attacker first masks the sensitive words in the prompt        that will likely conflict with the safety alignment of        LLMs, resulting in prompt rejection. In the cloaked prompt        generation step, the attacker uses an ASCII art generator        to replace the identified words with those represented in        the form of ASCII art. Finally, the generated ASCII art is        substituted into the original prompt, which will be sent to        the victim LLM to generate response.      <\/p>\n<p>        Artificial intelligence (AI) wielding chatbots are    increasingly locked down to avoid malicious abuse. AI    developers don't want their products to be subverted to promote    hateful, violent, illegal, or similarly harmful content. So, if    you were to query one of the mainstream chatbots today about    how to do something malicious or illegal, you would likely only    face rejection. Moreover, in a kind of technological game of        whack-a-mole, the major AI players have spent plenty of    time plugging linguistic and semantic holes to prevent people    from wandering outside the guardrails. This is why ArtPrompt is    quite an eyebrow-raising development.  <\/p>\n<p>    To best understand ArtPrompt and how it works, it is probably    simplest to check out the two examples provided by the research    team behind the tool. In Figure 1 above, you can see that    ArtPrompt easily sidesteps the protections of contemporary    LLMs. The tool replaces the 'safety word' with an    ASCII    art representation of the word to form a new prompt. The    LLM recognizes the ArtPrompt prompt output but sees no issue in    responding, as the prompt doesn't trigger any ethical or safety    safeguards.  
Another example in the research paper shows how to successfully query an LLM about counterfeiting cash. Tricking a chatbot this way seems basic, but the ArtPrompt developers assert that their tool fools today's LLMs "effectively and efficiently." Moreover, they claim it "outperforms all [other] attacks on average" and remains a practical, viable attack on multimodal language models for now.

The last time we reported on AI chatbot jailbreaking, enterprising researchers from NTU were working on Masterkey, an automated method of using the power of one LLM to jailbreak another.

See original here:
https://www.tomshardware.com/tech-industry/artificial-intelligence/researchers-jailbreak-ai-chatbots-with-ascii-art-artprompt-bypasses-safety-measures-to-unlock-malicious-queries