{"id":212638,"date":"2017-03-02T11:31:15","date_gmt":"2017-03-02T16:31:15","guid":{"rendered":"http:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/googles-anti-trolling-ai-can-be-defeated-by-typos-researchers-find-ars-technica.php"},"modified":"2022-09-13T03:02:11","modified_gmt":"2022-09-13T07:02:11","slug":"googles-anti-trolling-ai-can-be-defeated-by-typos-researchers-find-ars-technica","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/artificial-intelligence\/googles-anti-trolling-ai-can-be-defeated-by-typos-researchers-find-ars-technica.php","title":{"rendered":"Google&#8217;s anti-trolling AI can be defeated by typos, researchers find &#8230; &#8211; Ars Technica"},"content":{"rendered":"<p><p>Visit any news organization's website or any social media site, and you're bound to find some abusive or hateful language being thrown around. As those who moderate Ars' comments know, trying to keep a lid on trolling and abuse in comments can be an arduous and thankless task: when done too heavily, it smacks of censorship and suppression of free speech; when applied too lightly, it can poison the community and keep people from sharing their thoughts out of fear of being targeted. And human-based moderation is time-consuming.<\/p>\n<p>Both of these problems are the target of a project by Jigsaw, an Alphabet startup effort spun off from Google. Jigsaw's Perspective project is an application interface currently focused on moderating online conversations using machine learning to spot abusive, harassing, and toxic comments. The AI applies a \"toxicity score\" to comments, which can be used either to aid moderation or to reject comments outright, giving the commenter feedback about why their post was rejected.
Jigsaw is currently partnering with Wikipedia and The New York Times, among others, to implement the Perspective API to assist in moderating reader-contributed content.<\/p>\n<p>But that AI still needs some training, as researchers at the University of Washington's Network Security Lab recently demonstrated. In a paper published on February 27, Hossein Hosseini, Sreeram Kannan, Baosen Zhang, and Radha Poovendran demonstrated that they could fool the Perspective AI into giving a low toxicity score to comments that it would otherwise flag, simply by misspelling key hot-button words (such as \"iidiot\") or inserting punctuation into the word (\"i.diot\" or \"i d i o t,\" for example). By gaming the AI's parsing of text, they were able to obtain passing toxicity scores for comments that would normally be flagged as abusive.<\/p>\n<p>\"One type of the vulnerabilities of machine learning algorithms is that an adversary can change the algorithm output by subtly perturbing the input, often unnoticeable by humans,\" Hosseini and his co-authors wrote. \"Such inputs are called adversarial examples, and have been shown to be effective against different machine learning algorithms even when the adversary has only a black-box access to the target model.\"<\/p>\n<p>The researchers also found that Perspective would flag comments that were not abusive in nature but used keywords that the AI had been trained to see as abusive. The phrases \"not stupid\" or \"not an idiot\" scored nearly as high on Perspective's toxicity scale as comments that used \"stupid\" and \"idiot.\"<\/p>\n<p>These sorts of false positives, coupled with easy evasion of the algorithms by adversaries seeking to bypass screening, underscore the basic problem with any sort of automated moderation and censorship. 
Update: CJ Adams, Jigsaw's product manager for Perspective, acknowledged the difficulty in a statement he sent to Ars:<\/p>\n<p>It's great to see research like this. Online toxicity is a difficult problem, and Perspective was developed to support exploration of how ML can be used to help discussion. We welcome academic researchers to join our research efforts on Github and explore how we can collaborate together to identify shortcomings of existing models and find ways to improve them.<\/p>\n<p>Perspective is still a very early-stage technology, and as these researchers rightly point out, it will only detect patterns that are similar to examples of toxicity it has seen before. We have more details on this challenge and others on the Conversation AI research page. The API allows users and researchers to submit corrections like these directly, which will then be used to improve the model and ensure it can understand more forms of toxic language, and evolve as new forms emerge over time.<\/p>\n<\/p>\n<p>Visit link: <\/p>\n<p><a target=\"_blank\" rel=\"nofollow noopener\" href=\"https:\/\/arstechnica.com\/information-technology\/2017\/03\/googles-anti-trolling-ai-can-be-defeated-by-typos-researchers-find\/\" title=\"Google's anti-trolling AI can be defeated by typos, researchers find ... - Ars Technica\">Google's anti-trolling AI can be defeated by typos, researchers find ... - Ars Technica<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Visit any news organization's website or any social media site, and you're bound to find some abusive or hateful language being thrown around. 
As those who moderate Ars' comments know, trying to keep a lid on trolling and abuse in comments can be an arduous and thankless task: when done too heavily, it smacks of censorship and suppression of free speech; when applied too lightly, it can poison the community and keep people from sharing their thoughts out of fear of being targeted. And human-based moderation is time-consuming <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/artificial-intelligence\/googles-anti-trolling-ai-can-be-defeated-by-typos-researchers-find-ars-technica.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[13],"tags":[],"class_list":["post-212638","post","type-post","status-publish","format-standard","hentry","category-artificial-intelligence"],"modified_by":"Danzig","_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/212638"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=212638"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/212638\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=212638"}],"wp:term":[{"taxonomy":"category","embed
dable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=212638"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=212638"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}