{"id":208273,"date":"2017-07-27T10:27:46","date_gmt":"2017-07-27T14:27:46","guid":{"rendered":"http:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/mozilla-is-crowdsourcing-voice-recognition-to-make-ai-work-for-the-people-the-verge\/"},"modified":"2017-07-27T10:27:46","modified_gmt":"2017-07-27T14:27:46","slug":"mozilla-is-crowdsourcing-voice-recognition-to-make-ai-work-for-the-people-the-verge","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/ai\/mozilla-is-crowdsourcing-voice-recognition-to-make-ai-work-for-the-people-the-verge\/","title":{"rendered":"Mozilla is crowdsourcing voice recognition to make AI work for the people &#8211; The Verge"},"content":{"rendered":"<p><p>    Data is critical to building great AI, so much so that researchers in the field compare it to coal during the Industrial Revolution. Those that have it will steam ahead. Those that don&#8217;t will be left in the dust. In the current AI boom, it&#8217;s obvious who has it: tech giants like Google, Facebook, and Baidu.  <\/p>\n<p>    That&#8217;s worrying news. After all, many of these companies have near monopolies in areas like search and social media. Their position helps them gather data, which helps them build better AI, which helps them stay ahead of rivals. For the firms themselves, it&#8217;s a virtuous cycle, but without viable competition, companies can, and do, abuse their dominance.  <\/p>\n<p>    Now a new project from the Mozilla Foundation (the nonprofit creator of the Firefox browser) is experimenting with an alternative to data monopolies, by asking users to pool information in order to power open-sourced AI initiatives. The organization&#8217;s first project is called Common Voice, with the Mozilla Foundation asking volunteers to donate vocal samples to build an open-source voice recognition system like the ones powering Siri and Alexa.  
<\/p>\n<p>    &#8220;Currently, the power to control speech recognition could end up in just a few hands, and we didn&#8217;t want to see that,&#8221; Sean White, vice president of emerging technology at Mozilla, tells The Verge. He says that to get data, the big companies can just filter everything coming in, but for other players, there need to be other methods. &#8220;The interesting question for us is, can we do it so the people who are creating the data also benefit?&#8221; he asks.  <\/p>\n<p>    At the moment, Mozilla is just collecting data, but it plans to have its open-source voice recognition available by the end of the year. (Will it go in the Firefox browser? White won&#8217;t say, but adds: &#8220;We have some experiments planned [for that].&#8221;) Currently, anyone can go to the Common Voice website and donate their voice by reading out sample sentences. They can also supply biographical information like age, location, gender, and accent. This information will help Mozilla avoid bias in creating its voice recognition systems, says White, and ensure that the technology can handle accents, something Google and Apple still struggle with.  <\/p>\n<p>    Frederike Kaltheuner, a researcher at Privacy International, says these firms often use AI as a pretext for scooping up valuable personal data, telling users it will enable them to improve certain services. This may be true, she says, but the consequences of sharing this data for society at large are less clear. &#8220;There is [often] a fundamental conflict of interest between what you need as a citizen, and what is in that company&#8217;s interest,&#8221; says Kaltheuner.  <\/p>\n<p>    What can open-source data offer that companies can&#8217;t?  <\/p>\n<p>    So how does an initiative like Common Voice lure users away from existing (and admittedly convenient) services? 
After all, open-source projects have been around for longer than the internet, but with a few exceptions, they have been unable to compete with commercial products. They simply don&#8217;t offer a comparable service.  <\/p>\n<p>    For Mozilla, the answer is personalization. After all, while AI systems trained on population-sized datasets tend to be good enough for the average individual, they often fail when it comes to serving the needs of smaller groups, or those not represented in their data. (More often than not, the data is just biased toward white males, the industry default.)  <\/p>\n<p>    &#8220;For us to be successful with data commons, there has to be a motivation [for users] other than realizing one day that they&#8217;ve been giving away all their personal data,&#8221; says White. &#8220;We have to make their experience better because they&#8217;ve participated.&#8221; In the case of Common Voice, White wants as much accent data as possible to improve voice recognition for these individuals. &#8220;We want the system to work better for you because some of your data is included,&#8221; he says.  <\/p>\n<p>    Offering personalization in exchange for data is a neat proposition, but it&#8217;s not a silver bullet for those fighting data monopolies. For a start, big firms could make similar offers of their own to users. (&#8220;Alexa doesn&#8217;t understand you? Read this 10-minute script and we&#8217;ll improve its voice recognition.&#8221;) Or they could spend money to plug the gaps in their own datasets. Google, for example, gets third-party companies to pay Redditors with accents to record their own voice samples.  <\/p>\n<p>    White acknowledges that the Common Voice project doesn&#8217;t have an answer to a lot of these questions, but says Mozilla is still dedicated to the core cause of open data. &#8220;It feels like a true democratizing activity,&#8221; he says. And there are plenty of organizations that share this ethos. 
There&#8217;s machine learning community Kaggle, which has a large store of user-contributed datasets for AI scientists to play with; the Elon Musk-funded OpenAI, which open-sources all its work; and Healthcare.ai, which publishes free-to-use medical algorithms. And some of these manage to both share open-source data and research while selling their own commercial products, like self-driving car startup Comma.AI.  <\/p>\n<p>    Although the AI systems we interact with on a daily basis are built on proprietary data, there&#8217;s a whole world of researchers and institutions publishing useful, if rudimentary, open-source alternatives.  <\/p>\n<p>    To take these projects to the next level, though, proponents of open-source data may have to enlist higher powers to take on the tech giants. Chris Nicholson, CEO of deep learning company Skymind, says, &#8220;We may need third parties to step in (NGOs, governments, coalitions of smaller private firms) and pool their data.&#8221; Nicholson suggests that sharing health care data can improve medical imaging technology, and driver data can make autonomous cars more natural and intuitive on the road. Sharing these types of datasets, he says, has obvious public benefits.  <\/p>\n<p>    Donating your voice, then, may just be the beginning.  <\/p>\n<p><!-- Auto Generated --><\/p>\n<p>Go here to read the rest: <\/p>\n<p><a target=\"_blank\" rel=\"nofollow\" href=\"https:\/\/www.theverge.com\/2017\/7\/27\/16019222\/open-source-data-ai-mozilla-common-voice-project\" title=\"Mozilla is crowdsourcing voice recognition to make AI work for the people - The Verge\">Mozilla is crowdsourcing voice recognition to make AI work for the people - The Verge<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Data is critical to building great AI, so much so that researchers in the field compare it to coal during the Industrial Revolution. Those that have it will steam ahead. 
Those that don&#8217;t will be left in the dust.  <a href=\"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/ai\/mozilla-is-crowdsourcing-voice-recognition-to-make-ai-work-for-the-people-the-verge\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[187743],"tags":[],"class_list":["post-208273","post","type-post","status-publish","format-standard","hentry","category-ai"],"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/208273"}],"collection":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/comments?post=208273"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/208273\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/media?parent=208273"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/categories?post=208273"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/tags?post=208273"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}