{"id":201307,"date":"2017-06-25T14:14:07","date_gmt":"2017-06-25T18:14:07","guid":{"rendered":"http:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/mit-and-google-researchers-have-made-ai-that-can-link-sound-sight-and-text-to-understand-the-world-quartz\/"},"modified":"2017-06-25T14:14:07","modified_gmt":"2017-06-25T18:14:07","slug":"mit-and-google-researchers-have-made-ai-that-can-link-sound-sight-and-text-to-understand-the-world-quartz","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/ai\/mit-and-google-researchers-have-made-ai-that-can-link-sound-sight-and-text-to-understand-the-world-quartz\/","title":{"rendered":"MIT and Google researchers have made AI that can link sound, sight, and text to understand the world &#8211; Quartz"},"content":{"rendered":"<p><p>If we ever want future robots to do our bidding, they'll have to understand the world around them in a complete way: if a robot hears a barking noise, what's making it? What does a dog look like, and what do dogs need?<\/p>\n<p>AI research has typically treated the ability to recognize images, identify noises, and understand text as three separate problems, and built algorithms suited to each individual task. Imagine if you could only use one sense at a time, and couldn't match anything you heard to anything you saw. That's AI today, and part of the reason why we're so far from creating an algorithm that can learn like a human. But two new papers from MIT and Google explain first steps for making AI see, hear, and read in a holistic way, an approach that could upend how we teach our machines about the world.<\/p>\n<p>&#8220;It doesn't matter if you see a car or hear an engine, you instantly recognize the same concept. The information in our brain is aligned naturally,&#8221; says Yusuf Aytar, a post-doctoral AI researcher at MIT who co-authored the paper.
<\/p>\n<p>That word Aytar uses, &#8220;aligned,&#8221; is the key idea here. Researchers aren't teaching the algorithms anything new, but instead creating a way for them to link, or align, knowledge from one sense to another. Aytar offers the example of a self-driving car hearing an ambulance before it sees it. Knowing what an ambulance sounds like, what it looks like, and what it does could allow the self-driving car to anticipate that other cars around it will slow down and move out of the way.<\/p>\n<p>To train this system, the MIT group first showed the neural network video frames paired with audio. After the network found the objects in the video and the sounds in the audio, it tried to predict which objects correlated to which sounds. At what point, for instance, do waves make a sound?<\/p>\n<p>Next, the team fed images with captions showing similar situations into the same algorithm, so it could associate words with the objects and actions pictured. Same idea: first the network separately identified all the objects it could find in the pictures, and the relevant words, and then matched them.<\/p>\n<p>The network might not seem incredibly impressive from that description; after all, we have AI that can do those things separately. But after being trained on audio\/images and images\/text, the system could then match audio to text, even though it had never been trained to know which words correspond to which sounds. The researchers claim this indicates the network had built a more objective idea of what it was seeing, hearing, or reading, one that didn't rely entirely on the medium it used to learn the information.<\/p>\n<p>One algorithm that can align its idea of an object across sight, sound, and text can automatically transfer what it's learned from what it hears to what it sees. 
Aytar offers an example: if the algorithm hears a zebra braying, it infers that a zebra is similar to a horse.<\/p>\n<p>&#8220;It knows that [the zebra] is an animal, it knows that it generates these kinds of sounds, and kind of inherently it transfers this information across modalities,&#8221; Aytar says. These kinds of assumptions allow the algorithm to make new connections between ideas, strengthening its understanding of the world.<\/p>\n<p>Google's model behaves similarly, except that it can also translate text. Google declined to provide a researcher to talk more about how its network operated. However, the algorithm has been made available online to other researchers.<\/p>\n<p>Neither of these techniques from Google or MIT actually performed better than the single-purpose algorithms, but Aytar says that this won't be the case for long.<\/p>\n<p>&#8220;If you have more senses, you have more accuracy,&#8221; he said.<\/p>\n<p><!-- Auto Generated --><\/p>\n<p>Originally posted here:<\/p>\n<p><a target=\"_blank\" rel=\"nofollow\" href=\"https:\/\/qz.com\/1011641\/mit-and-google-researchers-have-made-ai-that-can-link-sound-sight-and-text-to-understand-the-world\/\" title=\"MIT and Google researchers have made AI that can link sound, sight, and text to understand the world - Quartz\">MIT and Google researchers have made AI that can link sound, sight, and text to understand the world - Quartz<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If we ever want future robots to do our bidding, they'll have to understand the world around them in a complete way: if a robot hears a barking noise, what's making it? What does a dog look like, and what do dogs need? 
<a href=\"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/ai\/mit-and-google-researchers-have-made-ai-that-can-link-sound-sight-and-text-to-understand-the-world-quartz\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[187743],"tags":[],"class_list":["post-201307","post","type-post","status-publish","format-standard","hentry","category-ai"],"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/201307"}],"collection":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/comments?post=201307"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/201307\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/media?parent=201307"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/categories?post=201307"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/tags?post=201307"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}