{"id":95494,"date":"2013-11-14T03:40:32","date_gmt":"2013-11-14T08:40:32","guid":{"rendered":"http:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/can-artificial-intelligence-like-ibms-watson-do-investigative-journalism.php"},"modified":"2013-11-14T03:40:32","modified_gmt":"2013-11-14T08:40:32","slug":"can-artificial-intelligence-like-ibms-watson-do-investigative-journalism","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/artificial-intelligence\/can-artificial-intelligence-like-ibms-watson-do-investigative-journalism.php","title":{"rendered":"Can Artificial Intelligence Like IBM&#39;s Watson Do Investigative Journalism?"},"content":{"rendered":"<p>Two years ago, the two greatest Jeopardy champions of all time got obliterated by a computer called Watson. It was a great victory for artificial intelligence--the system racked up more than three times the earnings of its next meat-brained competitor. For IBM's Watson, the successor to Deep Blue, which famously defeated chess champion Garry Kasparov, becoming a Jeopardy champion was a modest proof of concept. The big challenge for Watson, and the goal for IBM, is to adapt the core question-answering technology to more significant domains, like health care.<\/p>\n<p>WatsonPaths, IBM's medical-domain offshoot announced last month, is able to derive medical diagnoses from a description of symptoms. From this chain of evidence, it's able to present an interactive visualization to doctors, who can interrogate the data, further question the evidence, and better understand the situation. It's an essential feedback loop used by diagnosticians to help decide which information is extraneous and which is essential, thus making it possible to home in on a most-likely diagnosis.
<\/p>\n<p>WatsonPaths scours millions of unstructured texts, like medical textbooks, dictionaries, and clinical guidelines, to develop a set of ranked hypotheses. The doctors' feedback is added back into the brute-force information retrieval capabilities to help further train the system. That's the AI part, which also provides transparency for the system's diagnosis. Eventually, this knowledge will be used to articulate uncertainty, identifying information gaps and asking questions to help it gather more evidence.<\/p>\n<p>Health care is just the beginning for Watson. Other disciplines that rely on evidentiary reasoning over unstructured documents or the Deep Web, including law, education, and finance, are also on the road map. But let's consider another potential domain here, perhaps less lucrative than the others, but nonetheless important: news and journalism.<\/p>\n<p>Media startup Vocativ identifies hot news stories by trawling the depths of the web, data-mining the vast seas of unindexed documents for information that might point to a story lead. Often journalists pair up with analysts, manually exploring data from different perspectives. The Associated Press's Overview Project aims to build better visualization and analysis tools to help investigative journalists make sense of huge document sets.<\/p>\n<p>What if much of this could be automated? A cognitive computer like Watson could search reams of evidence, generate hypotheses, and collect supporting and\/or contradicting evidence. Potential news stories would be presented to journalists and analysts, who would weigh the evidence, assess its accuracy, and decide which story ideas to pass on to an editor for further pursuit. In this scenario, Watson would be providing a well-sourced tip.<\/p>\n<p>Adapting Watson to new domains isn't easy. 
According to a paper from IBM Research that describes the application of Watson in health care, the system has to be able to parse and understand the format of a variety of domain-specific documents. Then it needs to be re-trained so that it learns how to weigh different sources of evidence, and any special-purpose taxonomies or logic that drive the domain also need to be accessible to the system. For investigative journalism, documents might include interview transcripts, legal codes and statutes, social networks, other news articles, PDFs obtained through Freedom of Information Act (FOIA) requests, or even document dumps from sources like WikiLeaks. Through an iterative process, the system would have to be trained, going back and forth with editors as it suggested stories and was told yea or nay, each new vote modulating how the system weighs and integrates evidence.<\/p>\n<p>Given a lot of re-engineering for Watson, how might an acumen for investigative reporting play out in a real-world news scenario? Earlier this year the International Consortium of Investigative Journalists (ICIJ) published a database of 2.5 million leaked documents about the offshore holdings and accounts of more than 100,000 entities, including emails, PDFs, spreadsheets, images, and four large databases packed with information about offshore companies, trusts, intermediaries, and other individuals involved with those companies. Even so, it took 112 reporters 15 months to analyze the data--a lot of human time and effort.<\/p>\n<p>For Watson, ingesting all 2.5 million unstructured documents is the easy part. From there, it would extract references to real-world entities, like corporations and people, and start looking for relationships between them, essentially building up context around each entity. This could be connected out to open-entity databases like Freebase, to provide even more context. 
A journalist might orient the system's attention by indicating which politicians or tax-dodging tycoons might be of most interest. Other texts, like relevant legal codes in the target jurisdiction or news reports mentioning the entities of interest, could also be ingested and parsed.<\/p>\n<p>Watson would then draw on its domain-adapted logic to generate evidence, with rules like: IF corporation A is associated with offshore tax-free account B, AND the owner of corporation A is married to an executive of corporation C, THEN add a small inference of tax evasion by corporation C. There would be many of these types of rules, perhaps hundreds, probably written by the journalists themselves to help the system identify meaningful and newsworthy relationships. Other rules might be garnered from common-sense reasoning databases, like MIT's ConceptNet. At the end of the day (or probably just a few seconds later), Watson would spit out 100 leads for reporters to follow. The first step would be to peer behind those leads to see the relevant evidence, rate its accuracy, and further train the algorithm. Sure, those follow-ups might still take months, but it wouldn't be hard to beat the 15 months the ICIJ took in its investigation.<\/p>\n<p>Go here to read the rest: <\/p>\n<p><a target=\"_blank\" href=\"http:\/\/www.fastcolabs.com\/3021545\/can-artificial-intelligence-like-ibms-watson-do-investigative-journalism?partner=rss\" title=\"Can Artificial Intelligence Like IBM&#39;s Watson Do Investigative Journalism?\">Can Artificial Intelligence Like IBM&#39;s Watson Do Investigative Journalism?<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Two years ago, the two greatest Jeopardy champions of all time got obliterated by a computer called Watson. 
It was a great victory for artificial intelligence--the system racked up more than three times the earnings of its next meat-brained competitor. For IBM's Watson, the successor to Deep Blue, which famously defeated chess champion Garry Kasparov, becoming a Jeopardy champion was a modest proof of concept. <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/artificial-intelligence\/can-artificial-intelligence-like-ibms-watson-do-investigative-journalism.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[13],"tags":[],"class_list":["post-95494","post","type-post","status-publish","format-standard","hentry","category-artificial-intelligence"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/95494"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=95494"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/95494\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=95494"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/
categories?post=95494"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=95494"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}