{"id":402,"date":"2009-11-08T05:11:47","date_gmt":"2009-11-08T05:11:47","guid":{"rendered":"http:\/\/euvolution.com\/futurist-transhuman-news-blog\/?p=402"},"modified":"2009-11-08T05:11:47","modified_gmt":"2009-11-08T05:11:47","slug":"my-opencalais-ruby-client-library","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/artificial-intelligence\/my-opencalais-ruby-client-library.php","title":{"rendered":"My OpenCalais Ruby client library"},"content":{"rendered":"<p>Reuters has a great attitude about openly sharing data and technology. About 8 years ago, I obtained a free license for their 1.2 gigabytes of semantically tagged news corpus text - very useful for automated training of my KBtextmaster system as well as other work.<\/p><p>Reuters has done it again, releasing free access to <a href=\"http:\/\/opencalais.com\/page\/gallery\" target=\"new\">OpenCalais semantic text processing web services<\/a>. If you sign up for a free access key (good for 20,000 web service calls a day), you can use my Ruby client library:<\/p><pre># Copyright Mark Watson 2008. 
All rights reserved.<br># Can be used under either the Apache 2 or the LGPL licenses.<br><br>require 'cgi'  # CGI.escape and CGI.unescapeHTML are used below<br>require 'simple_http'<br><br>require \"rexml\/document\"<br>include REXML<br><br>require 'pp'<br><br>MY_KEY = ENV[\"OPEN_CALAIS_KEY\"]<br>raise(StandardError,\"Set Open Calais login key in ENV: 'OPEN_CALAIS_KEY'\") if !MY_KEY<br><br>PARAMS = \"&amp;paramsXML=\" + CGI.escape('&lt;c:params xmlns:c=\"http:\/\/s.opencalais.com\/1\/pred\/\" xmlns:rdf=\"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#\"&gt;&lt;c:processingDirectives c:contentType=\"text\/txt\" c:outputFormat=\"xml\/rdf\"&gt;&lt;\/c:processingDirectives&gt;&lt;c:userDirectives c:allowDistribution=\"true\" c:allowSearch=\"true\" c:externalID=\"17cabs901\" c:submitter=\"ABC\"&gt;&lt;\/c:userDirectives&gt;&lt;c:externalMetadata&gt;&lt;\/c:externalMetadata&gt;&lt;\/c:params&gt;')<br><br>class OpenCalaisTaggedText<br>  def initialize text=\"\"<br>    data = \"licenseID=#{MY_KEY}&amp;content=\" + CGI.escape(text)<br>    http = SimpleHttp.new \"http:\/\/api.opencalais.com\/enlighten\/calais.asmx\/Enlighten\"<br>    @response = CGI.unescapeHTML(http.post(data+PARAMS))<br>  end<br>  # Parse the tag summary from the comment block that follows the terms of service comment<br>  def get_tags<br>    h = {}<br>    index1 = @response.index('terms of service.--&gt;')<br>    index1 = @response.index('&lt;!--', index1)<br>    index2 = @response.index('--&gt;', index1)<br>    txt = @response[index1+4..index2-1]<br>    lines = txt.split(\"\\n\")<br>    lines.each {|line|<br>      index = line.index(\":\")<br>      h[line[0...index]] = line[index+1..-1].split(',').collect {|x| x.strip} if index<br>    }<br>    h<br>  end<br>  def get_semantic_XML<br>    @response<br>  end<br>  def pp_semantic_XML<br>    Document.new(@response).write($stdout, 0)<br>  end<br>end<\/pre><p>Note that this code expects the environment variable OPEN_CALAIS_KEY to hold your OpenCalais access key; you can hardwire the key in the code instead if you prefer. 
Here is a sample use:<\/p><pre>tt = OpenCalaisTaggedText.new(\"President George Bush and Tony Blair spoke to Congress\")<br><br>pp \"tags:\", tt.get_tags<br>pp \"Semantic XML:\", tt.get_semantic_XML<br>puts \"Semantic XML pretty printed:\"<br>tt.pp_semantic_XML<\/pre><p>The tags print as:<\/p><pre>\"tags:\"<br>{\"Organization\"=&gt;[\"Congress\"],<br> \"Person\"=&gt;[\"George Bush\", \"Tony Blair\"],<br> \"Relations\"=&gt;[\"PersonPolitical\"]}<\/pre><p>OpenCalais looks like a great service. I plan to use it for a technology demo, merging in some of my own semantic text processing tools. I might also use it to train other machine-learning-based systems. Reuters will also offer a commercial version with guaranteed service.<\/p>","protected":false},"excerpt":{"rendered":"<p>Reuters has a great attitude about openly sharing data and technology. 
About 8 years ago, I obtained a free license for their 1.2 gigabytes of semantically tagged news corpus text - very useful for automated training of my KBtextmaster system &hellip; <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/artificial-intelligence\/my-opencalais-ruby-client-library.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[13],"tags":[],"class_list":["post-402","post","type-post","status-publish","format-standard","hentry","category-artificial-intelligence"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/402"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=402"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/402\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=402"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=402"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=402"}],"curies
":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}