{"id":209359,"date":"2017-02-20T00:56:47","date_gmt":"2017-02-20T05:56:47","guid":{"rendered":"http:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/coders-to-the-rescue-for-nasas-earth-science-data-grist.php"},"modified":"2017-02-20T00:56:47","modified_gmt":"2017-02-20T05:56:47","slug":"coders-to-the-rescue-for-nasas-earth-science-data-grist","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/nasa\/coders-to-the-rescue-for-nasas-earth-science-data-grist.php","title":{"rendered":"Coders to the rescue for NASA&#8217;s Earth science data &#8211; Grist"},"content":{"rendered":"<p>This story was originally published by Wired and is reproduced here as part of the Climate Desk collaboration.<\/p>\n<p>On Feb. 11, the white stone buildings on UC Berkeley&#8217;s campus radiated with unfiltered sunshine. The sky was blue, the campanile was chiming. But instead of enjoying the beautiful day, 200 adults had willingly sardined themselves into a fluorescent-lit room in the bowels of Doe Library to rescue federal climate data.<\/p>\n<p>Like similar groups across the country, in more than 20 cities, they believe that the Trump administration might want to disappear this data down a memory hole. So these hackers, scientists, and students are collecting it and saving it on servers outside the government.<\/p>\n<p>But now they&#8217;re going even further. Groups like DataRefuge and the Environmental Data and Governance Initiative, which organized the Berkeley hackathon to collect data from NASA&#8217;s Earth science programs and the Department of Energy, are doing more than archiving. Diehard coders are building robust systems to monitor ongoing changes to government websites. And they&#8217;re keeping track of what&#8217;s been removed, to learn exactly when the pruning began.<\/p>\n<p>The data collection is methodical, mostly. 
About half the group immediately sets web crawlers on easily copied government pages, sending their text to the Internet Archive, a digital library made up of hundreds of billions of snapshots of webpages. They tag more data-intensive projects (pages with lots of links, databases, and interactive graphics) for the other group. Called baggers, these coders write custom scripts to scrape complicated data sets from the sprawling, patched-together federal websites.<\/p>\n<p>It&#8217;s not easy. &#8220;All these systems were written piecemeal over the course of 30 years. There&#8217;s no coherent philosophy to providing data on these websites,&#8221; says Daniel Roesler, chief technology officer at UtilityAPI and one of the volunteer guides for the Berkeley bagger group.<\/p>\n<p>One coder who goes by Tek ran into a wall trying to download multi-satellite precipitation data from NASA&#8217;s Goddard Space Flight Center. Starting in August, access to Goddard Earth Science Data required a login. But with a bit of totally legal digging around the site (DataRefuge prohibits outright hacking), Tek found a buried link to the old FTP server. He clicked and started downloading. By the end of the day he had data for all of 2016 and some of 2015. It would take at least another 24 hours to finish.<\/p>\n<p>The non-coders hit dead ends too. Throughout the morning they racked up 404 &#8220;Page not found&#8221; errors across NASA&#8217;s Earth Observing System website. And they more than once ran across empty databases, like the Global Change Data Center&#8217;s reports archive and one of NASA&#8217;s atmospheric CO2 datasets.<\/p>\n<p>And this is where the real problem lies. They don&#8217;t know when or why this data disappeared from the web (or if anyone backed it up first). Scientists who understand it better will have to go back and take a look. 
But in the meantime, DataRefuge and EDGI understand that they need to be monitoring those changes and deletions. That&#8217;s more work than a human could do.<\/p>\n<p>So they&#8217;re building software that can do it automatically.<\/p>\n<p>Later that afternoon, two dozen or so of the most advanced software builders gathered around whiteboards, sketching out tools they&#8217;ll need. They worked out filters to separate mundane updates from major shake-ups, and explored blockchain-like systems to build auditable ledgers of alterations. Basically, it&#8217;s an issue of what engineers call version control: how do you know if something has changed? How do you know if you have the latest? How do you keep track of the old stuff?<\/p>\n<p>There wasn&#8217;t enough time for anyone to start actually writing code, but a handful of volunteers signed on to build out tools. That&#8217;s where DataRefuge and EDGI organizers really envision their movement going: a vast decentralized network spanning all 50 states and Canada. Some volunteers can code tracking software from home. And others can simply archive a little bit every day.<\/p>\n<p>By the end of the day, the group had collectively loaded 8,404 NASA and DOE webpages onto the Internet Archive, effectively covering the entirety of NASA&#8217;s Earth science efforts. They&#8217;d also built in backdoors to download 25 gigabytes from 101 public datasets, and were expecting even more to come in as scripts on some of the larger datasets (like Tek&#8217;s) finished running. But even as they celebrated over pints of beer at a pub on Euclid Street, the mood was somber.<\/p>\n<p>There was still so much work to do. &#8220;Climate change data is just the tip of the iceberg,&#8221; says Eric Kansa, an anthropologist who manages archaeological data archiving for the nonprofit group Open Context. &#8220;There are a huge number of other datasets being threatened with cultural, historical, sociological information.&#8221; 
A panicked friend at the National Park Service had tipped him off to a huge data portal that contains everything from park visitation stats to GIS boundaries to inventories of species. While he sat at the bar, his computer ran scripts to pull out a list of everything in the portal. When it&#8217;s done, he&#8217;ll start working his way through each quirky dataset.<\/p>\n<p>Go here to read the rest:<\/p>\n<p><a target=\"_blank\" href=\"http:\/\/grist.org\/article\/coders-to-the-rescue-for-nasas-earth-science-data\/\" title=\"Coders to the rescue for NASA's Earth science data - Grist\">Coders to the rescue for NASA's Earth science data - Grist<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This story was originally published by Wired and is reproduced here as part of the Climate Desk collaboration. On Feb <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/nasa\/coders-to-the-rescue-for-nasas-earth-science-data-grist.php\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[20],"tags":[],"class_list":["post-209359","post","type-post","status-publish","format-standard","hentry","category-nasa"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/209359"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=209359"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/209359\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=209359"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=209359"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=209359"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}