Memex In Action: Watch DARPA Artificial Intelligence Search For Crime On The 'Dark Web'

Christopher White heads up the Memex team at DARPA

Of late, DARPA has shown a growing interest in open sourcing its technology, even if its most terrifying creations, like army robot wildcats designed to reach speeds of 50 mph, are understandably kept private. In a week's time, the wider world will be able to tinker with components of the military research body's in-development search tool for the dark web. The Memex technology, named after a mechanical mnemonic dreamt up just as the Second World War was coming to a close, has already been put to use by a number of law enforcement agencies, which are looking to counter crime taking place on networks like Tor, where Hidden Services are protected by privacy-enhancing, encrypted hosting, often for good, often for bad. In its first year, the focus at Memex has been on tracking human trafficking, but the project's scope stretches considerably wider.

It's likely that in the coming weeks many other law enforcement agencies will avail themselves of the search tools, which will land on DARPA's Open Catalog next Friday (though DARPA told FORBES the release could be pushed back to the following Monday). FORBES got an exclusive look at the front end of one of the search technologies created by one of the Memex teams, a group of self-proclaimed hackers called Hyperion Gray.

According to Alejandro Caceres, who heads up the Hyperion Gray team, a handful of his firm's tools will be available, including advanced web crawling and scraping technologies, with a dose of artificial intelligence and machine learning, with the goal of being able to retrieve virtually any content on the Internet in an automated way. Its solution to the problem of finding crime on the so-called dark web (a term anathema to Tor's supporters) is called SourcePin. It is trying to overcome one of the main barriers to modern search: crawlers can't click or scroll like humans do, and so often don't collect dynamic content that appears only after a user takes an action.

"Our approach to solving this problem is to build a system that sees the web more like a human user with a browser, and therefore actually behaves like a human user by using a browser to crawl the web, to the point of being able to scroll down a page, or even hover over an object on the page to reveal more content. We are teaching the system how to act like a human and handle virtually any web page scenario. Eventually our system will be like an army of robot interns that can find stuff for you on the web, while you do important things like watch cat videos," says Caceres.
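
To make the gap Caceres describes concrete, here is a minimal toy sketch in Python. It is not SourcePin's code and does not drive a real browser: `FakePage`, `static_crawl` and `interactive_crawl` are hypothetical names invented for illustration, and the "page" is simply a string that grows when a simulated user action is performed. It only shows why a crawler that reads the initial markup misses links that a scroll or hover would reveal.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every link found in the given HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


class FakePage:
    """Hypothetical stand-in for a dynamic page (an assumption, not a real API).

    Some markup is only appended after a simulated user action.
    """

    def __init__(self, initial_html, revealed_by_action):
        self.html = initial_html
        self._revealed = dict(revealed_by_action)

    def perform(self, action):
        # Mimics page scripts adding content once the user scrolls or hovers.
        self.html += self._revealed.pop(action, "")


def static_crawl(page):
    """A conventional crawler: reads the initial markup and nothing else."""
    return extract_links(page.html)


def interactive_crawl(page, actions=("scroll", "hover")):
    """A crawler that acts like a user first, then reads the resulting markup."""
    for action in actions:
        page.perform(action)
    return extract_links(page.html)
```

On a `FakePage` whose initial markup holds one link and whose scroll and hover actions each reveal one more, `static_crawl` returns only the first link while `interactive_crawl` returns all three. A real system of the kind described would replace `FakePage` with an actual browser session.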

The videos below show the SourcePin front end in action, bringing up a host of Tor-based .onion sites, with the latter video showing a newer version that has a tile-based user interface. Clicking on a link brings up more information on the site, which in this case is Euro Guns, described as "the number one gun dealer in onionland [another name for Tor]," where visitors can buy weapons and ammo in exchange for bitcoin.

There are a host of other big-name partners working on the Memex suite, including Carnegie Mellon, which was handed $3.6 million to develop machine learning algorithms that will analyze ads for sex services posted to websites, in the hope that officers will find it easier to search for advertisements related to investigations into sex trafficking and prostitution. National security technology provider Sotera Defense Solutions last week noted it had contributed a browser, DataWake, to the initiative. There are 17 partners in total, most of which have not yet been revealed, and the project will last for two more years.

The Memex team also wants to get a better understanding of what Hidden Services are running on Tor. Christopher White, who heads up the Memex team at DARPA, told FORBES that previous studies were based on biased data sets, whereas DARPA wants to create a standardised way of counting and exploring the different kinds of Tor-based sites, whether they're helping human rights activists or drug pushers. Early results have indicated there are as many as 30,000 Hidden Services running at any one time.

A Google killer too?

DARPA hasn't yet divulged which components outside of SourcePin will be going open source. White said a fuller toolset will be made available to the wider public in December and, when the multi-million dollar initiative is done, he expects many parts of the government and the wider business community to download and adapt what he called a "general purpose technology."
