This is a problem if we want AIs to be trustworthy. That's why Diffbot takes a different approach. It is building an AI that reads every page on the entire public web, in multiple languages, and extracts as many facts from those pages as it can.
Like GPT-3, Diffbot's system learns by vacuuming up vast amounts of human-written text found online. But instead of using that data to train a language model, Diffbot turns what it reads into a series of three-part factoids that relate one thing to another: subject, verb, object.
Pointed at my bio, for example, Diffbot learns that Will Douglas Heaven is a journalist; Will Douglas Heaven works at MIT Technology Review; MIT Technology Review is a media company; and so on. Each of these factoids gets joined up with billions of others in a sprawling, interconnected network of facts. This is known as a knowledge graph.
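To make the idea concrete, here is a minimal sketch of how subject-verb-object factoids can be joined into a small graph and then walked. It is an illustration only: the function names `build_graph` and `describe` and the tuple layout are assumptions made for this example, not Diffbot's actual data format or API.

```python
from collections import defaultdict

# Triples taken from the bio example in the text; the tuple layout is an assumption.
triples = [
    ("Will Douglas Heaven", "is a", "journalist"),
    ("Will Douglas Heaven", "works at", "MIT Technology Review"),
    ("MIT Technology Review", "is a", "media company"),
]

def build_graph(facts):
    """Index triples by subject so each entity maps to its outgoing (verb, object) edges."""
    graph = defaultdict(list)
    for subject, verb, obj in facts:
        graph[subject].append((verb, obj))
    return graph

def describe(graph, entity, depth=2):
    """Follow edges outward from an entity for up to `depth` hops and return the facts found."""
    found = []
    frontier = [entity]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for verb, obj in graph.get(node, []):
                found.append((node, verb, obj))
                next_frontier.append(obj)
        frontier = next_frontier
    return found

graph = build_graph(triples)
facts = describe(graph, "Will Douglas Heaven")
for subject, verb, obj in facts:
    print(subject, verb, obj)
```

Because the object of one factoid can be the subject of another, a two-hop walk from the author's name also reaches the fact that MIT Technology Review is a media company.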
Knowledge graphs are not new. They have been around for decades, and were a fundamental concept in early AI research. But constructing and maintaining knowledge graphs has typically been done by hand, which is hard. This also stopped Tim Berners-Lee from realizing what he called the semantic web, which would have included information for machines as well as humans, so that bots could book our flights, do our shopping, or give smarter answers to questions than search engines.
A few years ago, Google started using knowledge graphs too. Search for "Katy Perry" and you will get a box next to the main search results telling you that Katy Perry is an American singer-songwriter with music available on YouTube, Spotify, and Deezer. You can see at a glance that she is married to Orlando Bloom, she's 35 and worth $125 million, and so on. Instead of giving you a list of links to pages about Katy Perry, Google gives you a set of facts about her drawn from its knowledge graph.
But Google only does this for its most popular search terms. Diffbot wants to do it for everything. By fully automating the construction process, Diffbot has been able to build what may be the largest knowledge graph ever.
Alongside Google and Microsoft, it is one of only three US companies that crawl the entire public web. "It definitely makes sense to crawl the web," says Victoria Lin, a research scientist at Salesforce who works on natural-language processing and knowledge representation. "A lot of human effort can otherwise go into making a large knowledge base." Heiko Paulheim at the University of Mannheim in Germany agrees: "Automation is the only way to build large-scale knowledge graphs."
To collect its facts, Diffbot's AI reads the web as a human would, but much faster. Using a super-charged version of the Chrome browser, the AI views the raw pixels of a web page and uses image-recognition algorithms to categorize the page as one of 20 different types, including video, image, article, event, and discussion thread. It then identifies key elements on the page, such as headline, author, product description, or price, and uses NLP to extract facts from any text.
Every three-part factoid gets added to the knowledge graph. Diffbot extracts facts from pages written in any language, which means that it can answer queries about Katy Perry, say, using facts taken from articles in Chinese or Arabic even if they do not contain the term "Katy Perry."
Browsing the web like a human lets the AI see the same facts that we see. It also means it has had to learn to navigate the web like us. The AI must scroll down, switch between tabs, and click away pop-ups. "The AI has to play the web like a video game just to experience the pages," says Tung.
Diffbot crawls the web nonstop and rebuilds its knowledge graph every four to five days. According to Tung, the AI adds 100 million to 150 million entities each month as new people pop up online, companies are created, and products are launched. It uses more machine-learning algorithms to fuse new facts with old, creating new connections or overwriting out-of-date ones. Diffbot has to add new hardware to its data center as the knowledge graph grows.
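The fusing step can be pictured with a small sketch. Diffbot uses machine-learning algorithms for this; the code below swaps in a simple hand-written rule instead, purely to show the difference between adding a connection and overwriting an out-of-date fact. The `fuse` function, the `SINGLE_VALUED` set, the dated-fact layout, and the example person and companies are all hypothetical.

```python
from datetime import date

# Assumption: these relations hold one value at a time, so a newer fact replaces an older one.
SINGLE_VALUED = {"works at"}

def fuse(graph, new_facts):
    """Merge (subject, verb, object, seen_on) facts into `graph` in place and return it."""
    for subject, verb, obj, seen_on in new_facts:
        entries = graph.setdefault((subject, verb), {})
        if verb in SINGLE_VALUED:
            # Overwrite only if the incoming fact is at least as recent as what is stored.
            latest = max(entries.values(), default=date.min)
            if seen_on >= latest:
                entries.clear()
                entries[obj] = seen_on
        else:
            # Multi-valued relation: keep every object and remember when it was last seen.
            entries[obj] = max(seen_on, entries.get(obj, date.min))
    return graph

# Hypothetical facts from two successive crawls.
graph = {}
fuse(graph, [
    ("Alice Example", "works at", "Acme Corp", date(2020, 1, 1)),
    ("Alice Example", "is a", "engineer", date(2020, 1, 1)),
])
fuse(graph, [
    ("Alice Example", "works at", "Globex", date(2020, 9, 1)),
    ("Alice Example", "is a", "author", date(2020, 9, 1)),
])
print(graph)
```

After the second crawl the stale employer is replaced, while both occupations are kept as separate connections.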
Researchers can access Diffbot's knowledge graph for free. But Diffbot also has around 400 paying customers. The search engine DuckDuckGo uses it to generate its own Google-like boxes. Snapchat uses it to extract highlights from news pages. The popular wedding-planner app Zola uses it to help people make wedding lists, pulling in images and prices. NASDAQ, which provides information about the stock market, uses it for financial research.
Adidas and Nike even use it to search the web for counterfeit shoes. A search engine will return a long list of sites that mention Nike trainers. But Diffbot lets these companies look for sites that are actually selling their shoes, rather than just talking about them.
For now, these companies must interact with Diffbot using code. But Tung plans to add a natural-language interface. Ultimately, he wants to build what he calls a "universal factoid question answering system": an AI that could answer almost anything you asked it, with sources to back up its response.
Tung and Lin agree that this kind of AI cannot be built with language models alone. But better yet would be to combine the technologies, using a language model like GPT-3 to craft a human-like front end for a know-it-all bot.
Still, even an AI that has its facts straight is not necessarily smart. "We're not trying to define what intelligence is, or anything like that," says Tung. "We're just trying to build something useful."