Photo: Andrew Brookes/Corbis
In 2005, next-generation sequencing began to change the field of genetics research. Obtaining a person's entire genome became fast and relatively cheap. Databases of genetic information were growing by the terabyte, and doctors and researchers were in desperate need of a way to efficiently sift through the information for the cause of a particular disorder or for clues to how patients might respond to treatment.
Companies have sprung up over the past five years that are vying to produce the first DNA search engine. All of them have different tactics (some even have their own proprietary databases of genetic information), but most are working to link enough genetic databases so that users can quickly identify a huge variety of mutations. Most companies also craft search algorithms to supplement the genetic information with relevant biomedical literature. But as in the days of the early Web, before Google reigned supreme, a single company has yet to emerge as the clear winner.
"Making a functional search engine is a classic big-data problem," says Michael Gonzalez, the vice president of bioinformatics at one such company, ViaGenetics, which was expected to relaunch its platform in March. Before doctors or researchers can use the data, genomic data must be organized so that humans can read and search it. The first step toward that is to put it in a standard form called the variant call format, or VCF. As raw data, a person's complete sequenced genome would take up about 100 gigabytes, so a database that adds the genomes of even 10 patients per day would quickly get out of hand. But VCF files are more compact, requiring only a few hundred megabytes per genome, which helps researchers find the specific variants they want to search in a fraction of the time. Unlike a fully sequenced genome, VCF files point only to where a person's genetic data deviates from the standard: the genome originally compiled by the Human Genome Project in 2001.
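To make the idea of a variant file concrete, here is a minimal Python sketch of reading VCF-style records and looking one up by position. It is an illustration only, not how ViaGenetics or any other company mentioned here implements search. The column order (CHROM, POS, ID, REF, ALT, ...) follows the public VCF layout, but the sample records are invented, and the names `Variant`, `parse_vcf`, and `index_by_position` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Variant(NamedTuple):
    chrom: str        # chromosome name
    pos: int          # 1-based position on the reference genome
    ref: str          # allele found in the reference genome
    alts: List[str]   # alternate allele(s) observed in the sample


def parse_vcf(text: str) -> List[Variant]:
    """Parse the leading columns of VCF data lines, skipping header lines."""
    variants = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # "##" meta-information and the "#CHROM" column header
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        variants.append(Variant(chrom, int(pos), ref, alt.split(",")))
    return variants


def index_by_position(variants: List[Variant]) -> Dict[Tuple[str, int], Variant]:
    """Build a lookup table keyed by (chromosome, position)."""
    return {(v.chrom, v.pos): v for v in variants}


# Invented example data: two variants, not taken from any real genome.
SAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t12345\t.\tA\tG\t50\tPASS\t.",
    "2\t67890\t.\tC\tT,G\t99\tPASS\t.",
])

index = index_by_position(parse_vcf(SAMPLE_VCF))
hit = index.get(("2", 67890))   # the variant recorded at this position, if any
miss = index.get(("3", 1))      # None: the sample matches the reference here
print(hit, miss)
```

Because the file records only differences from the reference, a position that is absent from the index simply means the sample matches the standard genome there, which is why the lookup table stays small compared with a full sequence.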
With VCF, sifting the genomes themselves for pinpoint mutations isn't the challenge for search engine companies. Most of these companies are allocating their resources toward efforts to seamlessly compile supplementary information about a specific mutation from other databases across the Web, such as the biomedical research archive PubMed or various troves of electronic medical records. Many of these tools have finely tuned algorithms that prioritize the results by credibility or relevance. "You want to be able to pull together the information known about a mutation in that position [of the genome] and quickly make an assessment," says David Mittelman, the chief scientific officer for Tute Genomics, based in Provo, Utah, another company designing a genetic-search engine.
In an effort to expand the information that can be attached to a genome under examination, ViaGenetics, based in Miami Beach, Fla., is making its newly updated platform useful for researchers who want to collaborate across institutions. "With ViaGenetics' tools, researchers can make their data available to other users, so other people can come across these projects, request access, and form a collaboration," Gonzalez says. "It helps people connect the dots between different researchers and institutions." This is especially helpful for smaller labs that may not have very extensive genome databases or for researchers from different universities working to decode the same mutation.
Although the genomic-search industry is now focused on serving scientists, that might not always be the case. Mittelman envisions that Tute Genomics could eventually serve consumers directly. "People are already demanding information about their genomes just to understand themselves better," Mittelman says, but most companies don't yet consider the average person to be their primary customer. In order to make that shift, the tool will have to be even more intuitive and user-friendly. "Fire-hosing someone with data that's not easy to interpret, or using terminology that's not standardized, has the potential to confuse people," he says. Privacy is also a major concern for the average user; the information that Tute users upload isn't stored permanently, Mittelman says, but users will need extra reassurance if the platform becomes available to the lay public.
And a further evolution of the industry is in the offing. Both ViaGenetics and Tute are hoping to be able to run the entire process in-house, from the initial DNA sequencing to the presentation of final searchable results to users. "The market for analyzing and interpreting genomic data is very fragmented, like the computer industry in the 1990s, where you had to go to separate providers to buy a video card or a motherboard and then try to put it together," Mittelman says. Soon this field will consolidate, as the computer industry did.
This article originally appeared in print as "A Google for DNA."
Read more:
The Race to Build a Search Engine for Your DNA