Ten million sequences of COVID-19's genomic code have now been organized into a phylogenetic tree in the UC Santa Cruz SARS-CoV-2 Browser, making it the largest tree of genomic sequences of a single species ever assembled. The accomplishment is impressive both as a feat of computer engineering, given the massive amount of data processed, and as a testament to the dedication and coordination of the researchers involved.
"It is an astounding thing that has happened there," said Clay Fischer, Project Manager for the UCSC Genome Browser.
The researchers assemble all of these sequences into a phylogenetic tree that shows the evolutionary history of the virus, with different branches representing the lineages that have emerged through mutation over the course of the pandemic. The tree is powered by a software tool called UShER, which was developed at the UC Santa Cruz Genomics Institute and is hosted on the UCSC Genome Browser website.
Many hands from around the world have brought the Genomics Institute the 10 million sequences that live on the UShER tree. Clinicians worldwide have administered tests and sent them off to local labs, which then sent the samples on for sequencing. Once sequenced, the samples become digital files that are uploaded to genomic databases such as GISAID, GenBank, or the COG-UK database.
Angie Hinrichs, a senior software architect at the UCSC Genome Browser and self-described "data wrangler," built a pipeline to pull these sequences into the UShER tree automatically. The process was complicated, however, because some databases, like GISAID, had restrictions that required sequences to be downloaded manually.
"For the first half of 2021, I would download them every night before bed," Hinrichs said.
Hinrichs has worked at the UCSC Genome Browser for twenty years. She keeps a low profile, usually preferring to work behind the scenes rather than in the spotlight. But according to her colleagues, her work curating the tree of COVID-19 genomes and coordinating with the CDC and other health organizations has been of great importance to the pandemic relief effort. She is part of the Pango team of volunteers who have been monitoring virus sequences to identify new variants. She also handles the daily maintenance of updating and annotating the UShER tree, which recently became the default software used by the Pangolin tool, a system used by health officials worldwide to track the spread of variants in their communities.
UShER was created early in the pandemic, when researchers at the UC Santa Cruz Genomics Institute recognized that tracing the evolution of a quickly evolving global pathogen like COVID-19 would require a phylogenetic tree able to handle an unprecedented amount of data. So the Genomics Institute's scientific director, David Haussler, gathered a team to focus on pathogen genomics, led by Assistant Professor of Biomolecular Engineering Russell Corbett-Detig and including then-postdoc Yatish Turakhia. Turakhia originally wrote the UShER software, which can rapidly add a new genome sequence to a very large tree of genome sequences.
Making a tree that can handle so much data is an incredible feat of computer engineering that has required herculean efforts from a number of researchers. Before the current pandemic, phylogenetic trees for comparing viral samples were relatively common, but they were built from comparatively small numbers of sequences.
As unprecedented numbers of SARS-CoV-2 sequences became available, the standard tree-building tools simply could not keep up, and researchers often struggled to make sure their analysis kept pace with the number of samples they received. The UShER software and the sustained effort of the team made it possible to grow the tree apace with the pandemic's flood of sequences.
Hinrichs says that her two decades of experience working with the massive amounts of data stored on the UCSC Genome Browser helped prepare her to work with the COVID-19 lineages on UShER.
"This data coordination is what makes our resources really powerful," Hinrichs said. "We have really great resources here, and really great people."
One of those great resources is UCSC's computing hardware, maintained by Jorge Garcia, Haifang Telc, and Erich Weiler. Hinrichs explained that having that computing power has been essential for this project.
"Big data is our thing, so we were ready to jump on this," she said.
At the beginning of the pandemic, the UCSC pathogen genomics team made guesses as to how many COVID-19 sequences the tree would need to be able to handle. Only Corbett-Detig thought it would reach a million; no one anticipated reaching 10 million.
"I still get surprised at how far we've come," Turakhia said. "The unimaginable amount of data we were able to handle and the fact that we are able to make sense of it quickly is mind-boggling as a computational genomicist."
As the tree has grown, it has required constant attention and updates. Cheng Ye, an undergraduate in Turakhia's new lab at UC San Diego, figured out a way to add new sequences faster once the tree had already grown to contain millions of sequences. Ye also helped develop a tool called MatOptimize, which moves sequences around on the tree when more data makes it apparent that their original placement was suboptimal.
Accumulating reliable data has been instrumental to better understanding what we are up against in the fight against COVID-19 and all its variants. While little was known about this virus at the start of the pandemic, the tree-building tools developed at UC Santa Cruz have helped to put the history of the virus in some perspective and to predict its future, and researchers across campus have leveraged their expertise to aid in the relief efforts. The progress has been astounding, but for the researchers on the browser team, the urgency of their mission and the sheer amount of data that needs to be curated have also been overwhelming at times. Fischer acknowledges that this level of dedication comes at a cost.
"It has been two years of blood, sweat, and tears," he said.
See the original post:
The team behind a tree of 10 million Covid sequences - University of California, Santa Cruz