Monthly Archives: January 2013

Is DNA the Future of Data Storage?

Posted: January 25, 2013 at 8:50 am

One night a few years ago, two biologists sat in a bar in Hamburg, discussing DNA. Ewan Birney, the associate director of the European Bioinformatics Institute, and Nick Goldman, a research scientist there, were wondering how to handle the tsunami of data flooding the institute, whose job it is to maintain databases of DNA sequences, protein structures, and other biological information that scientists turn up in their research. Those databases are growing exponentially, thanks mostly to dropping costs and increased automation. The maintenance of all this data on hard drives was pressing their budget to the breaking point.

Being genomicists, they joked that DNA, which is incredibly compact, sturdy, and of course has a rather lengthy history of storing data, would be a better way to go. Joking, however, gave way to fevered napkin-scribbling, and soon, recalls Goldman, "We had to order another beer, and call for more napkins to write on."

Three years later, the results of that bar stool inspiration have been published in Nature, in a paper in which Birney, Goldman and their collaborators report using DNA to store a complete set of Shakespeare's sonnets, a PDF of the first paper to describe DNA's double helix structure, a 26-second mp3 clip from Martin Luther King, Jr.'s "I Have a Dream" speech, a text file of a compression algorithm, and a JPEG photograph of the institute. You may not be storing your personal data on DNA anytime soon: the process is time-consuming and expensive, and there's the small matter of needing a DNA sequencer to open the files. But as the costs of making and sequencing DNA continue to plunge and as computer engineering approaches the limits of just how densely information can be encoded on silicon, such biological data storage may be just what's needed for institutes and other organizations with massive archival needs.

To encode files in DNA, Birney and Goldman started by converting text, image, or audio data into binary code. Then, in several steps using software that Goldman wrote, they converted that into a code of A, T, G, and C, the letters that stand for the four DNA bases. Working from that string of letters, they drew up the blueprints for thousands of pieces of DNA, each containing a snippet of a file, and sent their designs to Agilent Technologies, which manufactures custom DNA for biologists. Agilent sent back the completed DNA fragments: "just a smidge of white dust in the bottom of a plastic tube," Goldman recalls. To open the files, the team used a standard DNA sequencer, a process that took about 2 weeks. They then used Goldman's software to reassemble the sequenced DNA into coherent, readable files. With the exception of two small gaps in the DNA, the sonnets, photo, speech, PDF, and text file re-emerged from the white dust almost completely unscathed. After the scientists performed a little repair work, all of the information, about 739 KB worth, was retrieved with 100% accuracy.
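
The fragment-and-reassemble step described above can be illustrated with a minimal Python sketch. It is not Goldman's software, whose details the article does not give. The fragment length, the numeric index attached to each snippet, and the function names split_into_fragments and reassemble are all assumptions made for illustration.

```python
import random
from typing import List, Tuple


def split_into_fragments(sequence: str, size: int = 100) -> List[Tuple[int, str]]:
    """Cut a long string of bases into (index, snippet) pairs.

    The index records where each snippet belongs, so the pieces can be
    read back in any order. The default size of 100 bases is arbitrary.
    """
    return [(i // size, sequence[i:i + size]) for i in range(0, len(sequence), size)]


def reassemble(fragments: List[Tuple[int, str]]) -> str:
    """Sort the fragments by index and join them back into one string."""
    return "".join(snippet for _, snippet in sorted(fragments))


if __name__ == "__main__":
    original = "ACGT" * 60 + "TTGACA"
    pieces = split_into_fragments(original, size=50)
    random.shuffle(pieces)  # stand-in for reads coming back in no set order
    assert reassemble(pieces) == original
    print(len(pieces), "fragments reassembled")
```

Carrying an index with every snippet is one simple way to make reassembly independent of read order; a real system would also need redundancy to cope with missing or damaged fragments, such as the two small gaps the team had to repair.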

The fidelity is impressive, and DNA, when kept in a cold, dry, dark place, can stay intact for thousands of years. But how long would you have to want to store something for this process to be cheaper than using archival magnetic tape, which needs to be replaced every 5 years but is still the current gold standard, thanks to its low power demands compared to hard drives or other storage technologies? Birney and Goldman calculate that if you wanted to put a file in storage today and have it last for at least 600 years, DNA would be cheaper than re-recording the data to fresh magnetic tape every half-decade or so, a process that would have to be repeated 120 times over the six-century span.
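
The arithmetic behind that comparison can be sketched in a few lines of Python. The 5-year refresh interval and the 600-year span come from the article; the per-rewrite tape cost and the one-time DNA cost are hypothetical placeholders, since the article gives no figures for them.

```python
def tape_rewrites(span_years: int, refresh_years: int = 5) -> int:
    """How many times data must be re-recorded to fresh tape over the span."""
    return span_years // refresh_years


def dna_is_cheaper(span_years: int, cost_per_tape_rewrite: float,
                   dna_one_time_cost: float, refresh_years: int = 5) -> bool:
    """Compare a one-time DNA cost with the cumulative cost of tape rewrites.

    Both cost arguments are hypothetical inputs, and upkeep costs for
    either medium are ignored.
    """
    tape_total = tape_rewrites(span_years, refresh_years) * cost_per_tape_rewrite
    return dna_one_time_cost < tape_total


if __name__ == "__main__":
    # 600 years at one rewrite every 5 years gives the article's 120 rewrites.
    print(tape_rewrites(600))
    # Placeholder costs, chosen only to exercise the comparison.
    print(dna_is_cheaper(600, cost_per_tape_rewrite=100.0, dna_one_time_cost=10_000.0))
```

The break-even point depends entirely on the two cost inputs, which is why the longer the storage span, the more attractive a write-once medium becomes.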

Goldman speculates that if the price of making and sequencing DNA continues to fall at current rates, commercial services that store data in DNA might spring up around 50 years from now. "You would email documents and photographs and stuff that were valuable to you and your family [to the DNA storage company], and maybe a day later or a week later, they would ship you back a little bit of DNA," says Goldman. "You could stick it in the fridge or bury it in the garden or they would store it. And they can guarantee it will be there a hundred thousand years later."

Birney and Goldman are not the only genomicists who have realized the data-storage potential of DNA. In September 2012, genomicists George Church, Yuan Gao, and Sriram Kosuri published a short description of a similar system in Science. The Nature team stored slightly more data, and Goldman avoided one of the sources of error in the Science paper (strings of repeated bases that DNA sequencers have trouble handling) by adjusting the way his software converts the information into A, T, G, and C. But on the whole, the ideas are similar, and represent a big step forward from earlier, smaller studies.
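
One way to guarantee that no base is ever repeated is to write the data in base 3 and let each digit choose among the three bases that differ from the previous one. The Python sketch below illustrates that idea only. The article does not describe how Goldman's software actually does it, and the fixed six base-3 digits per byte, the starting "previous" base of A, and all function names are assumptions.

```python
from typing import List

BASES = "ACGT"


def bytes_to_trits(data: bytes) -> List[int]:
    """Write each byte as six base-3 digits (3**6 = 729 covers 0..255)."""
    trits: List[int] = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: List[int]) -> bytes:
    """Inverse of bytes_to_trits."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def trits_to_dna(trits: List[int], previous: str = "A") -> str:
    """Each digit picks one of the three bases that differ from the last base."""
    out = []
    for t in trits:
        choices = [b for b in BASES if b != previous]
        previous = choices[t]
        out.append(previous)
    return "".join(out)


def dna_to_trits(dna: str, previous: str = "A") -> List[int]:
    """Recover the digits by checking which allowed base was chosen."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    return trits


if __name__ == "__main__":
    dna = trits_to_dna(bytes_to_trits(b"sonnet"))
    assert all(a != b for a, b in zip(dna, dna[1:]))  # no repeated bases
    assert trits_to_bytes(dna_to_trits(dna)) == b"sonnet"
    print(dna)
```

The price of this guarantee is density: each base carries one base-3 digit rather than two full bits.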

The rest is here:
Is DNA the Future of Data Storage?

Posted in DNA | Comments Off on Is DNA the Future of Data Storage?

Four-stranded DNA discovered

Posted: at 8:50 am

Sixty years after scientists described the chemical code of life, an interweaving double helix called DNA, researchers have found that four-stranded DNA is also lurking in human cells.

The odd structures are called G-quadruplexes because they form in regions of deoxyribonucleic acid (DNA) that are full of guanine, one of the DNA molecule's four building blocks, with the others being adenine, cytosine and thymine. The structure comprises four guanines held together by a type of hydrogen bonding to form a squarelike shape. (The DNA molecule is itself a double strand held together by these building blocks and wrapped together like a helix.)

The new visualization of the G-quadruplex is detailed this week in the journal Nature Chemistry.

"I think this paper is important in showing directly the existence of this structure in vivo in the human genome, but it is not completely unexpected," said Hans-Joachim Lipps, of the University of Witten in Germany, who was not involved in the study.

Scientists had shown in the past that such quadruplex DNA could form in test tubes and had even been found in the cells of ciliated protozoa, or single-celled organisms with hairlike appendages. Also there were hints of its existence in human cells, though no direct proof, Lipps said.

But scientists still didn't have concrete evidence for its existence in the human genome. In the new study, researchers, including chemist Shankar Balasubramanian, of the University of Cambridge and Cambridge Research Institute, crafted antibody proteins specifically for this type of DNA. The proteins were marked with a fluorescent chemical, so when they hooked up to areas in the human genome packed with G-quadruplexes, they lit up.

Next, they incubated the antibodies with human cells in the lab, finding that these structures tended to occur in genes of cells that were rapidly dividing, a telltale feature of cancer cells. They also found a spike in quadruplexes during the S phase of the cell cycle, the phase when DNA replicates just before the cell divides.

As such, the researchers think the four-stranded DNA could be a target for personalized medicine in the future. If they could block these odd ducks, perhaps they could stop the rapid cell division of cancer cells.

Excerpt from:
Four-stranded DNA discovered

Posted in DNA | Comments Off on Four-stranded DNA discovered

How Shakespeare and MLK Got Encoded in DNA

Posted: at 8:50 am

Here's how the process, outlined yesterday on the website of leading scientific journal Nature, works: The scientists took these writers' famous words, encoded them using a cipher that corresponds with DNA's four bases (A, C, G, and T), synthesized strands of DNA according to that code, and chilled the resulting samples in dark, dry conditions, where they should last for millennia. Goldman tells NPR's Adam Cole that one of our generation's biggest problems, organizing and storing the deluge of data we face every day, could be solved using DNA:

"The data we're being asked to be guardians of is growing exponentially. But our budgets are not growing exponentially ... We realized that DNA itself is a really efficient way of storing information."

This process shrinks information much more than existing formats like hard drives or magnetic tape. Or paper-bound books. Let's consider that a physical copy of Shakespeare's Sonnets from the Folger Shakespeare Library weighs 7 ounces. Project Gutenberg's digital version of the poems takes up 95 KB on your Kindle. That might seem pretty compact, but physical books and e-books are highly inefficient storage methods when contrasted with genetic encoding. Shall we compare these to a strand of DNA? Goldman's team estimated that they could fit the entire database of pioneering particle physics lab CERN (which holds approximately 90 petabytes of information) onto just 41 grams of DNA. In comparison, every sonnet Shakespeare ever wrote could fit on a mere speck of genetic material.
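
Those figures imply a density of about 2.2 petabytes per gram (90 divided by 41), which the short Python sketch below works through. It assumes decimal units (1 PB = 10**15 bytes, 1 KB = 1,000 bytes), which the article does not specify, and it ignores any redundancy a real encoding would add. The function names are illustrative.

```python
PETABYTE = 10**15  # decimal petabyte, an assumption about the article's units
KILOBYTE = 10**3   # decimal kilobyte, likewise an assumption


def bytes_per_gram(total_bytes: float, grams: float) -> float:
    """Storage density implied by fitting total_bytes into the given mass."""
    return total_bytes / grams


def grams_needed(num_bytes: float, density: float) -> float:
    """Mass of DNA required to hold num_bytes at the given density."""
    return num_bytes / density


if __name__ == "__main__":
    density = bytes_per_gram(90 * PETABYTE, 41)
    print(f"{density / PETABYTE:.2f} PB per gram")
    # The 95 KB e-book of the sonnets, at the same density:
    print(f"{grams_needed(95 * KILOBYTE, density):.2e} g")
```

At that density the 95 KB file needs a mass many orders of magnitude below a microgram, which is the "mere speck" the article refers to.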

These findings aren't necessarily new: Harvard geneticist George Church was able to encode a book in DNA last summer. And some adventurous poets are even using DNA to encode new original works. In Canadian poet Christian Bök's four-line Xenotext, the stanza "Any style of life / is prim" is encoded in DNA that always spits out proteins reading "The faery is rosy / of glow." But even Church acknowledges the strides made by Goldman and his colleagues. "I think it's a really important milestone," he told Nature's Ed Yong. Currently, storing information in DNA is expensive. It costs about $12,400 to store every megabyte, and $220 to extract the information in readable form. But the expense is going down every year. "In 10 years, it's probably going to be about 100 times cheaper," Goldman told The Wall Street Journal's Gautam Naik. "At that time, it probably becomes economically viable."
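
To put those prices in perspective, the Python sketch below applies them to the roughly 739 KB the team stored. It rests on two assumptions: that the $220 reading cost is also per megabyte, which the article leaves ambiguous, and that 739 KB is treated as 0.739 MB. The function names are illustrative.

```python
WRITE_USD_PER_MB = 12_400.0  # synthesis cost quoted in the article
READ_USD_PER_MB = 220.0      # assumed here to be per megabyte as well


def write_cost(megabytes: float, usd_per_mb: float = WRITE_USD_PER_MB) -> float:
    """Cost of synthesizing DNA for the given amount of data."""
    return megabytes * usd_per_mb


def read_cost(megabytes: float, usd_per_mb: float = READ_USD_PER_MB) -> float:
    """Cost of sequencing the data back into readable form."""
    return megabytes * usd_per_mb


def projected_rate(usd_per_mb: float, factor: float = 100.0) -> float:
    """Rate after the '100 times cheaper' drop Goldman anticipates."""
    return usd_per_mb / factor


if __name__ == "__main__":
    size_mb = 0.739
    print(f"write today: ${write_cost(size_mb):,.0f}")
    print(f"read today:  ${read_cost(size_mb):,.0f}")
    print(f"write rate after a 100x drop: ${projected_rate(WRITE_USD_PER_MB):,.0f} per MB")
```

On these assumptions, writing the experiment's files would cost a little over $9,000 today, and a hundredfold drop would bring the writing rate to $124 per megabyte.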

Read more from the original source:
How Shakespeare and MLK Got Encoded in DNA

Posted in DNA | Comments Off on How Shakespeare and MLK Got Encoded in DNA

New Genetic Twist: 4-Stranded DNA Lurks in Human Cells

Posted: at 8:50 am

Sixty years after scientists described the chemical code of life, an interweaving double helix called DNA, researchers have found that four-stranded DNA is also lurking in human cells.

The odd structures are called G-quadruplexes because they form in regions of deoxyribonucleic acid (DNA) that are full of guanine, one of the DNA molecule's four building blocks, with the others being adenine, cytosine and thymine. The structure comprises four guanines held together by a type of hydrogen bonding to form a squarelike shape. (The DNA molecule is itself a double strand held together by these building blocks and wrapped together like a helix.)

The new visualization of the G-quadruplex is detailed this week in the journal Nature Chemistry.

"I think this paper is important in showing directly the existence of this structure in vivo in the human genome, but it is not completely unexpected," said Hans-Joachim Lipps, of the University of Witten in Germany, who was not involved in the study.

Scientists had shown in the past that such quadruplex DNA could form in test tubes and had even been found in the cells of ciliated protozoa, or single-celled organisms with hairlike appendages. Also there were hints of its existence in human cells, though no direct proof, Lipps said.

But scientists still didn't have concrete evidence for its existence in the human genome. In the new study, researchers, including chemist Shankar Balasubramanian, of the University of Cambridge and Cambridge Research Institute, crafted antibody proteins specifically for this type of DNA. The proteins were marked with a fluorescent chemical, so when they hooked up to areas in the human genome packed with G-quadruplexes, they lit up.

Next, they incubated the antibodies with human cells in the lab, finding that these structures tended to occur in genes of cells that were rapidly dividing, a telltale feature of cancer cells. They also found a spike in quadruplexes during the S phase of the cell cycle, the phase when DNA replicates just before the cell divides.

As such, the researchers think the four-stranded DNA could be a target for personalized medicine in the future. If they could block these odd ducks, perhaps they could stop the rapid cell division of cancer cells.

"We are seeing links between trapping the quadruplexes with molecules and the ability to stop cells dividing, which is hugely exciting," Balasubramanian said in a statement.

The finding "is certainly a technical (not scientific) breakthrough in designing antibodies sensitive enough to demonstrate this structure in vivo in the human genome," Lipps wrote.

See the original post:
New Genetic Twist: 4-Stranded DNA Lurks in Human Cells

Posted in DNA | Comments Off on New Genetic Twist: 4-Stranded DNA Lurks in Human Cells

DNA 'perfect for digital storage'

Posted: at 8:50 am

23 January 2013, last updated at 13:03 ET. By Jonathan Amos, Science correspondent, BBC News

Scientists have given another eloquent demonstration of how DNA could be used to archive digital data.

The UK team encoded a scholarly paper, a photo, Shakespeare's sonnets and a portion of Martin Luther King's I Have A Dream speech in artificially produced segments of the "life molecule".

The information was then read back out with 100% accuracy.

It is possible to store huge volumes of data in DNA for thousands of years, the researchers write in Nature magazine.

They acknowledge that the costs involved in synthesizing the molecule in the lab make this type of information storage "breathtakingly expensive" at the moment, but argue that newer, faster technologies will soon make it much more affordable, especially for long-term archiving.

"One of the great properties of DNA is that you don't need any electricity to store it," explained team-member Dr Ewan Birney from the European Bioinformatics Institute (EBI) at Hinxton, near Cambridge.

"If you keep it cold, dry and dark - DNA lasts for a very long time. We know that because we routinely sequence woolly mammoth DNA that is kept by chance in those sorts of conditions." Mammoth remains are many thousands of years old.

The group cites government and historical records as examples of data that could benefit from the molecular storage option.

Much of this information is not required every day but still needs to be kept. Once encoded in DNA, it could be put away safely in a vault until it was needed.

Read the rest here:
DNA 'perfect for digital storage'

Posted in DNA | Comments Off on DNA 'perfect for digital storage'

Data Storage in DNA Becomes a Reality

Posted: at 8:49 am

By Breanna Draxler | January 24, 2013 3:18 pm

DNA is the building block of life, but in the future it may also be the standard repository for encyclopedias, music and other digital data. Scientists announced yesterday that they successfully converted 739 kilobytes of hard drive data into genetic code and then retrieved the content with 100 percent accuracy.

The researchers began with the computer files from some notable cultural highlights: an audio recording of MLK Jr.'s 1963 "I Have a Dream" speech, all 154 of Shakespeare's sonnets, and, appropriately, a copy of Watson and Crick's original research paper describing DNA's double helix structure. On a hard drive, these files are stored as a series of zeros and ones. The researchers worked out a system to translate the binary code into one with four characters instead: A, C, G and T. They used this genetic code to synthesize actual strands of DNA with the content embedded in their very structure.
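
The simplest conceivable translation from zeros and ones to four characters assigns two bits to each base. The Python sketch below shows that naive mapping purely to illustrate the idea. It is an assumption, not the system the researchers worked out, which the article does not spell out, and the particular bit-to-base assignment and function names are arbitrary.

```python
# Arbitrary two-bit assignment, chosen for illustration only.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    dna = encode(b"Hi")
    print(dna)  # CAGACGGC
    assert decode(dna) == b"Hi"
```

Each byte becomes four bases under this mapping. Note that the output can contain runs of the same base, such as the GG above, which a practical scheme may need to avoid.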

The output was actually pretty unimpressive: just a smidgeon of stuff barely visible at the bottom of a test tube. The wow factor arose when they reversed the process. The researchers sequenced the data-laden DNA and translated it back into zeros and ones. The result was a re-creation of the original content without a single error, according to the results published in Nature on Wednesday.

So what does DNA offer that other data storage methods don't? One, it can pack data really densely. A single gram of DNA holds more than a million CDs, according to the researchers. Two, DNA lasts a really long time in a range of conditions. It is not nearly as sensitive or fragile as existing data centers. Three, DNA has a reputation for safely storing information: It holds the history of all life on Earth, a tough résumé to top.

This is not the first time DNA has been used to store data, but the latest iteration is far more efficient, accurate and scalable than its predecessors. The method would be especially useful for archives that need to be stored long-term without frequent access, acting as an emergency backup rather than a practical replacement for your flash drive.

The thing holding the technology back at this point is the cost. Sequencing and especially synthesizing the DNA is a pricey process, but like most new technologies, it is getting cheaper fast. The researchers say DNA data storage could be a large-scale solution as soon as 2023.

View original post here:
Data Storage in DNA Becomes a Reality

Posted in DNA | Comments Off on Data Storage in DNA Becomes a Reality

Mutations Found in ‘Junk’ DNA May Be Driving Skin Cancers

Posted: at 8:49 am

Human DNA that researchers once thought served no purpose may play a crucial role in deadly skin cancers, harboring some of the mutations that first appear in tumors and promote the malignancy's growth.

Using gene sequencing technology, scientists at the Dana Farber Cancer Institute in Boston found two mutations among 71 percent of melanoma tumors analyzed. The discovery, the first to identify gene mutations in the vast region of DNA that only last year was shown to have a role turning genes on and off, was published yesterday in two studies in the online journal Science Express.

The findings are a result of faster, cheaper technology that can sequence all of a tumor's DNA in days. They also prove it's worth searching the whole genome, not just genes containing instructions for proteins, said Levi Garraway, the study's senior author and assistant professor of medicine at Harvard Medical School and Dana Farber Cancer Institute.

"Historically, people used to call that junk DNA," Garraway said in a telephone interview. "We actually didn't believe the finding at first."

The mutations are located in a part of the DNA that controls whether a gene called TERT, or telomerase reverse transcriptase, is switched on. When activated, the TERT gene can make a cell replicate almost endlessly -- a common feature in cancer cells, according to the researchers.

The mutation can be caused by exposure to sunlight, Garraway said.

"These are mutations of exactly the sort that UV damage causes," he said. "It makes perfect sense that you'd see these in melanoma."

Melanomas account for three-quarters of the 12,000 annual deaths from skin cancer in the U.S., according to the American Cancer Society. They often start as moles on the skin with ill-defined borders and can spread to the lymph nodes and other organs, becoming increasingly difficult to treat, according to the U.S. National Institutes of Health.

After discovering the mutation, the researchers hooked a piece of the mutant DNA to another gene that makes a protein. They found that when combined, the mutant DNA increased production of the protein, and presumed it would do the same thing in the TERT genes, potentially causing melanoma.

The genetic mutations may not be limited to melanomas. The researchers said that early evidence suggests they might be common in liver and bladder cancers as well.

Read the original post:
Mutations Found in ‘Junk’ DNA May Be Driving Skin Cancers

Posted in DNA | Comments Off on Mutations Found in ‘Junk’ DNA May Be Driving Skin Cancers

Shakespeare Stored in DNA Files

Posted: at 8:49 am

Floppy disks, jump drives, DNA? Scientists have developed a way to encode music and text files into DNA, the molecules that normally hold the instructions for life.

The new method, described today (Jan. 23) in the journal Nature, is extremely expensive right now, but eventually it could be used to store digital files without electricity for thousands of years. And since DNA is so compact, vast amounts of data could be stored in one test tube, said study author Nick Goldman, a geneticist at the European Bioinformatics Institute in the U.K.

"I've gone from being a skeptic to a believer," said David Haussler, a geneticist and computer scientist at the University of California, Santa Cruz, who was not involved in the study.

And because DNA is the script of life, crucial in medicine, agriculture and other endeavors, human beings will always be pushing for ways to improve the reading and writing of DNA, Haussler told LiveScience. [Genetics by the Numbers: 10 Tantalizing Tales]

The team has even used the method to encode Shakespeare's sonnets.

Data deluge

From floppy disks to CDs to magnetic tapes, the technologies to store, read and write digital data become obsolete rapidly. Digital archives take a lot of space, and the files themselves, even archival magnetic tapes, need to be freshened up or rewritten every few years to prevent degradation.

Goldman and colleague Ewan Birney, also of the European Bioinformatics Institute, were discussing this problem over beers one day when they realized that DNA might actually be a feasible way to store vast amounts of data.

As the discovery of intact woolly mammoth DNA demonstrates, the molecule can last for tens of thousands of years as long as it's stored in a cool, dark place, they said. It doesn't require electricity to maintain, like hard drives do, can include built-in error checking, and it's incredibly compact, Goldman told LiveScience. (Earlier this year, another team demonstrated the feasibility of DNA storage, but stored a tiny amount of data and didn't include error checking.)

Storage solution

Follow this link:
Shakespeare Stored in DNA Files

Posted in DNA | Comments Off on Shakespeare Stored in DNA Files

Public genome databases can leak identity

Posted: at 8:49 am

Public genome data is a significant risk to individuals, according to research led by Yaniv Erlich, a geneticist at the Whitehead Institute for Biomedical Research.

The team that Erlich led was able to de-anonymise genome data using only public information and careful Internet searches. A little chillingly, individuals could be associated with patrilineal genetic characteristics, even if they weren't in the databases. A family member's presence in the database can be enough, if they're related in the male line and carry the same surname.

Working with data published in two public genomic databases, Ysearch and SMGF, Erlich demonstrated the privacy risk by matching chromosome data with 50 individuals, in a paper published in Science.

Among the genome data recorded in the databases are genetic markers called short tandem repeats (for which genetic science hasn't yet identified a specific purpose), which are passed down the male line.

As the paper notes, it had been assumed that listing surnames in the databases didn't place individual identity at risk, since surnames could match thousands of individuals. However, the genome data has become a genealogy tool as well, in databases such as YBase.

DNA sequencing pioneer Dr Craig Venter volunteered as a test subject in the research. With only the relevant DNA sequence, Dr Venter's age, and the US state where he lives, Erlich was able to retrieve just two possible records, one of which was Dr Venter.

With a known surname, the searches become even more accurate: "Combining the recovered surname with additional demographic data can narrow down the identity of the sample originator to just a few individuals," Erlich states in the paper.

"Surname inference from personal genomes puts the privacy of current de-identified public data sets at risk," it continues.

"In five surname recovery cases, we fully identified the CEU* individuals and their entire families with very high probabilities ... data release, even of a few markers, from one person can spread through deep genealogical ties and lead to the identification of another person who might have no acquaintance with the person who released his genetic data."

*CEU refers to a particular genetic dataset: multigenerational families of northern and western European ancestry in Utah whose samples were originally collected by CEPH (Centre d'Etude du Polymorphisme Humain).
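
The two-step logic described above, matching short tandem repeat markers to a surname and then narrowing by age and state, can be shown with a toy Python sketch. Every record below is invented, the exact-match rule is a simplification, and the data layout and function names are assumptions; nothing here reflects the actual method or data of the Science paper.

```python
from typing import Dict, List, Optional

# All records below are invented for illustration.
GENEALOGY_DB: List[dict] = [
    {"surname": "Archer", "markers": {"DYS19": 14, "DYS390": 24, "DYS393": 13}},
    {"surname": "Baker", "markers": {"DYS19": 15, "DYS390": 22, "DYS393": 12}},
]

PUBLIC_RECORDS: List[dict] = [
    {"name": "J. Archer", "surname": "Archer", "birth_year": 1950, "state": "UT"},
    {"name": "T. Archer", "surname": "Archer", "birth_year": 1982, "state": "NY"},
    {"name": "R. Baker", "surname": "Baker", "birth_year": 1950, "state": "UT"},
]


def infer_surname(sample: Dict[str, int], db: List[dict]) -> Optional[str]:
    """Return the surname whose repeat counts all match the sample, if any."""
    for record in db:
        if all(sample.get(m) == count for m, count in record["markers"].items()):
            return record["surname"]
    return None


def narrow_candidates(surname: str, birth_year: int, state: str,
                      records: List[dict]) -> List[str]:
    """Filter public records by surname plus two demographic facts."""
    return [r["name"] for r in records
            if r["surname"] == surname
            and r["birth_year"] == birth_year
            and r["state"] == state]


if __name__ == "__main__":
    anonymous_sample = {"DYS19": 14, "DYS390": 24, "DYS393": 13}
    surname = infer_surname(anonymous_sample, GENEALOGY_DB)
    print(surname, narrow_candidates(surname, 1950, "UT", PUBLIC_RECORDS))
```

The sketch makes the article's point concrete: the sample's donor never appears in the genealogy table, only a male-line relative with the same surname does, yet the combination of filters still leaves a single candidate.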

Read more from the original source:
Public genome databases can leak identity

Posted in Genome | Comments Off on Public genome databases can leak identity

Retrovirus in the human genome is active in pluripotent stem cells

Posted: at 8:49 am

Jan. 23, 2013 A retrovirus called HERV-H, which inserted itself into the human genome millions of years ago, may play an important role in pluripotent stem cells, according to a new study published in the journal Retrovirology by scientists at UMass Medical School. Pluripotent stem cells are capable of generating all tissue types, including blood cells, brain cells and heart cells. The discovery, which may help explain how these cells maintain a state of pluripotency and are able to differentiate into many types of cells, could have profound implications for therapies that would use pluripotent stem cells to treat a range of human diseases.

"What we've observed is that a group of endogenous retroviruses called HERV-H is extremely busy in human embryonic stem cells," said Jeremy Luban, MD, the David L. Freelander Memorial Professor in HIV/AIDS Research, professor of molecular medicine and lead author of the study. "In fact, HERV-H is one of the most abundantly expressed genes in pluripotent stem cells and it isn't found in any other cell types."

In the study, Dr. Luban and colleagues describe how RNA from the HERV-H sequence makes up as much as 2 percent of the total RNA found in pluripotent stem cells. The HERV-H sequence is controlled by the same factors that are used to reprogram skin cells into induced pluripotent stem (iPS) cells, a discovery that garnered the 2012 Nobel Prize in Physiology or Medicine. "In other words, HERV-H is a new marker for pluripotency in humans that has the potential to aid in the development of iPS cells and transform current stem cell technology," said Luban.

When a retrovirus infects a cell, it inserts its own genes into the chromosomal DNA of the host cell. As a result, the host cell treats the viral genome as part of its own DNA sequence and begins making the proteins required to assemble new copies of the virus. And because the retrovirus is now part of the host cell's genome, when the cell divides, the virus is inherited by all daughter cells.

In rare cases, it's believed that retroviruses can infect human sperm or egg cells. If this happens, and if the resulting embryo survives, the retrovirus can become a permanent part of the human genome, and be passed down from generation to generation. Scientists estimate that as much as 8 percent of the human genome may be composed of extinct retroviruses left over from infections that occurred millions of years ago. Yet these sequences of fossilized retrovirus were thought to have no discernible functional value.

"The human genome is filled with retrovirus DNA thought to be no more than fossilized junk," said Luban. "Increasingly, there are indications that these sequences might not be junk. They might play a role in gene expression after all."

Luban, an expert in HIV and other retroviruses, and his colleagues were seeking to understand whether there was a rationale behind where, in the expansive human genome, retroviruses inserted themselves. Knowing where along the chromosomal DNA retroviruses might attack could potentially lead to the development of drugs that protect against infection; better gene therapy treatments; or novel biomarkers that would predict where a retrovirus would insert itself in the genome, said Luban.

Turning these same techniques on the retrovirus sequences already in the human genome, they discovered a sequence, HERV-H, that appeared to be active. "The sequences weren't making proteins because they had been so disrupted over millions of years, but they were making these long, noncoding RNAs," said Luban.

Specifically, the HERV-H sequence was making abundant amounts of RNA in human embryonic stem cells -- and only stem cells. In total, there are more than 1,000 HERV-H retrovirus genomes scattered throughout the human genome. The Luban lab also found high levels of HERV-H RNA in some iPS cells. Other iPS cells, perhaps those lines that were not sufficiently reprogrammed to pluripotency, had lower levels of the HERV-H RNA, another indication that HERV-H may be an important marker for pluripotency.

Interestingly, the HERV-H genes that were expressed in human pluripotent stem cells are only found in the human and chimpanzee genomes, indicating that HERV-H infected a relatively recent ancestor to humans, said Luban.

Here is the original post:
Retrovirus in the human genome is active in pluripotent stem cells

Posted in Genome | Comments Off on Retrovirus in the human genome is active in pluripotent stem cells