Pharmabiz
 

DNA info storage to replace electronic gadgetry

Prof. O. S. Reddi
Thursday, February 6, 2014, 08:00 Hrs [IST]

The arrival of the computer age has led to the electronic storage and retrieval of information. But as the volume of information has grown, the cost of storage and retrieval has risen so much that it has become highly uneconomical.

The information storage and retrieval system that living systems have developed in nature is spectacular, right from the origin of life through subsequent evolution to humanity. It is surprising that scientists did not focus sooner on this singularly valuable aspect of nature.

With the continued acceleration in the growth of information, the volume has become so enormous that the present electronic gadgetry has to be replaced by a new system. That is why scientists have turned their attention to the system that operates in nature.

It was Dr. EB Baum who, in 1995, initiated work on building an associative memory larger than the brain. Dr. CT Clelland and others hid messages in DNA microdots in 1999. Dr. JPL Cox explored long-term data storage in DNA. Drs. Ailenberg and Rotstein devised a coding method for storing text, images and music characters in DNA. Next-generation digital information storage in DNA was described by Drs. Church, Gao and Kosuri in 2012.

Dr. Nick Goldman of the European Bioinformatics Institute in Britain, with his colleagues and Agilent Technologies in California, USA, decided to use DNA, the genetic material of living systems that carries the code of life, as an information storage device in place of electronics. Their paper, titled “Towards practical, high-capacity, low-maintenance information storage in synthesized DNA”, was published in the journal “Nature” in 2013.

Current electronic storage devices require active and continuous maintenance, with regular transfer of data between storage media: from punched cards to magnetic tapes to floppy disks to CDs. DNA-based storage, by contrast, needs no costly maintenance beyond being kept in a cool, dry and dark place.

The conventional binary code (0 and 1) has been abandoned; instead, the information is written in a ternary code of three numerals, 0, 1 and 2, which is encoded in DNA using combinations of A (adenine), G (guanine), C (cytosine) and T (thymine). To avoid reading errors, especially in sequences where the same base is repeated, the DNA is not synthesized as one long string coding for the complete information. It is made in smaller chunks, and the chunks are read back in the appropriate order to give 100 per cent accuracy. DNA is a long chain of four letters (chemical units) called A, G, C and T that is packed in the nucleus of each cell in a living system.
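The idea of a ternary code written in four bases can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published method: the fixed six-digit conversion of each byte to base 3, the rule that each ternary digit picks one of the three bases differing from the previous base, the starting base "A" and the chunk size are all choices made here for the example.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six ternary digits cover one byte


def byte_to_trits(value):
    """Write one byte (0-255) as six base-3 digits, most significant first."""
    trits = []
    for _ in range(TRITS_PER_BYTE):
        value, remainder = divmod(value, 3)
        trits.append(remainder)
    return trits[::-1]


def encode(data, start="A"):
    """Turn bytes into a DNA string in which no base repeats its neighbour."""
    previous = start  # assumed "virtual" base before the first real one
    strand = []
    for byte in data:
        for trit in byte_to_trits(byte):
            # The three bases that differ from the previous one, in fixed order.
            choices = [b for b in BASES if b != previous]
            previous = choices[trit]
            strand.append(previous)
    return "".join(strand)


def decode(strand, start="A"):
    """Reverse encode(): recover the ternary digits, then the bytes."""
    previous = start
    trits = []
    for base in strand:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    data = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        data.append(value)
    return bytes(data)


def split_into_chunks(strand, size=20):
    """Cut the strand into short, indexed fragments for reassembly in order."""
    return [(i // size, strand[i:i + size]) for i in range(0, len(strand), size)]
```

Because every new base is chosen from the three that differ from the one before it, the encoded strand never contains a run of the same base, which is the kind of sequence the paragraph above describes as error-prone to read. The index stored with each chunk is what lets the fragments be put back in order.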

DNA has been nature's primary material for storing and transferring information in evolution since the birth of life on the planet more than three billion years ago. DNA is small: the entire information content is stored in a sequence of three billion A, G, C and T units packed in the nucleus of each cell, which is only a few microns across (a micron is 1/1000 of a millimetre). DNA is stable and has a long shelf life. It has reportedly been isolated even from the bones of dinosaurs that died 65 million years ago. Though the host is dead, the DNA survives in the bones.

It is only a matter of a few years before electronic storage systems are entirely replaced by DNA-based information storage, which is the most economical, the finest and the most reliable.

Global digital information now runs into petabytes (a petabyte is a million billion bytes), whereas one gram of DNA can store 2.2 petabytes of information. Thus nature has evolved the most economical storage system in life. Life on this planet has taught humanity a lesson: do not disturb nature.
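The arithmetic behind the density figure is simple to work through. The short sketch below takes the 2.2 petabytes per gram quoted above as given and asks how much DNA a given amount of data would need; the function name is made up for this example.

```python
PETABYTE = 10**15  # a million billion bytes
DNA_BYTES_PER_GRAM = 2.2 * PETABYTE  # density figure quoted in the article


def grams_of_dna(num_bytes):
    """Grams of DNA needed to hold num_bytes at the quoted density."""
    return num_bytes / DNA_BYTES_PER_GRAM


# One petabyte needs a little under half a gram of DNA.
one_petabyte_in_grams = grams_of_dna(PETABYTE)
```

At this density a single petabyte fits in less than half a gram, which is the basis for the claim that DNA is so economical in space.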

Genetic root of major diseases
In 2000, the big news in the genome field was the race between Dr. J. Craig Venter, founder of the biotech company Celera Genomics, and a group of scientists from the NIH to produce the first rough draft of the human genome sequence.

Dr. Hobbs and Dr. Cohen quietly embarked on a project, called the Dallas Heart Study, to uncover the causes of heart disease. Dr. Cohen, a South African physiologist, had studied cholesterol metabolism (synthesis and breakdown) for several years. Dr. Hobbs, trained as an M.D. and now a Howard Hughes Medical Institute investigator at the University of Texas Southwestern Medical Center at Dallas, had done research in the laboratory of Dr. Michael S. Brown and Dr. Joseph L. Goldstein, who shared the Nobel Prize in 1985 for their work on cholesterol metabolism. That work laid the groundwork for the development of the cholesterol-lowering drugs known as statins.

Dr. Hobbs and Dr. Cohen set their scientific compass by a new and brilliant idea, quite different from that of other scientists working on genomics. Approaching the problem from a functional standpoint, they concentrated genomic attention on people with particularly dramatic phenotypes, especially in levels of high-density lipoprotein (HDL), often called good cholesterol, and low-density lipoprotein (LDL), often called bad cholesterol. During 2005, Dr. Hobbs and Dr. Cohen turned their attention to people in the Dallas Heart Study who showed unusually low levels of LDL.

These researchers hit the genomic jackpot when they analysed the DNA sequence of a gene called PCSK9, known to be involved in cholesterol metabolism. Two mutations that silenced the gene correlated with low LDL levels. Further studies and analysis of data from populations in Mississippi, North Carolina, Minnesota and Maryland over a period of 15 years revealed that African-Americans with one or another silencing mutation in PCSK9 have a 28 per cent reduction in LDL levels and an astounding reduction in the risk of heart disease.

In white people a mutation in the same gene reduced LDL by 15 per cent and reduced the risk of heart disease by 47 per cent. Hardly any of the hundreds of genome-wide association studies identified genes with such a large effect on disease risk.

This momentous research led drug companies to test ways of shutting off the PCSK9 gene, or of perturbing the molecular pathway the gene affects, as a way to lower LDL so as to reduce the risk of heart disease in populations. Now every pharmaceutical company is involved in this work.

With the astounding success of the Hobbs-Cohen work, other scientists, namely David Goldstein and Elizabeth T. Cirulli at Duke, also proposed expanding the search for medically important rare variants.

One idea is to sequence and compare the whole exomes of carefully selected people. The exome is the collection of the actual protein-coding parts of genes (exons) in the chromosomes, along with nearby regions that regulate gene activity. It does not include the stretches of DNA that lie between exons or between genes.

Rare variants have been sought within families affected by a common disease and in people who share an extreme trait, where significant DNA differences can be more easily identified. Exome sequencing is only a stopgap strategy until inexpensive whole-genome sequencing becomes possible. Traditional genetics may not capture the molecular complexity of genes and their role in disease. Vast areas of DNA do not code for proteins, and such areas were named ‘junk DNA’. But it is now known that these ‘junk’ regions contain important regulatory elements.

Some stretches of DNA produce small bits of RNA that can even interfere with gene expression. Chemical tags on DNA that do not change its sequence are called epigenetic marks; they can influence gene expression and can be modified by environmental factors over the course of a lifetime. This environmentally modified DNA may even be passed on to offspring.

Thus the definition of a gene is now vexed with multiple layers of complexity. What was earlier thought to be a straightforward, one-way, point-to-point relation between genes and traits has now become the genotype-phenotype problem, where knowing the protein-coding sequence of DNA tells only part of how a trait comes to be. In animal experiments, Dr. Joseph H. Nadeau, a director at the Institute for Systems Biology in Seattle, has tracked more than 100 biochemical, physiological and behavioural traits that are affected by epigenetic changes, and has seen some of these changes passed down through four generations. That is thoroughly Lamarckian: acquired characters can be inherited.

He also has experimental evidence that the function of one particular gene sometimes depends on the specific constellation of genetic variants surrounding it, leading to an ensemble effect that introduces a contextual, postmodern wrinkle to genetic explanations of disease.

Some common illnesses may ultimately be traceable to a very large number of genes in a network or pathway, whose effects may each vary depending on the gene variants a person has: the presence of one gene variant, say, can exacerbate or counteract the effect of another disease-related gene in the group. His guess is that this unconventional kind of inheritance is going to be more common than we would have expected.

This aspect is yet to be checked. The new generation of fast, cheap sequencing technologies will soon allow biologists to compare entire genomes, by which time the common-versus-rare-variants debate will subside. It is a fabulous time to be doing genomics.

Dr. Stephen S. Hall, who has been working and reporting on the Human Genome Project for more than two decades, has clearly stated in his article “Revolution Postponed” that the Human Genome Project has so far failed to produce the medical miracles that scientists promised. The roots of major diseases remain to be resolved by the new revolution.

 