Science | Genetics
An international consortium manages to sequence the 8% of our DNA that was impossible to read when the first draft was published twenty-one years ago
A person’s instruction book, the genome, contains 3,096 million base pairs of chemical letters. Everything is there, from hair color to susceptibility to disease, in a long sequence of DNA spread over 23 pairs of chromosomes in the nucleus of every cell except the sex cells. Twenty-one years ago, the first draft of the human genome covered 92% of it. Today, the Telomere-to-Telomere (T2T) Consortium, an international group of scientists led by the National Institutes of Health of the United States, presents in the journal ‘Science’ the remaining 8%, which until now could not be read and which will allow us to delve into the origin of diseases.
Celera Genomics and the Human Genome Project released the first draft of our instruction book in February 2001. According to that draft, we had about 30,000 genes, considerably fewer than the 70,000 to 140,000 that had been assumed until then. Over time the confirmed count fell, and until yesterday it stood at 20,465. The first draft has been refined ever since. It has served to identify the genetic basis of conditions in patients who were previously undiagnosed and to establish that there are at least 4,400 diseases of genetic origin, according to Lluís Montoliu, a researcher at the National Center for Biotechnology (CNB-CSIC) who did not participate in the new study.
Genes, the units of biological inheritance, are spread out among the chromosomes, which are compressed DNA strands shaped like twisted ladders. Each rung of that ladder is made up of a combination of two of the four types of nucleotide bases, known as A, C, G and T. Those chemical letters are the genetic alphabet: they spell out the genes.
THE INSTRUCTION BOOK OF THE HUMAN BEING
The human body is made up of billions of cells. The nucleus of each of them contains the genome.
Each nucleus contains 22 pairs of autosomal chromosomes and one pair of sex chromosomes.
Telomeres and centromeres are the areas of the chromosomes that have now been mapped.
Unfolding the chromosome like a ribbon, the structure of DNA is seen as two intertwined strands, each made up of 4 types of subunits called nucleotide bases.
The nucleotide bases (A, G, C, and T) follow one another in a varying order, and this order is the information contained in the sequence.
Gene: a DNA fragment that contains the instructions to make one or more proteins.
Genes produce messenger ribonucleic acid (mRNA) that comes out of the cell nucleus and directs the making of proteins.
THE GAP FILLED BY T2T
In 2001, 92% had been mapped
200 million base pairs have been sequenced.
The consortium has added 99 genes to the 20,465 identified so far.
The Y chromosome has not been sequenced because the information does not come from a human being, but from a non-viable embryo that does not contain the combination of the DNA of the father and the mother.
EVOLUTION OF THE GREAT APES
Moment of lineage separation (in millions of years)
GENETIC RELATIONSHIP OF MAN WITH OTHER LIVING BEINGS
Of all the animals whose DNA has been deciphered, the chimpanzee is the one that has the greatest similarity to the human.
23 pairs of chromosomes (1 sex pair)
Millions of base pairs of DNA
DNA reading
Reading a genome is complicated. To do it, scientists cut all of the DNA into chunks of hundreds or thousands of letters. Sequencing machines then read the letters of each fragment, and scientists try to assemble the pieces in the correct order. The 2001 draft left 8% of the DNA out of that puzzle: hard-to-read areas located at the center (the centromere) and the ends (the telomeres) of chromosomes. These areas contain long stretches of repeated sequences, so scientists did not know where to fit them.
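The difficulty with repeated stretches can be illustrated with a toy example. The short Python sketch below is only an illustration, not the method used by the consortium: the 44-letter "genome", the repeated unit and the function names are all invented for this example. It cuts the sequence into short reads and lists the reads whose letters fit at more than one position.

```python
from collections import defaultdict

# Hypothetical toy genome: a unique start, a unit repeated six times, a unique end.
TOY_GENOME = "GATTACA" + "AATGG" * 6 + "CCGTAGC"


def read_positions(genome: str, read_len: int) -> dict[str, list[int]]:
    """Cut the genome into every overlapping read of read_len letters and
    record all the start positions at which each read's text occurs."""
    positions = defaultdict(list)
    for start in range(len(genome) - read_len + 1):
        positions[genome[start:start + read_len]].append(start)
    return dict(positions)


def ambiguous_reads(genome: str, read_len: int) -> dict[str, list[int]]:
    """Keep only the reads that fit at more than one place in the genome."""
    return {
        read: where
        for read, where in read_positions(genome, read_len).items()
        if len(where) > 1
    }


# Print every 5-letter read that cannot be placed unambiguously.
for read, where in sorted(ambiguous_reads(TOY_GENOME, 5).items()):
    print(read, "fits at positions", where)
```

Reads taken from the unique start and end of the toy sequence fit in exactly one place, while every read taken from inside the repeated stretch fits in several, which is the puzzle-assembly problem described above in miniature.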
“These are important regions, but difficult to sequence,” acknowledges Megan Dennis, a biochemist at the University of California, Davis, and co-author of the version published today by ‘Science’. After sequencing 200 million base pairs in these areas, equivalent in size to an entire chromosome, the authors of the new study have added 99 genes to the human catalogue. About 90% of that supplement to the 2001 draft comes from the centromeres, where there are a lot of repeated letters. “We used to say that young geneticists should be warned not to venture into the centromere because they would never get out,” jokes Charles Langley, a biologist at the University of California, Davis. The new version maps that narrow region of each chromosome, which separates it into a short and a long arm.
This great advance in reading our instruction book has been made possible by the development of two techniques that allow large pieces of DNA to be sequenced and make it easier to assemble the puzzle. One of them, Oxford Nanopore DNA sequencing, can read up to a million letters at once, although not with great precision, while the other, PacBio HiFi DNA sequencing, reads 20,000 letters in one go with almost no errors. “We are seeing chapters that we have never read,” says Evan Eichler, a researcher at the University of Washington.
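Why longer reads make the puzzle easier can also be sketched in a few lines of Python. Again, this is a simplified illustration under invented assumptions (the toy sequence and the function names are hypothetical, and real reads contain errors that this sketch ignores): it searches for the shortest read length at which no read fits in more than one place.

```python
# Hypothetical toy genome: a unique start, a unit repeated six times, a unique end.
TOY_GENOME = "GATTACA" + "AATGG" * 6 + "CCGTAGC"


def has_repeated_read(genome: str, read_len: int) -> bool:
    """Return True if any read of read_len letters occurs at two or more places."""
    seen = set()
    for start in range(len(genome) - read_len + 1):
        read = genome[start:start + read_len]
        if read in seen:
            return True
        seen.add(read)
    return False


def shortest_unambiguous_read(genome: str) -> int:
    """Find the smallest read length at which every read fits in only one place.
    If reads of some length are all unique, longer reads are unique as well,
    so the first length without a repeated read is the answer."""
    for read_len in range(1, len(genome) + 1):
        if not has_repeated_read(genome, read_len):
            return read_len
    return len(genome)


print("Shortest unambiguous read length:", shortest_unambiguous_read(TOY_GENOME))
```

In this toy case, reads only become unambiguous once they are long enough to reach past the repeated stretch into the unique letters beside it, which is the advantage that very long reads bring to repeat-rich regions such as centromeres.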
“We have gained a tremendous understanding of human biology and disease by having about 90% of the human genome, but there were many important aspects that remained hidden, out of sight of science, because we did not have the technology to read those parts. Now we can stand on top of the mountain, see the entire landscape below, and get a complete picture of our genetic heritage,” says David Haussler, director of the Genomics Institute at the University of California, Santa Cruz.
“You might think that with 92% of the genome completed long ago, the remaining 8% wouldn’t add much. But from that 8%, we are now gaining a whole new understanding of how cells divide, allowing us to study a number of diseases that we haven’t been able to get to before,” says Erich D. Jarvis of Rockefeller University, a co-author of the study who helped develop the sequencing techniques.
The resulting genome is not that of a person. The DNA comes from a cell of “a failed embryo resulting from a rare complication of pregnancy that causes it to lose the genome of one parent and duplicate that of the other,” Montoliu says. The advantage is that the two copies of each chromosome are identical rather than different, which makes reading easier. The disadvantage in this case is that the resulting sequence lacks the Y chromosome, the male one. Despite its odd origin, there is nothing to suggest anything out of the ordinary in the sequence, says Megan Dennis.
www.hoy.es
Eddie is an Australian news reporter with more than nine years in the industry and has been published in Forbes and TechCrunch.