The mission is done – or close enough, anyway.
Scientists sent that message to the world in 2003, when they announced that the human genome had been sequenced, assembled and essentially completed – with a few seemingly small gaps.
In fact, the effort to quantify and identify the genetic code that makes us all human, on which the US government spent billions of dollars, remained a rough draft, at least 8 percent short of complete.
Some of the largest, most repetitive and complex parts of the DNA puzzle remained in the dark – until now.
Armed with powerful new sequencing technology, a loose collaboration of about 100 scientists announced Thursday that they had filled in the blanks, completing a single human genome from one end to the other and opening up new, promising lines of research in areas where scientists had been roaming around in the dark.
The genome sequence was shared more than a year ago, but the results of the full accounting, which are now being tested and used by researchers around the world, were first published in a peer-reviewed journal on Thursday. Six new articles in the journal Science describe the completed sequencing effort and additional analyses of its implications.
“It’s done, it’s done right, and it’s gone through all those levels of verification,” said Adam Phillippy, a computational biologist at the National Human Genome Research Institute and a leader of the recent effort. “We’re hopefully going to have the key to human evolution and what makes us uniquely human.”
This legwork could one day help researchers identify the genetic causes of disorders, unravel the mystery of what turns some cells cancerous, and explain how different groups of people developed distinctive traits over time, such as the ability to thrive at high altitudes.
“It’s a breakthrough,” said Steve Henikoff, a molecular biologist and professor at the Fred Hutchinson Cancer Research Center and the University of Washington, who was not involved in the project.
From line to page
Assembling a genome is like “taking a book, slicing it into pieces and putting it back together,” said Megan Dennis, an assistant professor who studies human genetics and genomics at UC Davis Health and who contributed to the sequencing effort.
First, researchers need to cut DNA into smaller pieces. Then, it is processed and read bit by bit.
Fragmented, it’s hard to know where each strand came from, so scientists must “stitch that DNA together in a mathematical way,” Dennis said.
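To make the "stitching" concrete, here is a minimal Python sketch of one classic idea: repeatedly merge the two fragments that overlap the most. It is a toy illustration under simplifying assumptions (error-free fragments, a single strand, no repeats), not the method used by the sequencing teams; the function names and the example fragments are invented for this sketch.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str]) -> list[str]:
    """Merge the best-overlapping pair until no overlaps remain."""
    frags = list(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to stitch together
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


# Hypothetical fragments of a short sequence, given out of order.
pieces = ["ACAGTCCA", "TCCATGGA", "GATTACAG"]
print(greedy_assemble(pieces))  # ['GATTACAGTCCATGGA']
```

The sketch works here because every overlap is unambiguous. Once the same stretch of letters appears in many places, several merges look equally good, and that is the problem described next.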
In the 2000s, DNA sequencing technology could only create small pieces of genetic code – about 500 base pairs, or characters, at a time.
But some areas of the human genome are highly repetitive, repeating the same words again and again, like near-identical pages in a book.
“Repetitive elements exist in different places. It’s hard to know where they go,” Dennis said. For years, scientists simply had to leave those pages – and their understanding of that part of the genome – blank.
In recent years, new technology that produces long reads of DNA has completely changed the game. The new machines can read several thousand base pairs in a single piece.
That progress allowed researchers to fill in the missing parts of the genome.
“Having this technology was unimaginable 20 years ago,” Phillippy said. Suddenly, researchers could order and place those repetitive parts of the genome in context.
“These sequences have genes … these regions have very important functions.”
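Why read length matters can be shown with a toy example in Python. The reference string, the two reads and the `placements` helper are all made up for illustration, and the sketch assumes error-free reads and exact matching; real alignment tools are far more elaborate.

```python
def placements(reference: str, read: str) -> list[int]:
    """Return every start position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


# Toy reference: a unique flank, a repeated unit, another unique flank.
reference = "GATTC" + "ACG" * 6 + "TTGCA"

short_read = "ACGACG"                   # lies entirely inside the repeat
long_read = "TTC" + "ACG" * 6 + "TTG"   # spans the repeat and both flanks

print(placements(reference, short_read))  # [5, 8, 11, 14, 17]
print(placements(reference, long_read))   # [2]
```

The short read fits in five places, so there is no way to tell where it belongs. The long read reaches the unique sequence on both sides of the repeat and fits in only one.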
A pandemic project
The idea of finishing the genome grew organically.
It always bothered Phillippy, a perfectionist at heart, that the human genome remained incomplete.
About five years ago, he teamed up with Karen Miga, an assistant professor in the Department of Biomolecular Engineering at the University of California, Santa Cruz, to finish the job.
When they got stuck, they reached out for help. The project began to snowball, gathering about a hundred scientific contributors, and came to be called the Telomere-to-Telomere project, after the term for the end caps of chromosomes.
When the pandemic hit, the pace of research only accelerated, with researchers communicating from their basements via the communication platform Slack and Zoom calls.
“2020 was a crazy year for many reasons. It gave us something to focus on,” Phillippy said.
Finally, the researchers assembled the entire genetic code for a single version of a genome. That genome – which was derived decades ago from cell tissue containing the genetic information of a single sperm – does not represent a human being who ever lived, because it has only one set of chromosomes, from the paternal side.
The complete code will now form the backbone of new genomic research and become a new, finished reference for comparison.
Theory and practice
The complete genome opens up new avenues for research.
For decades, scientists have been poring over the 92 percent of the genome that was available, searching it for genetic variations that could cause disease.
“We have a good idea of the variation in that region, but we had no idea about the other 8 percent,” Phillippy said.
Now, researchers are re-analyzing their old data against the new reference genome, trying to find new clues from what was missing.
“We’ve identified many more, thousands of new variants,” Dennis said. “Some of them fall into genes that encode proteins, and some of those genes are medically important, clinically important, and contribute to disease.”
The new genome reference enables further study of how centromeres work.
Centromeres are structures in the middle of chromosomes that are packed with repetitive sequences of code and are an integral part of the cell division process. They have historically been among the least understood parts of the genome because they contain so much repetitive, dense code.
“We do not understand the underlying process of centromere evolution,” Henikoff said. “With this information suddenly coming out over the past year, we are learning a lot more about centromeres.”
Using the new genome, researchers can better study how centromere proteins combine and what happens when they change or lose function.
“Centromere dysfunction can be a serious driver of cancer,” Henikoff said. Until now, “we were hampered because we did not have a reference sequence.”
Further study of the newly sequenced parts of the genome may help scientists better understand how humans evolved special traits, such as the larger brains that set them on a genetically distinct path from their great ape relatives.
“The genes that enlarge our frontal cortex come from genes that map to these repetitive regions,” said Evan Eichler, a professor in the Department of Genome Sciences at the University of Washington School of Medicine and part of the research team.
Advances in genomic sequencing technology could lead to a surge of medical breakthroughs, researchers say.
“I’m excited about what we don’t know and the opportunity to discover,” Miga said.
Phillippy says his next goal is to streamline the sequencing process to make it cheaper, more efficient and widely available. He plans to sequence genomes that include the chromosomes from both father and mother. Sequencing broadly among people from many backgrounds, he said, would help describe the world’s genetic diversity and identify important genetic variations.
He envisions a world where everyone has access to their genetic data, which could give doctors individualized information about which diseases to monitor or which drugs to prescribe.
“In 10 years, getting a complete, perfectly accurate human genome will be a routine part of health care, and it will be cheap enough that you won’t give it a second thought – a lab test for less than $1,000,” Phillippy said. “You will have your complete genome in your pocket.”