Jef Akst | The Scientist | June 8, 2021
The Human Genome Project was a tour de force that resulted in the first draft human genome sequence in 2000, but it wasn’t actually complete. The work left sequence gaps that genomicist Karen Miga of the University of California, Santa Cruz, calls the “final unknown” in remarks to STAT. In total, about 8 percent of the more than 3-billion-base-pair human genome—mostly repeats that are computationally challenging to assemble—has remained unsequenced in the two decades since that first draft.
Filling in those gaps has “never been done before,” Miga tells STAT, “and the reason it hasn’t been done before is because it’s hard.” But on May 27, Miga and an international group of collaborators posted a preprint that starts to do just that, adding nearly 200 million DNA bases to the known human genome sequence and discovering some 115 potentially protein-coding genes in the process.