Complicated legacies: The human genome at 20

Algorithmic biology unleashed

Hallam Stevens | Science Magazine | February 5, 2021

Millions of people today have access to their personal genomic information. Direct-to-consumer services and integration with other “big data” increasingly commoditize what was rightly celebrated as a singular achievement in February 2001 when the first draft human genomes were published. But such remarkable technical and scientific progress has not been without its share of missteps and growing pains. Science invited the experts below to help explore how we got here and where we should (or ought not) be going. — Brad Wible

Over a few frenzied weeks in the middle of 2000, icing his wrists between coding sessions, Jim Kent, a graduate student at the University of California, Santa Cruz, created a key software tool used in the international effort to sequence the human genome. GigAssembler pieced together the millions of fragments of DNA sequence generated at labs around the globe, literally making the human genome. At almost the same time, Celera Genomics acquired Paracel, a company that primarily designed software for intelligence gathering. Paracel owned specially designed text matching hardware and software (the TRW Fast Data Finder) that was rapidly adapted for sniffing out genes within the vast spaces of the genome.

Untangling the jumble of genomic letters required rapidly and accurately searching for a specified sequence within a very large space. This demanded new forms of training and disciplinary expertise. Physicists, mathematicians, and computer scientists brought methods such as linear programming, hashing, and hidden Markov models into biology. Since 2005, the Moore’s Law–like growth of next-generation sequencing has generated ever-increasing troves of data and required even faster algorithms for indexing and searching. Biology has borrowed “big data” methods from industry (e.g., Hadoop) but has also contributed to pushing the frontiers of computer science research (e.g., the Burrows-Wheeler transform) (12).
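To make two of the methods named above concrete, here is a minimal Python sketch, offered as an illustration only and not as the code used by GigAssembler, Celera, or any real genomics tool. It shows a hash-based k-mer index for exact sequence search and a naive Burrows-Wheeler transform. The function names, the toy sequence, and the choice of k are all assumptions made for the example; production tools use far more compact, suffix-array-based constructions.

```python
from collections import defaultdict


def build_kmer_index(genome: str, k: int) -> dict[str, list[int]]:
    """Hash every length-k substring (k-mer) to the positions where it starts."""
    index: dict[str, list[int]] = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return dict(index)


def find_exact(genome: str, index: dict[str, list[int]], k: int, query: str) -> list[int]:
    """Return all start positions of `query`, using the index to pick candidates."""
    if len(query) < k:
        raise ValueError("query must be at least k letters long")
    # Look up the first k letters of the query, then verify each candidate in full.
    candidates = index.get(query[:k], [])
    return [pos for pos in candidates if genome.startswith(query, pos)]


def bwt(text: str) -> str:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    Appends '$' as an end marker (assumed absent from `text`). This version
    builds every rotation, so it only suits short illustrative strings.
    """
    text = text + "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


if __name__ == "__main__":
    toy_genome = "GATTACAGATTACA"  # hypothetical toy sequence
    kmer_index = build_kmer_index(toy_genome, k=3)
    print(find_exact(toy_genome, kmer_index, 3, "TTAC"))
    print(bwt("ACAACG"))
```

The index trades memory for speed: one pass over the sequence builds it, after which each query costs a hash lookup plus a check of the few candidate positions. The transform tends to group identical letters into runs, which is what makes compressed, searchable indexes of very large genomes practical.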
