Dissertation Defense: Enabling comparative genomics at the scale of hundreds of species

August 1, 2019 @ 11:00 am - 12:15 pm PDT

Joel Armstrong, Graduate Student Researcher, Biomolecular Engineering

Comparing related (homologous) subsequences between genomes from different species gives insight into the function and evolution of the genome. This information is captured in “genome alignments,” which are essential for many comparative genomics analyses. However, most existing methods for creating a genome alignment either suffer from reference bias (where only one genome is fully aligned to all others) or ignore duplication events. Though the Cactus genome aligner avoided these restrictions, it could not align more than a few genomes without becoming cost-prohibitive and losing accuracy. I developed and refined a “progressive alignment” extension to Cactus that allows it to produce a full alignment in time linear in the number of input genomes while maintaining similar, or often improved, quality. This new method allows Cactus to align hundreds of large vertebrate genomes, enabling comparative genomics at an unprecedented scale. During its development, I used Cactus as an essential component of several successful comparative genomics projects. Working closely with the 200 Mammals and Bird 10K projects, I have used Cactus to create an alignment of over 600 bird and mammal genomes, which is by far the largest genome alignment ever created. Finally, I have used this alignment to annotate mammalian and avian evolutionary constraint at the highest possible resolution, using the uniquely large number of taxa to examine weak effects of purifying selection.

Joel Armstrong is a graduate student in the Haussler and Paten labs studying vertebrate genomics and genome alignment. He is currently working with the Bird 10K and 200 Mammals projects to investigate the evolution of avian and mammalian genomes.

David Haussler
Benedict Paten

To accommodate a disability, please contact Ben Coffey at the UC Santa Cruz Genomics Institute (becoffey@ucsc.edu, 831-459-1477).


