Hardware-enabled biology: From hardware-acceleration to novel biological discoveries

Yatish Turakhia

Stanford University

PhD candidate, Electrical Engineering Department

Abstract

Computer architecture and genomics offer starkly contrasting but synergistic trends. While genome sequencing data continues to grow exponentially (80%/year), transistor performance scaling has slowed considerably (3%/year). Domain-specific acceleration (DSA), which uses specialized hardware to accelerate a narrow domain (such as genomics or machine learning), is one of the few remaining approaches in computer architecture for continuing to scale compute performance and efficiency, and thus for realizing the vast potential of genomics data. But bridging domain-specific acceleration and genomics is challenging. Accelerating an existing algorithm that was optimized primarily for software often yields only a 5-10x improvement from hardware specialization. In my research, I had the unique privilege of being co-advised by experts in two domains, (i) hardware and (ii) biology, which allowed us to adopt a “hardware-enabled biology” approach. This approach modifies an existing algorithm in a way that provides a massive speedup (1,000-10,000x) on specialized hardware without compromising, and sometimes even enhancing, the results for a biologist.

In the first half of my talk, I will present our work on hardware acceleration of two emerging, compute-intensive applications in genomics: long-read assembly (Darwin co-processor) and whole-genome alignment (Darwin-WGA co-processor), along with the DSA and biology lessons we learned from them. The second half of my talk will focus more on biology, applying whole-genome alignments to make novel discoveries. I will present an algorithm we developed to map a reference gene to its correct orthologous chain (referring to the alignment chains developed by Kent et al. at UCSC) in a query species, and show how we used this algorithm to develop a novel pipeline to discover hundreds of genes that are surprisingly lost in different mammals, some of which are considered indispensable in human and mouse. I will also present a novel screen we developed for testing the molecular basis of convergent evolution in mammals, which also relies on our gene-to-chain mapping algorithm. I will conclude by proposing ideas for collaboration based on our shared interests.

Bio & Research Interests

Yatish Turakhia is a 5th-year PhD student at Stanford University, co-advised by Prof. Bill Dally and Prof. Gill Bejerano. He is interested in designing co-processors for accelerating a wide range of algorithms involving genomic sequence alignment, from read assembly to remote homology search. He is also interested in developing new algorithms for computational genomics, from comparative genomics to medical diagnosis. He is a recipient of the NVIDIA Graduate Fellowship and a Best Paper Award at ASPLOS 2018. His work will also be featured in IEEE Micro Top Picks this year (2018).