The Global Alliance for Genomics and Health (GA4GH) today announced a new application programming interface (API) that enables DNA data providers and consumers to better share information and work together on a global scale, advancing genome research and its clinical application.
The new open-source tool, GA4GH Genomics API Version 0.5, allows interoperable exchange of information contained in DNA sequence reads across multiple organizations and on multiple platforms.
This is the first in a planned suite of genomics APIs to be developed by the Global Alliance’s Data Working Group.
“This new Genomics API is an exciting step toward interoperability in genomic data. It advances the Global Alliance’s mission of enabling the sharing of genomic and clinical data to improve human health,” said David Haussler, Co-Chair of the Global Alliance’s Data Working Group and Scientific Director of the UC Santa Cruz Genomics Institute.
“Because this new API lets researchers work consistently with genomic data across institutions and platforms, it will help realize the benefits that come from large-scale genomic data sharing, allowing us to find the needle in the haystack for patients with rare diseases,” Haussler said.
Promoting the Global Alliance’s goals of transparency and collaboration, GA4GH Genomics API Version 0.5 uses an open development process to allow the wider bioinformatics community to participate. While the Data Working Group has a core team of active developers, interested developers from any institution can engage with the platform by exploring sample apps, building implementations from scratch or from existing samples, or providing feedback on the API and its documentation. The interface is managed on an open Global Alliance developer site at http://ga4gh.org.
GA4GH Genomics API Version 0.1, an earlier version also developed by members of the Data Working Group, is in use by leading organizations, including the European Bioinformatics Institute (EMBL-EBI), the U.S. National Center for Biotechnology Information (NCBI), Google, Genome Savant, and Harvard Medical School’s Biomedical Cybernetics Laboratory, powering a growing community of applications. As analysis tools adopt the new API, researchers will be able to extend their own infrastructure to utilize cloud resources, such as those available from Amazon Web Services, Google Cloud Platform, and Microsoft Azure.
The GA4GH Genomics API is built on file formats now managed by the GA4GH that were developed over the last five years for large-scale genomic sequencing projects. It features cleaner models, with a modern, easy-to-use data description schema and a web-enabled interface.
“Modern DNA sequencing, when coupled with modern data and cloud technology, can lead to breakthroughs in understanding and improving human health. This new Genomics API is a big step forward,” said David Glazer, co-chair of the Reads Task Team and Engineering Director for Google Cloud Platform and Google Genomics. “Google already supports Version 0.1 of the API, and we’ll be adding support for Version 0.5 soon, as well as continuing to contribute to the Data Working Group.”
“The Global Alliance is breaking new ground in combining genomic sequencing and clinical care. Amazon Web Services is proud to support these efforts, and help in defining new operating models, such as the latest Genomics API,” said Matt Wood, General Manager of Data Science, Amazon Web Services, Inc. “We view these new APIs as a vital component for collaboration and development of next-generation tools that can run cost-effectively at massive scale.”
“Genome sequencing is transitioning from being a powerful research tool to making an enormous impact in clinical diagnostics and care,” said Dr. Richard Durbin, Acting Head of Computational Genomics at the Wellcome Trust Sanger Institute and leader of the Genome Informatics group. “This API from the Global Alliance Data Working Group will enable genomic data processing to move beyond research file formats into modern computing and data architectures, facilitating controlled data sharing and the effective use of these new technologies for both clinical and research benefit.”
“We are using the Global Alliance’s work to enable apps for the TBResist initiative that bridge from raw sequence data to clinically useful phenotypes,” said Professor Gil Alterovitz, a faculty member at Harvard Medical School and director of the Biomedical Cybernetics Laboratory. “Also, the Substitutable Medical Applications and Reusable Technology (SMART) Genomics platform is using the Global Alliance interface to enable interoperability between electronic medical record information (HL7) and raw genetic sequence information.”
Other Working Groups of the Global Alliance for Genomics and Health are currently identifying best practices to integrate genomic data into clinical practice, reaching agreement on security protocols, and developing a framework to address ethics and regulatory considerations.
 
The Global Alliance for Genomics and Health is an international, non-profit alliance formed to help accelerate the potential of genomic medicine to advance human health. Bringing together over 200 leading institutions working in healthcare, research, disease and patient advocacy, life science, and information technology, partners in the Global Alliance are working together to create a common framework of standards and harmonized approaches to enable the responsible, voluntary, and secure sharing of genomic and clinical data. Learn more at: http://genomicsandhealth.org.