The GA4GH Discovery Work Stream builds and coordinates standards for federated, secured networks of data and services, to form an “Internet of Genomics” for enabling data discovery and retrieval in health research and clinical genomics.
The Discovery Work Stream is led by Marc Fiume and Michael Baudis. For details on how this Work Stream operates, please read the Discovery Work Stream Organizational Structure & Vision document.
The group as a whole meets monthly for high-level coordination. In addition, the sub-groups listed below meet on their own schedules. Participation in these groups is open but requires adherence to the GA4GH Standards for Professional Conduct.
Product development in GA4GH follows the process outlined in the GA4GH Product Approval Process Guide, which is currently in draft. Products developed by the Work Stream undergo an initial investigation phase, followed by a formal Proposed Product Phase, in which most of the work is done, and then a formal Approval Phase, during which the products gain GA4GH approval. The formal steps require the approval of the Work Stream leads.
The following products are currently under development for this Work Stream.
A Beacon is a federated, web-accessible service that can be queried for information about a specific genomic variant, e.g. a single nucleotide polymorphism (SNP/SNV) or a copy number variation (CNV), and that reports whether the variant exists in the queried resources. Current versions of the Beacon protocol support different usage scenarios and offer the opportunity to link to the matched data, e.g. through a handover protocol.
The Beacon API is undergoing a major overhaul for its version 2, which is expected to be launched in late 2021. Updated information can be found on the project website and in the various v2-labeled repositories in its GitHub space.
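The query-and-answer pattern described for Beacon can be illustrated with a minimal sketch. It only builds a request URL and reads a boolean answer from a sample response; it does not contact any service. The endpoint `beacon.example.org` is hypothetical, the variant coordinates are arbitrary example values, and the parameter names and the `exists` field are assumptions modeled on the v1-style allele query, so they may differ in v2 and should be checked against the project documentation.

```python
import json
from urllib.parse import urlencode

# Hypothetical Beacon endpoint, used for illustration only.
BEACON_URL = "https://beacon.example.org/query"


def build_variant_query(reference_name, start, reference_bases,
                        alternate_bases, assembly_id):
    """Build a GET URL asking whether a specific variant is present.

    Parameter names are assumptions in the style of a v1 allele query.
    """
    params = {
        "referenceName": reference_name,
        "start": start,
        "referenceBases": reference_bases,
        "alternateBases": alternate_bases,
        "assemblyId": assembly_id,
    }
    return BEACON_URL + "?" + urlencode(params)


def variant_exists(response_text):
    """Return the boolean 'exists' field of a JSON response body."""
    return bool(json.loads(response_text).get("exists", False))


# Arbitrary example variant; not taken from any real dataset.
url = build_variant_query("17", 7577120, "G", "A", "GRCh37")

# Hand-written sample response standing in for a real service reply.
sample_response = '{"beaconId": "org.example.beacon", "exists": true}'

print(url)
print(variant_exists(sample_response))
```

The sketch keeps the answer to a single yes/no value, which mirrors the idea that a Beacon reports on the existence of a variant rather than returning the underlying records.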
Data Connect is an upcoming standard for searching biomedical data, with support for federation across organizational boundaries.
The Networks group works on a collection of standards enabling the discovery of services. This currently includes two GA4GH-approved standards:
SchemaBlocks is a cross-work-stream, cross-driver-project initiative to document object standards and prototypes, as well as common data formats and semantics used throughout the GA4GH ecosystem. While products and implementations may be completely based on SchemaBlocks models, this project does not attempt to develop a rigid, complete schema but rather to provide the object vocabulary and semantics for a large range of developments.
More information about the SchemaBlocks project as well as the current schema can be found on the project’s site at schemablocks.org.