The Beacon Project is a Global Alliance for Genomics and Health (GA4GH) initiative for the federated discovery of genomic data in biomedical research and clinical applications. Originally implemented as a tool reporting the existence of single nucleotide polymorphisms (SNPs) in aggregated genomic data collections, the protocol has evolved towards more complex applications with increased functionality. Implementations of the current Beacon API enable searches for structural variants (e.g., deletions and duplications) and return richer responses (e.g., variant metadata and call counts).
With growing community interest in integrating the Beacon protocol into resources and workflows, the next major release, 2.0, will introduce new features that the community considered important:

Queries by type: The Beacon will define different sets of attributes for requests and responses depending on the type of query; e.g., a specific request and response to return variants within a region of a chromosome (wildcard/region queries) or to get a list of samples related to a phenotype, provided the required authentication or authorization is in place (see Access levels below).

Filters: The next major version of Beacon will include a feature to filter the matched variants by additional conditions on, e.g., sample-specific or technical information (such as associated phenotypes or assay type). Here the use of ontologies will be encouraged, with custom vocabularies as an alternative for local applications.

Schema versions: Given that new query types will return differing responses (e.g., variant annotations, pointers to data delivery protocols), a mechanism will be implemented to reference internal or external data schemas that describe the content of the Beacon response (e.g., returning variant information using “GA4GH Variant Representation v1”).

Access levels: Beacon administrators will be able to specify the level of access (public, registered or controlled) for each field in the Beacon response, and even refine this definition by dataset where it diverges from the default values. This also applies to the supported query types (genomic variants, sample lists, etc.).
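To make the per-field access levels more concrete, the following is a minimal Python sketch of how a Beacon might resolve the level required for a response field, with optional overrides by dataset. It is not taken from the Beacon specification: the field names (`exists`, `callCount`, `sampleIds`), the dataset identifier, the function names, and the assumption that the three levels are strictly ordered (public < registered < controlled) are all hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    # Assumption: levels are ordered, so a higher level grants
    # everything a lower one does.
    PUBLIC = 0
    REGISTERED = 1
    CONTROLLED = 2


# Hypothetical default access level for each response field.
DEFAULT_FIELD_ACCESS = {
    "exists": AccessLevel.PUBLIC,
    "callCount": AccessLevel.REGISTERED,
    "sampleIds": AccessLevel.CONTROLLED,
}

# Hypothetical per-dataset refinements; only diverging fields are listed.
DATASET_OVERRIDES = {
    "dataset-A": {"callCount": AccessLevel.PUBLIC},
}


def required_level(field, dataset_id):
    """Return the access level needed to see `field` in `dataset_id`."""
    overrides = DATASET_OVERRIDES.get(dataset_id, {})
    # Illustrative choice: fields without a default are treated as controlled.
    default = DEFAULT_FIELD_ACCESS.get(field, AccessLevel.CONTROLLED)
    return overrides.get(field, default)


def filter_response(response, dataset_id, user_level):
    """Keep only the fields the requesting user is allowed to see."""
    return {
        field: value
        for field, value in response.items()
        if user_level >= required_level(field, dataset_id)
    }
```

In this sketch, an anonymous (public) user querying `dataset-A` would see `exists` and `callCount`, while the same user querying a dataset without overrides would see only `exists`. Treating unknown fields as controlled is a conservative choice made for the example, not a stated Beacon rule.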