Beacon v2 Filters

A v2 extension of the Beacon protocol will allow queries for data beyond genome variants, using a proposed filters extension. Such filters are envisioned as prefixed attributes, where the (public or private) prefix becomes the basis for scoping the filter to the correct database value.

Overview of Beacon filters

The Beacon v2 API supports the discovery of genomics and clinical datasets, and includes a powerful feature to enable the “filtering” of beacon responses by biomedical properties (e.g. phenotypes) and procedural metadata.

Filters currently belong to one of three super-classes:

Using filters in Beacon requests

Beacon filter requests are simple yet flexible, and can be used to query qualitative or quantitative properties. For example, a qualitative phenotype can be represented by a single observation:

Query for individuals with lung cancer (HPO identifier HP:0100526)

By default, the use of Filters in a Beacon request implies a hierarchical ontology search: the Beacon queries for entities associated with the submitted term and with all of its descendant terms.
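The hierarchical behaviour can be illustrated with a minimal sketch. It assumes a toy, hand-written ontology; the `EX:` identifiers, `TOY_ONTOLOGY`, `expand_term` and `matches` are hypothetical names chosen for illustration and are not part of the Beacon specification or of HPO.

```python
from collections import deque

# Hypothetical toy ontology: parent term -> direct child terms.
# The EX: identifiers are placeholders, not real HPO terms.
TOY_ONTOLOGY = {
    "HP:0100526": ["EX:0000001", "EX:0000002"],
    "EX:0000001": ["EX:0000003"],
}


def expand_term(term, children=TOY_ONTOLOGY):
    """Return the submitted term plus all of its descendant terms."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def matches(record_terms, query_term):
    """True if a record carries the query term or any of its descendants."""
    return not expand_term(query_term).isdisjoint(record_terms)
```

A record annotated only with the deeper term `EX:0000003` would therefore still be found by a query for `HP:0100526`, while a query for the sibling `EX:0000002` would not match it.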

Both qualitative and quantitative properties can be represented by attribute + value pairs. Equality and relational operators (= < >) can be used between attributes and values. Additionally, values can be associated with units if applicable:

Query for females (female genotypic sex = PATO:0020002) with lung cancer and over 70 years of age (age = PATO:0000011, age syntax as ISO 8601)

CustomFilters offer similar query flexibility: attribute and value pairs can be combined, for example:

In the examples above, filters are separated with commas.
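As a rough sketch of how such a comma-separated filter list could be taken apart on the receiving side, the following assumes that each filter is either a bare term or an attribute followed by one of the operators `=`, `<`, `>` and a value. The example string, `ParsedFilter` and `parse_filters` are assumptions made for illustration; the exact wire syntax is defined by the Beacon specification, not by this snippet.

```python
import re
from typing import NamedTuple, Optional


class ParsedFilter(NamedTuple):
    attribute: str
    operator: Optional[str]
    value: Optional[str]


# An attribute, optionally followed by one operator (= < >) and a value.
_FILTER_RE = re.compile(
    r"^(?P<attribute>[^=<>]+)(?:(?P<operator>[=<>])(?P<value>.+))?$"
)


def parse_filters(raw):
    """Split a comma-separated filter string into ParsedFilter tuples."""
    parsed = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        match = _FILTER_RE.match(item)
        if match is None:
            raise ValueError(f"Cannot parse filter: {item!r}")
        parsed.append(
            ParsedFilter(match["attribute"], match["operator"], match["value"])
        )
    return parsed


# Illustrative input: female genotypic sex, lung cancer, age over 70 years.
filters = parse_filters("PATO:0020002,HP:0100526,PATO:0000011>P70Y")
```

Here the age threshold is written as the ISO 8601 duration `P70Y`; bare terms such as `HP:0100526` come back with `operator` and `value` set to `None`.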

Scopes of filters

The use of a filter term can be ambiguous if the entity to which it applies is not specified. For example, the term “metastatic melanoma” (EFO:0002617) could refer to an individual with metastatic melanoma, or a metastatic melanoma biosample.

The entity to which the filter applies can be set explicitly by declaring the scope as a prefix in dot notation, for example:
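A minimal sketch of separating such a scope prefix from the term is shown here. The scope name `biosamples` and the helper `split_scope` are hypothetical; the sketch only assumes that the scope precedes the ontology prefix and is separated from it by a single dot.

```python
def split_scope(filter_term):
    """Split 'scope.PREFIX:ID' into (scope, 'PREFIX:ID').

    The scope is None when the term carries no scope prefix.
    """
    # Only look for the dot in front of the CURIE's colon.
    prefix_part, sep, rest = filter_term.partition(":")
    if "." in prefix_part:
        scope, _, ontology_prefix = prefix_part.partition(".")
        return scope, ontology_prefix + sep + rest
    return None, filter_term


scoped = split_scope("biosamples.EFO:0002617")
unscoped = split_scope("EFO:0002617")
```

With this split, `scoped` carries the entity name alongside the unchanged term, while `unscoped` leaves the choice of entity to the Beacon.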

Inferred logical operators between filters

Currently, a logical AND is implied between filters. A limited form of OR-type queries can be provided through FuzzyFilters.
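The implied AND can be pictured with a small sketch, assuming each record is reduced to the set of terms annotated on it. The record identifiers and `matches_all` are hypothetical, and exact term matching is used for brevity, so hierarchical expansion is deliberately left out.

```python
def matches_all(record_terms, filters):
    """Implied logical AND: every filter term must be present on the record."""
    return all(term in record_terms for term in filters)


# Hypothetical records: identifier -> annotated terms.
records = {
    "ind-1": {"PATO:0020002", "HP:0100526"},
    "ind-2": {"HP:0100526"},
}

query = ["PATO:0020002", "HP:0100526"]
hits = [rid for rid, terms in records.items() if matches_all(terms, query)]
```

Only `ind-1` satisfies both filters; `ind-2` matches the phenotype alone and is therefore excluded.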

Listing all Filters and CustomFilters used in a Beacon

All filtering terms used in a Beacon can be listed in order to show the range of biomedical and metadata properties described by the Beacon, and to assist with building requests.

An unordered unique value list containing all Filters and CustomFilters is returned from the /filtering_terms endpoint. For each term, the following information is returned:

Previous design evaluations

An earlier discussion had proposed a scoped query object design, following the originally proposed GA4GH schema’s object model.

@tb143  @mbaudis 2020-05-12