[R<~>M] Wildcard Range Matches (proposed)

Instead of querying for a specific genomic variant (e.g. an A instead of a G at position 7577121 on chromosome 17), Beacons could also employ a “range match” concept, in which all variants, or all variants of a specific type, that map to a genomic interval are identified.

Example wildcard range query using the Beacon+ demonstrator. Here, the DIPG dataset is queried for any reported variant consisting of a single, unspecified alternate base in the transcript region of the EIF4A1 gene.

Such a promiscuous variant matching approach could, for example, be implemented through a combination of wildcard and range parameters.

The query would then correspond to “match ANY variants occurring from HERE to THERE”, where “HERE to THERE” could correspond e.g. to the coding region of a gene of interest. Such a query, which could potentially match many different variants, would be especially powerful in combination with the handover [H->O] concept, in which e.g. all matching variants could be streamed back to the user.
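The “match ANY variants occurring from HERE to THERE” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Beacon+ implementation: the `Variant` class, the `range_match` and `alt_matches` functions, the use of `N` as a single-base wildcard, the inclusive interval bounds, and the interval 7577000-7578000 are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Variant:
    """Minimal variant record (hypothetical, for illustration only)."""
    reference_name: str
    start: int
    end: int
    reference_bases: str
    alternate_bases: str


def alt_matches(alt: str, pattern: Optional[str]) -> bool:
    """Compare an alternate allele to a pattern in which 'N' matches any base.

    A pattern of None places no constraint on the alternate allele.
    """
    if pattern is None:
        return True
    if len(alt) != len(pattern):
        return False
    return all(p == "N" or p == a for p, a in zip(pattern.upper(), alt.upper()))


def range_match(
    variants: Iterable[Variant],
    reference_name: str,
    range_start: int,
    range_end: int,
    alt_pattern: Optional[str] = None,
) -> List[Variant]:
    """Return variants lying entirely inside [range_start, range_end] (inclusive)."""
    return [
        v
        for v in variants
        if v.reference_name == reference_name
        and range_start <= v.start
        and v.end <= range_end
        and alt_matches(v.alternate_bases, alt_pattern)
    ]


EXAMPLE_VARIANTS = [
    Variant("17", 7577121, 7577121, "G", "A"),
    Variant("17", 7577538, 7577538, "C", "T"),
    Variant("17", 7577600, 7577600, "C", "CTT"),  # made-up insertion
    Variant("17", 7579000, 7579000, "G", "A"),    # made-up, outside the range
]

# Any single, unspecified alternate base inside an arbitrary example interval.
hits = range_match(EXAMPLE_VARIANTS, "17", 7577000, 7578000, alt_pattern="N")
print([(v.start, v.alternate_bases) for v in hits])
# prints [(7577121, 'A'), (7577538, 'T')]
```

Leaving `alt_pattern` as `None` in this sketch corresponds to the “all variants” case and would additionally return the made-up insertion at 7577600.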

Below is an excerpt of the variant [H->O] object returned for the combined biocharacteristics && wildcard base && range query shown in the figure.

Example [H->O] variant delivery

[
  {
    "digest": "DIPG_V_MAF_17_7577121_G_A",
    "callset_id": "DIPG_CS_0386",
    "biosample_id": "DIPG_BS_0386",
    "reference_name": "17",
    "reference_bases": "G",
    "alternate_bases": ["A"],
    "start": [7577121],
    "end": [7577121],
    "genotype": [1, "."]
  },
  {
    "digest": "DIPG_V_MAF_17_7577538_C_T",
    "callset_id": "DIPG_CS_0480",
    "biosample_id": "DIPG_BS_0480",
    "reference_name": "17",
    "reference_bases": "C",
    "alternate_bases": ["T"],
    "start": [7577538],
    "end": [7577538],
    "genotype": [1, "."]
  },
...
]
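On the receiving side, a client could process such a variant delivery roughly as follows. This is a minimal sketch that assumes the record layout of the excerpt above (list-valued `start` and `alternate_bases`); the helper names `variant_label` and `variants_by_biosample` are hypothetical, and the embedded JSON simply repeats the two records shown, with the remaining fields omitted for brevity.

```python
import json
from collections import defaultdict
from typing import Dict, List

# The two records from the excerpt, reduced to the fields used below.
HANDOVER_JSON = """
[
  {
    "digest": "DIPG_V_MAF_17_7577121_G_A",
    "biosample_id": "DIPG_BS_0386",
    "reference_name": "17",
    "reference_bases": "G",
    "alternate_bases": ["A"],
    "start": [7577121]
  },
  {
    "digest": "DIPG_V_MAF_17_7577538_C_T",
    "biosample_id": "DIPG_BS_0480",
    "reference_name": "17",
    "reference_bases": "C",
    "alternate_bases": ["T"],
    "start": [7577538]
  }
]
"""


def variant_label(record: dict) -> str:
    """Build a compact 'chromosome:position REF>ALT' label from one record."""
    return "{}:{} {}>{}".format(
        record["reference_name"],
        record["start"][0],
        record["reference_bases"],
        ",".join(record["alternate_bases"]),
    )


def variants_by_biosample(records: List[dict]) -> Dict[str, List[str]]:
    """Group variant labels by the biosample they were reported for."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        grouped[record["biosample_id"]].append(variant_label(record))
    return dict(grouped)


records = json.loads(HANDOVER_JSON)
for biosample_id, labels in variants_by_biosample(records).items():
    print(biosample_id, labels)
```

Grouping by `biosample_id` is just one possible view; the same records could equally be grouped by `callset_id` or by position.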

mbaudis, 2018-11-16