dipper.sources.OMIM module

class dipper.sources.OMIM.OMIM(graph_type, are_bnodes_skolemized)

Bases: dipper.sources.Source.Source

The only data obtainable anonymously from the FTP site is mim2gene. More detailed information is available via their API, so we pull the OMIM identifiers from the FTP site, then query the API in batches of 20. Their prescribed rate limits have been mercurial: one request per two seconds, or four per second. As of November 2017, all mention of API rate limits has vanished (save 20 IDs per call if any include is used).
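The batching described above can be sketched roughly as follows. This is an illustration only, not the module's own code: `batched`, `fetch_all`, and the `fetch_batch` callback are hypothetical names, and the pause length is an assumed placeholder, since the published rate limits have changed over time.

```python
import time

BATCH_SIZE = 20  # the text above mentions 20 IDs per call


def batched(ids, size=BATCH_SIZE):
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def fetch_all(ids, fetch_batch, pause_seconds=0.0):
    """Call the (hypothetical) `fetch_batch` once per batch, pausing after each call."""
    results = []
    for group in batched(ids):
        results.extend(fetch_batch(group))
        if pause_seconds:
            time.sleep(pause_seconds)  # assumed throttle, not an OMIM-prescribed value
    return results
```

Passing the fetch function in as a callback keeps the batching logic testable without network access.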

Note that this ingest requires an API key, which is not stored in the repo but in a separate conf.json file.
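A minimal sketch of reading such a key from a JSON config file is shown here. The `{"keys": {"omim": "..."}}` layout and the `load_api_key` helper are assumptions made for illustration; the actual structure of conf.json is not documented in this section.

```python
import json
from pathlib import Path


def load_api_key(conf_path, source="omim"):
    """Return the key stored under keys -> <source> in a JSON config file.

    The nested "keys" layout is an assumed example, not a documented format.
    """
    with Path(conf_path).open(encoding="utf-8") as handle:
        config = json.load(handle)
    try:
        return config["keys"][source]
    except KeyError as err:
        raise KeyError(f"no API key for {source!r} in {conf_path}") from err
```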

Processing this source serves two purposes:
1. creating the OMIM classes for merging into the disease ontology
2. adding annotations such as disease-gene associations

When creating the disease classes, we pull id/label/definition information from their REST API. Additionally, we pull the Orphanet and UMLS mappings (to make equivalent ids). We also pull the phenotypic series annotations as grouping classes.

fetch(is_dl_forced=True)

Get the preconfigured static files. This DOES NOT fetch the individual records via REST; that is handled in the parsing function. (To be refactored.) Overrides Source.fetch() and calls Source.get_files().

files = {
    'all': {
        'url': 'https://omim.org/static/omim/data/mim2gene.txt',
        'file': 'mim2gene.txt',
        'clean': 'https://data.omim.org/downloads/'
    },
    'morbidmap': {
        'url': 'https://data.omim.org/downloads//morbidmap.txt',
        'file': 'morbidmap.txt',
        'clean': 'https://data.omim.org/downloads/'
    },
    'phenotypicSeries': {
        'url': 'https://omim.org/phenotypicSeriesTitle/all?format=tsv',
        'file': 'phenotypic_series_title_all.txt',
        'clean': 'https://data.omim.org/downloads/',
        'headers': {'User-Agent': 'The Monarch Initiative (https://monarchinitiative.org/; info@monarchinitiative.org)'}
    }
}
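
As a rough illustration of how a mapping shaped like `files` might be consumed, the sketch below turns each entry into a (url, local path, headers) tuple. It is not the behaviour of Source.get_files(): `download_plan` and the `raw/omim` directory are hypothetical, and only two of the entries are copied here.

```python
from pathlib import PurePosixPath

# Two entries copied from the `files` attribute above.
files = {
    'all': {
        'url': 'https://omim.org/static/omim/data/mim2gene.txt',
        'file': 'mim2gene.txt',
    },
    'phenotypicSeries': {
        'url': 'https://omim.org/phenotypicSeriesTitle/all?format=tsv',
        'file': 'phenotypic_series_title_all.txt',
        'headers': {'User-Agent': 'The Monarch Initiative (https://monarchinitiative.org/; info@monarchinitiative.org)'},
    },
}


def download_plan(files, rawdir):
    """Return one (url, local_path, headers) tuple per configured file."""
    plan = []
    for entry in files.values():
        local_path = str(PurePosixPath(rawdir) / entry['file'])
        # Entries without a 'headers' key get an empty dict.
        plan.append((entry['url'], local_path, entry.get('headers', {})))
    return plan
```
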
getTestSuite()

An abstract method that should be overridden with tests appropriate for the specific source.

parse(limit=None)

Abstract method to parse all data from an external resource that was fetched in fetch(). This should be overridden by subclasses.

Returns: None

process_entries(omimids, transform, included_fields=None, graph=None, limit=None)

Given a list of omim ids, this will use the omim API to fetch the entries, according to the `included_fields` passed as a parameter. If a transformation function is supplied, this will iterate over each entry, and either add the results to the supplied `graph` or will return a set of processed entries that the calling function can further iterate.

If no `included_fields` are provided, this will simply fetch the basic entry from omim, which includes an entry’s: prefix, mimNumber, status, and titles.
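
The two modes described above (adding to a supplied graph, or returning a set for the caller to iterate) can be sketched without any network access. Everything here is a stand-in: `process_entries_sketch` and `keep_mim_number` are hypothetical, entries are assumed to be plain dicts carrying the basic fields listed above, `titles` is simplified to a string, and a Python list plays the role of the graph. The `(entry, graph)` transform signature mirrors the module-level `filter_keep_phenotype_entry_ids(entry, graph=None)`, but whether process_entries calls its transform exactly this way is an assumption.

```python
def process_entries_sketch(entries, transform, graph=None):
    """Apply `transform` to each entry.

    With a graph, results are written into it by the transform;
    without one, non-None results are collected into a set and returned.
    """
    processed = set()
    for entry in entries:
        result = transform(entry, graph)
        if graph is None and result is not None:
            processed.add(result)
    return processed


def keep_mim_number(entry, graph=None):
    """Hypothetical transform: record or return the entry's mimNumber."""
    mim = entry["mimNumber"]
    if graph is not None:
        # A list of tuples stands in for a real graph object.
        graph.append(("OMIM:" + str(mim), "label", entry["titles"]))
        return None
    return mim
```

Called without a graph, the sketch returns the set of MIM numbers; called with one, it returns an empty set and the list accumulates one tuple per entry.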

Parameters:
  • omimids – the set of omim entry ids to fetch using their API
  • transform – Function to transform each omim entry when looping
  • included_fields – A set of what fields are required to retrieve from the API
  • graph – the graph to add the transformed data into
Returns:

test_ids = [119600, 120160, 157140, 158900, 166220, 168600, 219700, 253250, 305900, 600669, 601278, 602421, 605073, 607822, 102560, 102480, 100678, 102750, 600201, 104200, 105400, 114480, 115300, 121900, 107670, 11600, 126453, 102150, 104000, 107200, 100070, 611742, 611100, 102480]
dipper.sources.OMIM.filter_keep_phenotype_entry_ids(entry, graph=None)
dipper.sources.OMIM.get_omim_id_from_entry(entry)