dipper.sources.OMIA module

class dipper.sources.OMIA.OMIA(graph_type, are_bnodes_skolemized)

Bases: dipper.sources.Source.Source

This is the parser for the [Online Mendelian Inheritance in Animals (OMIA)](http://omia.angis.org.au) database, from which we process inherited disorders, other (single-locus) traits, and genes in more than 200 animal species (other than human, mouse, and rat).

We generate the OMIA graph to include the following information:
  • genes
  • animal taxonomy, and breeds as instances of those taxa (breeds are akin to “strains” in other taxa)
  • animal diseases, along with species-specific subtypes of those diseases
  • publications (and their mapping to PMIDs, if available)
  • gene-to-phenotype associations (via an anonymous variant locus)
  • breed-to-phenotype associations

We make links between OMIA and OMIM in two ways:
  1. Mappings between OMIA and OMIM are created as OMIA --> hasdbXref OMIM.
  2. Mappings between a breed and an OMIA disease are created so that the breed is a model for the mapped OMIM disease, if and only if the OMIA-to-OMIM mapping is 1:1. Some mappings are 1:many, which often happens when the OMIM item is a gene.
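The 1:1 restriction described above can be illustrated with a minimal sketch. This is not the code used by the class: the function name `one_to_one_mappings` and all identifiers in the sample data are hypothetical, and the sketch assumes a mapping counts as 1:1 when an OMIA id is paired with exactly one OMIM id.

```python
from collections import defaultdict


def one_to_one_mappings(pairs):
    """Keep only OMIA ids that map to exactly one OMIM id.

    `pairs` is an iterable of (omia_id, omim_id) tuples. This helper is
    hypothetical and only illustrates the filtering rule.
    """
    omim_by_omia = defaultdict(set)
    for omia_id, omim_id in pairs:
        omim_by_omia[omia_id].add(omim_id)
    # An OMIA id with more than one OMIM id is a 1:many mapping and is dropped.
    return {
        omia_id: next(iter(omim_ids))
        for omia_id, omim_ids in omim_by_omia.items()
        if len(omim_ids) == 1
    }


# Made-up identifiers, for illustration only.
pairs = [
    ("OMIA:000001", "OMIM:100001"),
    ("OMIA:000002", "OMIM:100002"),
    ("OMIA:000002", "OMIM:100003"),
]
models = one_to_one_mappings(pairs)
```

In this sketch only the first OMIA id survives the filter, so only breeds associated with it would be asserted as models of an OMIM disease.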

Because many of these species are not covered in the PANTHER orthology datafiles, we also pull any orthology relationships from the gene_group files from NCBI.

clean_up_omim_genes()
fetch(is_dl_forced=False)
Parameters:is_dl_forced
Returns:
files = {'data': {'url': 'http://compldb.angis.org.au/dumps/omia.xml.gz', 'file': 'omia.xml.gz'}}
getTestSuite()

An abstract method that should be overridden with tests appropriate for the specific source.

make_breed_id(key)
map_omia_group_category_to_ontology_id(category_num)

Using the category number in the OMIA_groups table, map it to a disease id. This may be superseded by other MONDO methods.

Platelet disorders will be more specific once https://github.com/obophenotype/human-disease-ontology/issues/46 is fulfilled.

Parameters:category_num
Returns:
parse(limit=None)

Abstract method to parse all data from an external resource that was fetched in fetch(). This should be overridden by subclasses.

Returns: None

process_associations(limit)

Loop through the xml file and process the article-breed, article-phene, breed-phene, phene-gene associations, and the external links to LIDA.

Parameters:limit
Returns:
process_classes(limit)

Loop through the xml file and process the articles, breeds, genes, phenes, and phenotype-grouping classes. We add elements to the graph, and store the id-to-label in the label_hash dict, along with the internal key-to-external id in the id_hash dict. The latter are referenced in the association processing functions.

Parameters:limit
Returns:
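
As a rough illustration of the two lookup tables described for process_classes, the sketch below fills an `id_hash` and a `label_hash` from a tiny inline XML sample. It is a sketch under stated assumptions, not the class's implementation: the `table_data`/`row`/`field` layout, the table and field names, the `EXAMPLE` prefix, and the helper `build_hashes` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real omia.xml layout may differ.
SAMPLE = """<database>
  <table_data name="Breed">
    <row>
      <field name="breed_id">1</field>
      <field name="breed_name">Example breed</field>
    </row>
  </table_data>
</database>"""


def build_hashes(xml_text, table, key_field, label_field, prefix):
    """Return (id_hash, label_hash) for one table of the sample layout."""
    id_hash = {}     # internal key -> external id
    label_hash = {}  # external id -> label
    root = ET.fromstring(xml_text)
    for table_el in root.iter("table_data"):
        if table_el.get("name") != table:
            continue
        for row in table_el.iter("row"):
            # Collect the row's fields into a plain dict keyed by field name.
            fields = {f.get("name"): (f.text or "") for f in row.iter("field")}
            internal_key = fields[key_field]
            external_id = f"{prefix}:{internal_key}"
            id_hash[internal_key] = external_id
            label_hash[external_id] = fields[label_field]
    return id_hash, label_hash


id_hash, label_hash = build_hashes(
    SAMPLE, "Breed", "breed_id", "breed_name", "EXAMPLE"
)
```

A later association pass could then resolve an internal key through `id_hash` and look up a human-readable label in `label_hash`.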
process_species(limit)

Loop through the xml file and process the species. We add elements to the graph, and store the id-to-label in the label_hash dict.

Parameters:limit
Returns:

scrub()

The XML file seems to have mixed encoding; we scrub out the control characters from the file before processing.

For example, the XML parser reports:

omia.xml:1555328.28: PCDATA invalid Char value 2 <field name="journal">Bulletin et Memoires de la Societe Centrale de Medic

Returns:
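
One way to scrub such characters is sketched below. This is an assumption about the approach, not the method's actual code: the helper name `scrub_text` is hypothetical, and the pattern removes the control characters that XML 1.0 disallows (everything below 0x20 except tab, line feed, and carriage return).

```python
import re

# Control characters that are not valid in XML 1.0 documents.
# Tab (0x09), line feed (0x0A), and carriage return (0x0D) are kept.
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def scrub_text(text):
    """Return `text` with invalid XML control characters removed."""
    return _INVALID_XML_CHARS.sub("", text)


# A made-up line containing the "Char value 2" control character.
dirty = "Bulletin\x02 et Memoires\n"
clean = scrub_text(dirty)
```

Applying the same substitution line by line while copying the file would leave tabs and newlines intact, so the document structure is unchanged.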
write_molgen_report()