dipper.sources.MPD module

class dipper.sources.MPD.MPD(graph_type, are_bnodes_skolemized, data_release_version=None)

Bases: dipper.sources.Source.Source

From the [MPD](http://phenome.jax.org/) website: This resource is a collaborative standardized collection of measured data on laboratory mouse strains and populations. Includes baseline phenotype data sets as well as studies of drug, diet, disease and aging effect. Also includes protocols, projects and publications, and SNP, variation and gene expression studies.

Here, we pull the data and model the genotypes using GENO and the genotype-to-phenotype associations using the OBAN schema.

MPD provides measurements for particular assays across several strains. Each of these measurements is itself mapped to an MP or VT term as a phenotype. We can therefore create a strain-to-phenotype association for those strains that lie outside the “normal” range for a given measurement: we compute the average of the measurements across all strains tested, and then flag any measurement that lies more than some threshold away from that average.

Our default threshold is +/-2 standard deviations from the mean.

Because the measurements are made and recorded at the level of a specific sex of each strain, we associate the MP/VT phenotype with the sex-qualified genotype/strain.
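The thresholding described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function name `flag_outliers`, the dict-based row shape, and the choice to group by measurement and sex before computing the mean and sample standard deviation are all assumptions. Note also that the `strainmeans` file lists a `zscore` column, so a parser may be able to read that value directly instead of recomputing it.

```python
from statistics import mean, stdev


def flag_outliers(rows, threshold=2.0):
    """Return (strainid, sex, measnum, z) for strain means far from the average.

    `rows` is an iterable of dicts with the keys 'measnum', 'strainid',
    'sex' and 'mean' (hypothetical shape, loosely modelled on the
    strainmeans columns). Grouping by (measnum, sex) is an assumption.
    """
    # Collect the per-strain means for each measurement/sex combination.
    groups = {}
    for row in rows:
        groups.setdefault((row["measnum"], row["sex"]), []).append(row)

    flagged = []
    for (measnum, sex), members in groups.items():
        values = [m["mean"] for m in members]
        if len(values) < 2:
            continue  # a standard deviation needs at least two values
        mu = mean(values)
        sd = stdev(values)
        if sd == 0:
            continue  # all strains identical, nothing can be an outlier
        for m in members:
            z = (m["mean"] - mu) / sd
            if abs(z) >= threshold:
                flagged.append((m["strainid"], sex, measnum, z))
    return flagged
```

Each returned tuple keeps the sex alongside the strain identifier, which mirrors the idea of attaching the phenotype to the sex-qualified strain rather than to the strain as a whole.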

MPDDL = 'http://phenomedoc.jax.org/MPD_downloads'
static build_measurement_description(row, localtt)
fetch(is_dl_forced=False)

Abstract method to fetch all data from an external resource. This should be overridden by subclasses.

Returns: None

files = {
    'assay_metadata': {
        'columns': ['measnum', 'mpdsector', 'projsym', 'varname', 'descrip', 'units', 'method', 'intervention', 'paneldesc', 'datatype', 'sextested', 'nstrainstested', 'ageweeks'],
        'file': 'measurements.csv',
        'url': 'http://phenomedoc.jax.org/MPD_downloads/measurements.csv'},
    'ontology_mappings': {
        'columns': ['measnum', 'ont_term', 'descrip'],
        'file': 'ontology_mappings.csv',
        'url': 'http://phenomedoc.jax.org/MPD_downloads/ontology_mappings.csv'},
    'straininfo': {
        'columns': ['strainname', 'vendor', 'stocknum', 'panel', 'mpd_strainid', 'straintype', 'n_proj', 'n_snp_datasets', 'mpd_shortname', 'url'],
        'file': 'straininfo.csv',
        'url': 'http://phenomedoc.jax.org/MPD_downloads/straininfo.csv'},
    'strainmeans': {
        'columns': ['measnum', 'varname', 'strain', 'strainid', 'sex', 'mean', 'nmice', 'sd', 'sem', 'cv', 'minval', 'maxval', 'zscore'],
        'file': 'strainmeans.csv.gz',
        'url': 'http://phenomedoc.jax.org/MPD_downloads/strainmeans.csv.gz'}}
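One plausible use of the `columns` lists is to check that a downloaded file still has the expected header before parsing it. The sketch below assumes that usage; the helper `read_rows` and the constant `EXPECTED_COLUMNS` are hypothetical names, not part of dipper, and only the `ontology_mappings` column list is copied from the `files` attribute.

```python
import csv

# Column list copied from the 'ontology_mappings' entry of `files`.
EXPECTED_COLUMNS = {
    "ontology_mappings": ["measnum", "ont_term", "descrip"],
}


def read_rows(handle, expected_columns):
    """Yield each data row as a dict after verifying the header line.

    `handle` is any open text file object. A ValueError is raised when
    the header differs from `expected_columns`, which would indicate
    that the upstream file layout has changed.
    """
    reader = csv.reader(handle)
    header = next(reader, None)
    if header != list(expected_columns):
        raise ValueError(
            "unexpected header: got {!r}, expected {!r}".format(
                header, list(expected_columns)))
    for row in reader:
        yield dict(zip(expected_columns, row))
```

Because `read_rows` is a generator, the header check only runs once iteration begins, for example via `list(read_rows(handle, EXPECTED_COLUMNS["ontology_mappings"]))`.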
getTestSuite()

An abstract method that should be overridden with tests appropriate for the specific source. :return:

mgd_agent_id = 'MPD:db/q?rtn=people/allinv'
mgd_agent_label = 'Mouse Phenotype Database'
mgd_agent_type = 'foaf:organization'
parse(limit=None)

MPD data is delivered in four separate CSV files and one XML file, which we process iteratively and write out as one large graph.

Parameters: limit
Returns:
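The `limit` parameter is not described here. A common convention, assumed in this sketch, is that it caps the number of rows read from each file, with `None` meaning no cap. The helper `iter_rows` is a hypothetical name and shows one way to apply such a cap to a gzipped CSV file like `strainmeans.csv.gz`; it is not taken from the module itself.

```python
import csv
import gzip
import itertools


def iter_rows(path, limit=None):
    """Yield at most `limit` data rows from a gzipped CSV file as dicts.

    With limit=None every row is yielded. The first line of the file is
    treated as the header and supplies the dict keys.
    """
    # Open in text mode; newline="" lets the csv module handle line endings.
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle)
        # islice stops after `limit` rows, or never when limit is None.
        yield from itertools.islice(reader, limit)
```

Reading lazily in this way keeps memory use flat regardless of file size, which suits the one-file-at-a-time processing described for `parse`.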
test_ids = ['MPD:6', 'MPD:849', 'MPD:425', 'MPD:569', 'MPD:10', 'MPD:1002', 'MPD:39', 'MPD:2319']