dipper.sources.AnimalQTLdb module

class dipper.sources.AnimalQTLdb.AnimalQTLdb(graph_type, are_bnodes_skolemized, data_release_version=None)

Bases: dipper.sources.Source.Source

The Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house all publicly available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. This includes the following species:

  • chicken
  • horse
  • cow
  • sheep
  • rainbow trout
  • pig

While most of the phenotypes here are related to animal husbandry, production, and rearing, integration of these phenotypes with other species may lead to insight for human disease.

Here, we use the QTL genetic maps and their computed genomic locations to create associations between the QTLs and their traits. The traits are expressed in Animal QTLdb's internal Animal Trait Ontology vocabulary, which the database further maps to the [Vertebrate Trait](http://bioportal.bioontology.org/ontologies/VT), Product Trait, and Clinical Measurement Ontology vocabularies.

Because these are only associations to broad locations, with no specific causative nature in the association, we link the traits via “is_marker_for”. P-values for the associations are attached to the Association objects. We default to the UCSC build for the genomic coordinates, and make equivalences.
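As a rough illustration of the shape of such an association, the sketch below models a QTL-to-trait link carrying the “is_marker_for” relation and an optional p-value. The `QtlTraitAssociation` class, its field names, and the identifiers used are hypothetical and are not part of dipper's API; dipper's own Association objects may be organized differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QtlTraitAssociation:
    """Hypothetical container for one QTL-to-trait association."""
    qtl_id: str                         # e.g. a CURIE such as 'cattleQTL:<id>'
    trait_id: str                       # trait term in one of the mapped vocabularies
    predicate: str = "is_marker_for"    # non-causal relation described above
    p_value: Optional[float] = None     # attached when the source reports one


# Illustrative values only; these identifiers are placeholders.
assoc = QtlTraitAssociation(
    qtl_id="cattleQTL:1795",
    trait_id="VT:0000000",
    p_value=0.01,
)
print(assoc)
```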

Genetic position ranges that are less than 0 are not included here.
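A minimal sketch of that exclusion, assuming each range has already been parsed into numeric start and end values and that a range is dropped when either end is below zero. The `keep_range` helper and the sample values are hypothetical; the actual check in dipper may differ.

```python
def keep_range(start: float, end: float) -> bool:
    """Return True when neither end of the range is negative (assumed rule)."""
    return start >= 0 and end >= 0


# Made-up (start, end) pairs for illustration.
ranges = [(12.5, 40.0), (-3.0, 10.0), (0.0, 5.5)]
kept = [r for r in ranges if keep_range(*r)]
print(kept)
```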

GENEINFO = 'ftp://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO'
GITDIP = 'https://raw.githubusercontent.com/monarch-initiative/dipper/master'
fetch(is_dl_forced=False)

Abstract method to fetch all data from an external resource. This should be overridden by subclasses.

Returns: None

files = {'Bos_taurus_info': {'columns': ['tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs', 'chromosome', 'map_location', 'description', 'type_of_gene', 'Symbol_from_nomenclature_authority', 'Full_name_from_nomenclature_authority', 'Nomenclature_status', 'Other_designations', 'Modification_date', 'Feature_type'], 'file': 'Bos_taurus.gene_info.gz', 'url': 'ftp://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/Mammalia/Bos_taurus.gene_info.gz'}, 'Equus_caballus_info': {'columns': ['tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs', 'chromosome', 'map_location', 'description', 'type_of_gene', 'Symbol_from_nomenclature_authority', 'Full_name_from_nomenclature_authority', 'Nomenclature_status', 'Other_designations', 'Modification_date', 'Feature_type'], 'file': 'Equus_caballus.gene_info.gz', 'url': 'https://archive.monarchinitiative.org/DipperCache/Equus_caballus.gene_info.gz'}, 'Gallus_gallus_info': {'columns': ['tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs', 'chromosome', 'map_location', 'description', 'type_of_gene', 'Symbol_from_nomenclature_authority', 'Full_name_from_nomenclature_authority', 'Nomenclature_status', 'Other_designations', 'Modification_date', 'Feature_type'], 'file': 'Gallus_gallus.gene_info.gz', 'url': 'ftp://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/Non-mammalian_vertebrates/Gallus_gallus.gene_info.gz'}, 'Oncorhynchus_mykiss_info': {'columns': ['tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs', 'chromosome', 'map_location', 'description', 'type_of_gene', 'Symbol_from_nomenclature_authority', 'Full_name_from_nomenclature_authority', 'Nomenclature_status', 'Other_designations', 'Modification_date', 'Feature_type'], 'file': 'Oncorhynchus_mykiss.gene_info.gz', 'url': 'https://archive.monarchinitiative.org/DipperCache/Oncorhynchus_mykiss.gene_info.gz'}, 'Ovis_aries_info': {'columns': ['tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs', 'chromosome', 'map_location', 'description', 'type_of_gene', 
'Symbol_from_nomenclature_authority', 'Full_name_from_nomenclature_authority', 'Nomenclature_status', 'Other_designations', 'Modification_date', 'Feature_type'], 'file': 'Ovis_aries.gene_info.gz', 'url': 'https://archive.monarchinitiative.org/DipperCache/Ovis_aries.gene_info.gz'}, 'Sus_scrofa_info': {'columns': ['tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs', 'chromosome', 'map_location', 'description', 'type_of_gene', 'Symbol_from_nomenclature_authority', 'Full_name_from_nomenclature_authority', 'Nomenclature_status', 'Other_designations', 'Modification_date', 'Feature_type'], 'file': 'Sus_scrofa.gene_info.gz', 'url': 'ftp://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/Mammalia/Sus_scrofa.gene_info.gz'}, 'cattle_bp': {'columns': ['SEQNAME', 'SOURCE', 'FEATURE', 'START', 'END', 'SCORE', 'STRAND', 'FRAME', 'ATTRIBUTE'], 'curie': 'cattleQTL', 'file': 'QTL_Btau_4.6.gff.txt.gz', 'url': 'https://www.animalgenome.org/QTLdb/tmp/QTL_Btau_4.6.gff.txt.gz'}, 'cattle_cm': {'columns': ['QTL_ID', 'QTL_symbol', 'Trait_name', 'assotype', '(empty)', 'Chromosome', 'Position_cm', 'range_cm', 'FlankMark_A2', 'FlankMark_A1', 'Peak_Mark', 'FlankMark_B1', 'FlankMark_B2', 'Exp_ID', 'Model', 'testbase', 'siglevel', 'LOD_score', 'LS_mean', 'P_values', 'F_Statistics', 'VARIANCE', 'Bayes_value', 'LikelihoodR', 'TRAIT_ID', 'Dom_effect', 'Add_effect', 'PUBMED_ID', 'geneID', 'geneIDsrc', 'geneIDtype'], 'curie': 'cattleQTL', 'file': 'cattle_QTLdata.txt', 'url': 'https://www.animalgenome.org/QTLdb/export/KSUI8GFHOT6/cattle_QTLdata.txt'}, 'chicken_bp': {'columns': ['SEQNAME', 'SOURCE', 'FEATURE', 'START', 'END', 'SCORE', 'STRAND', 'FRAME', 'ATTRIBUTE'], 'curie': 'chickenQTL', 'file': 'QTL_GG_5.0.gff.txt.gz', 'url': 'https://www.animalgenome.org/QTLdb/tmp/QTL_GG_5.0.gff.txt.gz'}, 'chicken_cm': {'columns': ['QTL_ID', 'QTL_symbol', 'Trait_name', 'assotype', '(empty)', 'Chromosome', 'Position_cm', 'range_cm', 'FlankMark_A2', 'FlankMark_A1', 'Peak_Mark', 'FlankMark_B1', 'FlankMark_B2', 
'Exp_ID', 'Model', 'testbase', 'siglevel', 'LOD_score', 'LS_mean', 'P_values', 'F_Statistics', 'VARIANCE', 'Bayes_value', 'LikelihoodR', 'TRAIT_ID', 'Dom_effect', 'Add_effect', 'PUBMED_ID', 'geneID', 'geneIDsrc', 'geneIDtype'], 'curie': 'chickenQTL', 'file': 'chicken_QTLdata.txt', 'url': 'https://www.animalgenome.org/QTLdb/export/KSUI8GFHOT6/chicken_QTLdata.txt'}, 'horse_bp': {'columns': ['SEQNAME', 'SOURCE', 'FEATURE', 'START', 'END', 'SCORE', 'STRAND', 'FRAME', 'ATTRIBUTE'], 'curie': 'horseQTL', 'file': 'QTL_EquCab2.0.gff.txt.gz', 'url': 'https://www.animalgenome.org/QTLdb/tmp/QTL_EquCab2.0.gff.txt.gz'}, 'horse_cm': {'columns': ['QTL_ID', 'QTL_symbol', 'Trait_name', 'assotype', '(empty)', 'Chromosome', 'Position_cm', 'range_cm', 'FlankMark_A2', 'FlankMark_A1', 'Peak_Mark', 'FlankMark_B1', 'FlankMark_B2', 'Exp_ID', 'Model', 'testbase', 'siglevel', 'LOD_score', 'LS_mean', 'P_values', 'F_Statistics', 'VARIANCE', 'Bayes_value', 'LikelihoodR', 'TRAIT_ID', 'Dom_effect', 'Add_effect', 'PUBMED_ID', 'geneID', 'geneIDsrc', 'geneIDtype'], 'curie': 'horseQTL', 'file': 'horse_QTLdata.txt', 'url': 'https://www.animalgenome.org/QTLdb/export/KSUI8GFHOT6/horse_QTLdata.txt'}, 'pig_bp': {'columns': ['SEQNAME', 'SOURCE', 'FEATURE', 'START', 'END', 'SCORE', 'STRAND', 'FRAME', 'ATTRIBUTE'], 'curie': 'pigQTL', 'file': 'QTL_SS_11.1.gff.txt.gz', 'url': 'https://www.animalgenome.org/QTLdb/tmp/QTL_SS_11.1.gff.txt.gz'}, 'pig_cm': {'columns': ['QTL_ID', 'QTL_symbol', 'Trait_name', 'assotype', '(empty)', 'Chromosome', 'Position_cm', 'range_cm', 'FlankMark_A2', 'FlankMark_A1', 'Peak_Mark', 'FlankMark_B1', 'FlankMark_B2', 'Exp_ID', 'Model', 'testbase', 'siglevel', 'LOD_score', 'LS_mean', 'P_values', 'F_Statistics', 'VARIANCE', 'Bayes_value', 'LikelihoodR', 'TRAIT_ID', 'Dom_effect', 'Add_effect', 'PUBMED_ID', 'geneID', 'geneIDsrc', 'geneIDtype'], 'curie': 'pigQTL', 'file': 'pig_QTLdata.txt', 'url': 'https://www.animalgenome.org/QTLdb/export/KSUI8GFHOT6/pig_QTLdata.txt'}, 'rainbow_trout_cm': 
{'columns': ['QTL_ID', 'QTL_symbol', 'Trait_name', 'assotype', '(empty)', 'Chromosome', 'Position_cm', 'range_cm', 'FlankMark_A2', 'FlankMark_A1', 'Peak_Mark', 'FlankMark_B1', 'FlankMark_B2', 'Exp_ID', 'Model', 'testbase', 'siglevel', 'LOD_score', 'LS_mean', 'P_values', 'F_Statistics', 'VARIANCE', 'Bayes_value', 'LikelihoodR', 'TRAIT_ID', 'Dom_effect', 'Add_effect', 'PUBMED_ID', 'geneID', 'geneIDsrc', 'geneIDtype'], 'curie': 'rainbow_troutQTL', 'file': 'rainbow_trout_QTLdata.txt', 'url': 'https://www.animalgenome.org/QTLdb/export/KSUI8GFHOT6/rainbow_trout_QTLdata.txt'}, 'sheep_bp': {'columns': ['SEQNAME', 'SOURCE', 'FEATURE', 'START', 'END', 'SCORE', 'STRAND', 'FRAME', 'ATTRIBUTE'], 'curie': 'sheepQTL', 'file': 'QTL_OAR_4.0.gff.txt.gz', 'url': 'https://www.animalgenome.org/QTLdb/tmp/QTL_OAR_4.0.gff.txt.gz'}, 'sheep_cm': {'columns': ['QTL_ID', 'QTL_symbol', 'Trait_name', 'assotype', '(empty)', 'Chromosome', 'Position_cm', 'range_cm', 'FlankMark_A2', 'FlankMark_A1', 'Peak_Mark', 'FlankMark_B1', 'FlankMark_B2', 'Exp_ID', 'Model', 'testbase', 'siglevel', 'LOD_score', 'LS_mean', 'P_values', 'F_Statistics', 'VARIANCE', 'Bayes_value', 'LikelihoodR', 'TRAIT_ID', 'Dom_effect', 'Add_effect', 'PUBMED_ID', 'geneID', 'geneIDsrc', 'geneIDtype'], 'curie': 'sheepQTL', 'file': 'sheep_QTLdata.txt', 'url': 'https://www.animalgenome.org/QTLdb/export/KSUI8GFHOT6/sheep_QTLdata.txt'}, 'trait_mappings': {'columns': ['VT', 'LPT', 'CMO', 'ATO', 'Species', 'Class', 'Type', 'QTL_Count'], 'file': 'trait_mappings.csv', 'url': 'https://www.animalgenome.org/QTLdb/export/trait_mappings.csv'}}
gene_info_columns = ['tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs', 'chromosome', 'map_location', 'description', 'type_of_gene', 'Symbol_from_nomenclature_authority', 'Full_name_from_nomenclature_authority', 'Nomenclature_status', 'Other_designations', 'Modification_date', 'Feature_type']
getTestSuite()

An abstract method that should be overridden with tests appropriate for the specific source.

gff_columns = ['SEQNAME', 'SOURCE', 'FEATURE', 'START', 'END', 'SCORE', 'STRAND', 'FRAME', 'ATTRIBUTE']
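
To show how a column list like `gff_columns` can be applied, the sketch below maps one tab-separated GFF line onto those names and splits the `ATTRIBUTE` field into key/value pairs. The `parse_gff_line` helper and the sample line are hypothetical, and the `key=value;key=value` attribute layout is an assumption about the file contents, not something this page documents.

```python
gff_columns = ['SEQNAME', 'SOURCE', 'FEATURE', 'START', 'END',
               'SCORE', 'STRAND', 'FRAME', 'ATTRIBUTE']


def parse_gff_line(line: str) -> dict:
    """Map one tab-separated GFF line onto the gff_columns names."""
    fields = line.rstrip('\n').split('\t')
    if len(fields) != len(gff_columns):
        raise ValueError(
            f"expected {len(gff_columns)} fields, got {len(fields)}")
    return dict(zip(gff_columns, fields))


# Made-up sample line; real files may differ in detail.
sample = ("Chr.1\tAnimal QTLdb\tExample_trait\t1000\t2000\t.\t.\t.\t"
          "QTL_ID=1795;Name=example")
record = parse_gff_line(sample)

# Assumed attribute layout: semicolon-separated key=value pairs.
attrs = dict(kv.split('=', 1)
             for kv in record['ATTRIBUTE'].split(';') if '=' in kv)
print(record['START'], record['END'], attrs)
```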
parse(limit=None)
Parameters: limit
qtl_columns = ['QTL_ID', 'QTL_symbol', 'Trait_name', 'assotype', '(empty)', 'Chromosome', 'Position_cm', 'range_cm', 'FlankMark_A2', 'FlankMark_A1', 'Peak_Mark', 'FlankMark_B1', 'FlankMark_B2', 'Exp_ID', 'Model', 'testbase', 'siglevel', 'LOD_score', 'LS_mean', 'P_values', 'F_Statistics', 'VARIANCE', 'Bayes_value', 'LikelihoodR', 'TRAIT_ID', 'Dom_effect', 'Add_effect', 'PUBMED_ID', 'geneID', 'geneIDsrc', 'geneIDtype']
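
The following sketch shows one way a row of a `*_QTLdata.txt` export could be turned into a dictionary keyed by `qtl_columns`. It assumes the export is tab-delimited with one field per column; the sample row is fabricated for illustration and the variable names are hypothetical.

```python
import csv
import io

qtl_columns = [
    'QTL_ID', 'QTL_symbol', 'Trait_name', 'assotype', '(empty)',
    'Chromosome', 'Position_cm', 'range_cm', 'FlankMark_A2', 'FlankMark_A1',
    'Peak_Mark', 'FlankMark_B1', 'FlankMark_B2', 'Exp_ID', 'Model',
    'testbase', 'siglevel', 'LOD_score', 'LS_mean', 'P_values',
    'F_Statistics', 'VARIANCE', 'Bayes_value', 'LikelihoodR', 'TRAIT_ID',
    'Dom_effect', 'Add_effect', 'PUBMED_ID', 'geneID', 'geneIDsrc',
    'geneIDtype']

# Build one made-up row: a few populated fields, the rest left empty.
values = {'QTL_ID': '1795', 'Chromosome': '1', 'P_values': '<0.05'}
sample_row = '\t'.join(values.get(col, '') for col in qtl_columns)

# Assumed layout: tab-delimited text, one QTL per line.
reader = csv.reader(io.StringIO(sample_row), delimiter='\t')
records = [dict(zip(qtl_columns, row))
           for row in reader
           if len(row) == len(qtl_columns)]  # skip malformed lines

print(records[0]['QTL_ID'], records[0]['P_values'])
```

Keying each row by column name rather than by position keeps downstream lookups readable if the column order ever changes.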
test_ids = {1795, 1798, 8945, 12532, 14234, 17138, 28483, 29016, 29018, 29385, 31023, 32133}
trait_mapping_columns = ['VT', 'LPT', 'CMO', 'ATO', 'Species', 'Class', 'Type', 'QTL_Count']