dipper.sources.GeneOntology module
-
class dipper.sources.GeneOntology.GeneOntology(graph_type, are_bnodes_skolemized, data_release_version=None, tax_ids=None)
Bases: dipper.sources.Source.Source
This is the parser for the [Gene Ontology Annotations](http://www.geneontology.org), from which we process gene-process/function/subcellular location associations.
We generate the GO graph to include the following information:
* genes
* gene-process
* gene-function
* gene-location
We process only a subset of the organisms:
Status: IN PROGRESS / INCOMPLETE
-
fetch(is_dl_forced=False)
Abstract method to fetch all data from an external resource. This should be overridden by subclasses.
:return: None
-
files = {
  '10090': {'columnns': gaf_columns, 'file': 'mgi.gaf.gz', 'url': 'http://current.geneontology.org/annotations/mgi.gaf.gz'},
  '10116': {'columnns': gaf_columns, 'file': 'rgd.gaf.gz', 'url': 'http://current.geneontology.org/annotations/rgd.gaf.gz'},
  '4896': {'columnns': gaf_columns, 'file': 'pombase.gaf.gz', 'url': 'http://current.geneontology.org/annotations/pombase.gaf.gz'},
  '5052': {'columnns': gaf_columns, 'file': 'aspgd.gaf.gz', 'url': 'http://current.geneontology.org/annotations/aspgd.gaf.gz'},
  '559292': {'columnns': gaf_columns, 'file': 'sgd.gaf.gz', 'url': 'http://current.geneontology.org/annotations/sgd.gaf.gz'},
  '5782': {'columnns': gaf_columns, 'file': 'dictybase.gaf.gz', 'url': 'http://current.geneontology.org/annotations/dictybase.gaf.gz'},
  '6239': {'columnns': gaf_columns, 'file': 'wb.gaf.gz', 'url': 'http://current.geneontology.org/annotations/wb.gaf.gz'},
  '7227': {'columnns': gaf_columns, 'file': 'fb.gaf.gz', 'url': 'http://current.geneontology.org/annotations/fb.gaf.gz'},
  '7955': {'columnns': gaf_columns, 'file': 'zfin.gaf.gz', 'url': 'http://current.geneontology.org/annotations/zfin.gaf.gz'},
  '9031': {'columnns': gaf_columns, 'file': 'goa_chicken.gaf.gz', 'url': 'http://current.geneontology.org/annotations/goa_chicken.gaf.gz'},
  '9606': {'columnns': gaf_columns, 'file': 'goa_human.gaf.gz', 'url': 'http://current.geneontology.org/annotations/goa_human.gaf.gz'},
  '9615': {'columnns': gaf_columns, 'file': 'goa_dog.gaf.gz', 'url': 'http://current.geneontology.org/annotations/goa_dog.gaf.gz'},
  '9823': {'columnns': gaf_columns, 'file': 'goa_pig.gaf.gz', 'url': 'http://current.geneontology.org/annotations/goa_pig.gaf.gz'},
  '9913': {'columnns': gaf_columns, 'file': 'goa_cow.gaf.gz', 'url': 'http://current.geneontology.org/annotations/goa_cow.gaf.gz'},
  'gaf-eco-mapping': {'file': 'gaf-eco-mapping.yaml', 'url': 'https://archive.monarchinitiative.org/DipperCache/go/gaf-eco-mapping.yaml'},
  'idmapping_selected': {'columns': ['UniProtKB-AC', 'UniProtKB-ID', 'GeneID (EntrezGene)', 'RefSeq', 'GI', 'PDB', 'GO', 'UniRef100', 'UniRef90', 'UniRef50', 'UniParc', 'PIR', 'NCBI-taxon', 'MIM', 'UniGene', 'PubMed', 'EMBL', 'EMBL-CDS', 'Ensembl', 'Ensembl_TRS', 'Ensembl_PRO', 'Additional PubMed'], 'file': 'idmapping_selected.tab.gz', 'url': 'ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/idmapping_selected.tab.gz'},
}
Here gaf_columns abbreviates the 17-item column list that is repeated verbatim in each taxon-keyed entry; it is the same list given for the gaf_columns attribute. The key spelling 'columnns' in the taxon entries is reproduced as documented; only the idmapping_selected entry uses 'columns'.
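To illustrate how the files mapping is keyed, the sketch below looks up the download file name and URL for one taxon key. It works on a trimmed, hand-copied excerpt of the mapping rather than importing dipper, and the helper name gaf_source_for is hypothetical; it is not part of the GeneOntology class.

```python
# Trimmed excerpt of the documented `files` mapping (two taxon entries only).
# The 'columnns' lists are omitted here because the lookup does not need them.
FILES_EXCERPT = {
    '9606': {
        'file': 'goa_human.gaf.gz',
        'url': 'http://current.geneontology.org/annotations/goa_human.gaf.gz',
    },
    '10090': {
        'file': 'mgi.gaf.gz',
        'url': 'http://current.geneontology.org/annotations/mgi.gaf.gz',
    },
}


def gaf_source_for(tax_id, files=FILES_EXCERPT):
    """Hypothetical helper: return (file, url) for a taxon key, or None if absent."""
    # Keys in the mapping are strings, so accept ints as a convenience.
    entry = files.get(str(tax_id))
    if entry is None:
        return None
    return entry['file'], entry['url']


human = gaf_source_for(9606)
missing = gaf_source_for('7227')  # present in the full mapping, not in this excerpt
```

Returning None for an unknown key, rather than raising, is a choice made for this sketch only; how the class itself handles unsupported taxon IDs is not stated in this page.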
-
gaf_columns = ['DB', 'DB_Object_ID', 'DB_Object_Symbol', 'Qualifier', 'GO_ID', 'DB:Reference', 'Evidence Code', 'With (or) From', 'Aspect', 'DB_Object_Name', 'DB_Object_Synonym', 'DB_Object_Type', 'Taxon and Interacting taxon', 'Date', 'Assigned_By', 'Annotation_Extension', 'Gene_Product_Form_ID']
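As a rough illustration of how a column list like gaf_columns can be used, the sketch below maps one tab-separated GAF row onto those 17 names. It is not the body of process_gaf, whose implementation is not shown on this page. The function name parse_gaf_line is hypothetical, and the sample row is synthetic, made up only to exercise the code.

```python
GAF_COLUMNS = [
    'DB', 'DB_Object_ID', 'DB_Object_Symbol', 'Qualifier', 'GO_ID',
    'DB:Reference', 'Evidence Code', 'With (or) From', 'Aspect',
    'DB_Object_Name', 'DB_Object_Synonym', 'DB_Object_Type',
    'Taxon and Interacting taxon', 'Date', 'Assigned_By',
    'Annotation_Extension', 'Gene_Product_Form_ID',
]


def parse_gaf_line(line, columns=GAF_COLUMNS):
    """Hypothetical helper: return a column-name -> value dict for one row.

    Blank lines and GAF comment lines (starting with '!') yield None.
    """
    if not line.strip() or line.startswith('!'):
        return None
    # Strip only the newline so trailing empty fields (tabs) are preserved.
    fields = line.rstrip('\n').split('\t')
    if len(fields) != len(columns):
        raise ValueError(
            'expected %d fields, got %d' % (len(columns), len(fields)))
    return dict(zip(columns, fields))


# Synthetic row for illustration only; it is not taken from a real GAF file.
SAMPLE = '\t'.join([
    'UniProtKB', 'X00000', 'ABC1', '', 'GO:0005515', 'PMID:0000000', 'IDA',
    '', 'F', 'example protein', '', 'protein', 'taxon:9606', '20200101',
    'UniProt', '', '',
]) + '\n'

row = parse_gaf_line(SAMPLE)
header = parse_gaf_line('!gaf-version: 2.1\n')
```

Raising on a wrong field count is a strictness choice for the sketch; a production parser might prefer to log and skip such rows.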
-
getTestSuite()
An abstract method that should be overridden with tests appropriate for the specific source.
:return:
-
get_uniprot_entrez_id_map()
-
parse(limit=None)
Abstract method to parse all data from an external resource that was fetched in fetch(). This should be overridden by subclasses.
:return: None
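The fetch-then-parse contract described by these two docstrings can be sketched with stand-in classes. Everything below is hypothetical: SketchSource and InMemorySource are not dipper classes, and the meanings given to is_dl_forced (re-fetch even if a copy exists) and limit (cap on rows parsed) are assumptions inferred from the parameter names, not confirmed by this page.

```python
class SketchSource:
    """Hypothetical stand-in for a base source class; not dipper's Source."""

    def fetch(self, is_dl_forced=False):
        raise NotImplementedError('subclasses should override fetch()')

    def parse(self, limit=None):
        raise NotImplementedError('subclasses should override parse()')


class InMemorySource(SketchSource):
    """Hypothetical subclass that 'fetches' from a list instead of the network."""

    def __init__(self, remote_rows):
        self._remote_rows = list(remote_rows)
        self._fetched = None
        self.fetch_count = 0
        self.parsed = []

    def fetch(self, is_dl_forced=False):
        # Assumption: reuse the earlier copy unless a fresh download is forced.
        if self._fetched is None or is_dl_forced:
            self._fetched = list(self._remote_rows)
            self.fetch_count += 1

    def parse(self, limit=None):
        # parse() works on what fetch() retrieved, so fetch() must run first.
        if self._fetched is None:
            raise RuntimeError('call fetch() before parse()')
        # Assumption: limit caps how many rows are processed.
        rows = self._fetched if limit is None else self._fetched[:limit]
        self.parsed = [row.upper() for row in rows]


src = InMemorySource(['a', 'b', 'c'])
src.fetch()
src.fetch()              # second call reuses the copy; nothing is re-fetched
src.parse(limit=2)
```

The point of the sketch is only the ordering: a subclass overrides both methods, and parse() consumes what fetch() retrieved.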
-
process_gaf(gaffile, limit, id_map=None)
-
wont_prefix = ['zgc', 'wu', 'si', 'im', 'BcDNA', 'sb', 'anon-EST', 'EG', 'id', 'zmp', 'BEST', 'BG', 'hm', 'tRNA', 'NEST', 'xx']
-