dipper.sources.Ensembl module

class dipper.sources.Ensembl.Ensembl(graph_type, are_bnodes_skolemized, data_release_version=None, tax_ids=None, gene_ids=None)

Bases: dipper.sources.Source.Source

This is the processing module for Ensembl.

It only includes methods to acquire the equivalences between NCBIGene and ENSG IDs using Ensembl’s BioMart services.

columns = {'bmq_attributes': ['ensembl_gene_id', 'external_gene_name', 'description', 'gene_biotype', 'entrezgene_id', 'ensembl_peptide_id', 'uniprotswissprot', 'hgnc_id'], 'bmq_headers': ['Gene stable ID', 'Gene name', 'Gene description', 'Gene type', 'NCBI gene (formerly Entrezgene) ID', 'Protein stable ID', 'UniProtKB/Swiss-Prot ID', 'HGNC ID']}

fetch(is_dl_forced=True)

Abstract method to fetch all data from an external resource. This should be overridden by subclasses.

:return: None
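As a rough illustration of the kind of request a BioMart fetch involves, the sketch below builds a BioMart XML query for the attribute names listed in `columns['bmq_attributes']`. It is not taken from this module: the helper name `build_biomart_query`, the dataset name `hsapiens_gene_ensembl`, and the chosen query options are assumptions made for the example, and no network call is made.

```python
import xml.etree.ElementTree as ET

# Attribute names copied from the `columns['bmq_attributes']` list.
BMQ_ATTRIBUTES = [
    'ensembl_gene_id', 'external_gene_name', 'description', 'gene_biotype',
    'entrezgene_id', 'ensembl_peptide_id', 'uniprotswissprot', 'hgnc_id',
]


def build_biomart_query(dataset, attributes):
    """Return a BioMart XML query string requesting `attributes` as TSV.

    Hypothetical helper: the option values below are assumptions,
    not settings confirmed by this module's documentation.
    """
    query = ET.Element('Query', {
        'virtualSchemaName': 'default',
        'formatter': 'TSV',
        'header': '0',
        'uniqueRows': '1',
        'datasetConfigVersion': '0.6',
    })
    dataset_node = ET.SubElement(
        query, 'Dataset', {'name': dataset, 'interface': 'default'})
    # One <Attribute> element per requested column, in output order.
    for name in attributes:
        ET.SubElement(dataset_node, 'Attribute', {'name': name})
    return ET.tostring(query, encoding='unicode')


# Assumed dataset name for human (taxon 9606).
query_xml = build_biomart_query('hsapiens_gene_ensembl', BMQ_ATTRIBUTES)
```

The resulting string would typically be sent as the `query` parameter of a request to a BioMart endpoint; which endpoint and dataset this class actually uses is not stated here.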

fetch_protein_gene_map(taxon_id)

Fetch a mapping from proteins to Ensembl gene(s) for a species in BioMart.

:param taxon_id:
:return: dict

fetch_uniprot_gene_map(taxon_id)

Fetch a dict mapping UniProt IDs to genes for a species in BioMart.

:param taxon_id:
:return: dict
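Both map-fetching methods return a dict keyed by an identifier. A minimal sketch of how such a dict could be built from tab-separated BioMart output follows. The helper `build_map`, the three-column layout, and the sample rows are all hypothetical; the identifiers are placeholders, not real Ensembl or UniProt records.

```python
import csv
import io


def build_map(tsv_text, key_col, value_col, columns):
    """Map each non-empty `key_col` value to its `value_col` value.

    Hypothetical helper; later rows overwrite earlier ones for the same key.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), fieldnames=columns, delimiter='\t')
    mapping = {}
    for row in reader:
        key, value = row[key_col], row[value_col]
        # Skip rows where either side of the mapping is missing.
        if key and value:
            mapping[key] = value
    return mapping


# Placeholder rows: gene ID, protein ID, UniProt ID (the last may be empty).
SAMPLE = (
    "ENSG_A\tENSP_A1\tUNIPROT_A\n"
    "ENSG_A\tENSP_A2\t\n"
    "ENSG_B\tENSP_B1\tUNIPROT_B\n"
)
COLS = ['ensembl_gene_id', 'ensembl_peptide_id', 'uniprotswissprot']

protein_to_gene = build_map(SAMPLE, 'ensembl_peptide_id', 'ensembl_gene_id', COLS)
uniprot_to_gene = build_map(SAMPLE, 'uniprotswissprot', 'ensembl_gene_id', COLS)
```

A plain dict keeps only one gene per key. If a protein or UniProt ID can map to several genes, a dict of lists or sets would be the safer structure.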

files = {'10090': {'file': 'ensembl_10090.txt'}, '10116': {'file': 'ensembl_10116.txt'}, '13616': {'file': 'ensembl_13616.txt'}, '28377': {'file': 'ensembl_28377.txt'}, '31033': {'file': 'ensembl_31033.txt'}, '3702': {'file': 'ensembl_3702.txt'}, '44689': {'file': 'ensembl_44689.txt'}, '4896': {'file': 'ensembl_4896.txt'}, '4932': {'file': 'ensembl_4932.txt'}, '6239': {'file': 'ensembl_6239.txt'}, '7227': {'file': 'ensembl_7227.txt'}, '7955': {'file': 'ensembl_7955.txt'}, '8364': {'file': 'ensembl_8364.txt'}, '9031': {'file': 'ensembl_9031.txt'}, '9258': {'file': 'ensembl_9258.txt'}, '9544': {'file': 'ensembl_9544.txt'}, '9606': {'file': 'ensembl_9606.txt'}, '9615': {'file': 'ensembl_9615.txt'}, '9796': {'file': 'ensembl_9796.txt'}, '9823': {'file': 'ensembl_9823.txt'}, '9913': {'file': 'ensembl_9913.txt'}}

getTestSuite()

An abstract method that should be overridden with tests appropriate for the specific source.

:return:

parse(limit=None)

Abstract method to parse all data from an external resource that was fetched in fetch(). This should be overridden by subclasses.

:return: None
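To make the NCBIGene/ENSG equivalence concrete, here is a minimal sketch of a parsing step over a fetched tab-separated file. It assumes the file columns follow the order of `columns['bmq_attributes']` and that local files follow the `ensembl_<taxon>.txt` pattern seen in `files`. The function names, the `ENSEMBL:` prefix, and the sample rows are hypothetical and are not this module's actual implementation.

```python
import csv
import io

# Column order assumed to match `columns['bmq_attributes']`.
COLUMNS = [
    'ensembl_gene_id', 'external_gene_name', 'description', 'gene_biotype',
    'entrezgene_id', 'ensembl_peptide_id', 'uniprotswissprot', 'hgnc_id',
]


def local_file_name(tax_id):
    """Mirror the naming pattern seen in the `files` attribute."""
    return 'ensembl_{}.txt'.format(tax_id)


def gene_equivalences(handle, limit=None):
    """Yield (Ensembl CURIE, NCBIGene CURIE) pairs from tab-separated rows.

    `limit` caps the number of rows read, counting rows that are skipped.
    """
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter='\t')
    for count, row in enumerate(reader):
        if limit is not None and count >= limit:
            break
        # Only rows carrying both identifiers express an equivalence.
        if row['ensembl_gene_id'] and row['entrezgene_id']:
            yield ('ENSEMBL:' + row['ensembl_gene_id'],
                   'NCBIGene:' + row['entrezgene_id'])


# Placeholder rows (made-up identifiers); the second has no NCBI gene ID.
ROWS = (
    "GENE_1\tname1\tdesc\tprotein_coding\t111\tPROT_1\tUP_1\tHGNC:1\n"
    "GENE_2\tname2\tdesc\tlncRNA\t\t\t\t\n"
)

pairs = list(gene_equivalences(io.StringIO(ROWS)))
file_name = local_file_name('9606')
```

In real use the handle would come from opening the local file named by `local_file_name` rather than from an in-memory string.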