dipper.sources.CTD module

class dipper.sources.CTD.CTD(graph_type, are_bnodes_skolemized, data_release_version=None)

Bases: dipper.sources.Source.Source

The Comparative Toxicogenomics Database (CTD) includes curated data describing cross-species chemical–gene/protein interactions, as well as chemical–disease and gene–disease associations, to illuminate the molecular mechanisms underlying variable susceptibility and environmentally influenced diseases. CTD is updated monthly.

Here, we fetch, parse, and convert data from CTD into triples, leveraging only the associations based on DIRECT evidence (not using the inferred associations). We currently process the following associations:

* chemical-disease
* gene-pathway
* gene-disease
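The direct-evidence restriction can be illustrated with a minimal sketch. It is not the module's actual parser: the helper name `direct_evidence_rows` and the sample rows are invented for illustration, and it assumes that a row counts as direct evidence when its `DirectEvidence` column is non-empty and that comment lines start with `#`. The column names are taken from the `files` attribute of this class.

```python
import csv
import io

# Column order copied from CTD.files['chemical_disease_associations']['columns'].
COLUMNS = ['ChemicalName', 'ChemicalID', 'CasRN', 'DiseaseName', 'DiseaseID',
           'DirectEvidence', 'InferenceGeneSymbol', 'InferenceScore',
           'OmimIDs', 'PubMedIDs']


def direct_evidence_rows(handle):
    """Yield rows (as dicts) whose DirectEvidence column is non-empty.

    Hypothetical helper: assumes tab-separated input and that lines
    beginning with '#' are comments to be skipped.
    """
    for fields in csv.reader(handle, delimiter='\t'):
        if not fields or fields[0].startswith('#'):
            continue
        row = dict(zip(COLUMNS, fields))
        if row.get('DirectEvidence', '').strip():
            yield row


# Made-up sample data: one direct association, one inferred association.
SAMPLE = (
    "# comment line\n"
    "ChemA\tC000001\t\tDiseaseX\tMESH:D000001\ttherapeutic\t\t\t\t12345\n"
    "ChemB\tC000002\t\tDiseaseY\tMESH:D000002\t\tGENE1\t4.2\t\t67890\n"
)

rows = list(direct_evidence_rows(io.StringIO(SAMPLE)))
```

In this sample only the first row survives, because the second row has an empty `DirectEvidence` value and carries only inference columns.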

CTD curates relationships between genes and chemicals/diseases, labeling each as either ‘marker/mechanism’ or ‘therapeutic’ (strictly one or the other). Unfortunately, we cannot disambiguate between marker (gene expression) and mechanism (causation) for these associations. Therefore, we relate them simply by “marker”.

# We DISCONTINUED this at some point prior to 202005:
# CTD also pulls in genes and pathway membership from KEGG and REACTOME.
# We create groups of these following the pattern that the specific pathway
# is a subclass of ‘cellular process’ (a GO process), and the gene is
# “involved in” that process.

For diseases, we preferentially use OMIM identifiers when they can be used uniquely over MESH. Otherwise, we use MESH ids.
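One way to read this rule is sketched below. This is an interpretation, not the module's code: the function name `pick_disease_id` is hypothetical, "uniquely" is assumed to mean that exactly one OMIM identifier is listed for the row, and the `OmimIDs` column is assumed to be pipe-delimited.

```python
def pick_disease_id(disease_id, omim_ids):
    """Return an OMIM CURIE when exactly one OMIM id is listed, else disease_id.

    Hypothetical helper. `disease_id` is the row's DiseaseID value
    (for example a MESH identifier); `omim_ids` is the raw OmimIDs value,
    assumed here to be a '|'-separated string that may be empty.
    """
    omims = [o.strip() for o in omim_ids.split('|') if o.strip()]
    if len(omims) == 1:
        # A single OMIM id maps unambiguously, so prefer it.
        return 'OMIM:' + omims[0]
    # Zero or several OMIM ids: fall back to the original identifier.
    return disease_id
```

Under these assumptions, a row with several OMIM ids keeps its MESH identifier, since no single OMIM id could stand in for it.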

Note that we scrub the following identifiers and their associated data:

* REACT:REACT_116125 - generic disease class
* MESH:D004283 - dog diseases
* MESH:D004195 - disease models, animal
* MESH:D030342 - genetic diseases, inborn
* MESH:D040181 - genetic diseases, x-linked
* MESH:D020022 - genetic predisposition to a disease
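The scrubbing step could be expressed as a simple set-membership check, as in the sketch below. The names `SCRUBBED_IDS` and `is_scrubbed` are hypothetical and are not claimed to exist in dipper; only the identifiers themselves come from the list above.

```python
# Identifiers listed in the class documentation as scrubbed.
SCRUBBED_IDS = frozenset({
    'REACT:REACT_116125',  # generic disease class
    'MESH:D004283',        # dog diseases
    'MESH:D004195',        # disease models, animal
    'MESH:D030342',        # genetic diseases, inborn
    'MESH:D040181',        # genetic diseases, x-linked
    'MESH:D020022',        # genetic predisposition to a disease
})


def is_scrubbed(*identifiers):
    """Return True if any of the given identifiers is on the scrub list.

    Hypothetical helper: a caller would skip the whole row (the
    identifier and its associated data) when this returns True.
    """
    return any(identifier in SCRUBBED_IDS for identifier in identifiers)
```

A parser built this way would test each row's disease or pathway identifier before emitting any triples for it.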

fetch(is_dl_forced=False)

Override Source.fetch()

Fetches resources from CTD using the CTD.files dictionary.

Args:
    :param is_dl_forced (bool): Force download

Returns:
    :return None
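To make the role of the files dictionary concrete, the sketch below turns such a dictionary into a list of download targets. It performs no network access and is not how `Source.fetch()` is actually implemented: the function `plan_downloads`, its `rawdir` and `already_present` parameters, and the rule "skip files that are already present unless forced" are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Trimmed copy of the CTD.files attribute ('columns' omitted for brevity).
FILES = {
    'chemical_disease_associations': {
        'file': 'CTD_chemicals_diseases.tsv.gz',
        'url': 'http://ctdbase.org/reports/CTD_chemicals_diseases.tsv.gz',
    },
}


def plan_downloads(files, rawdir, already_present=(), is_dl_forced=False):
    """Return (url, local_path) pairs for the files that should be fetched.

    Hypothetical helper: a file is skipped when its name appears in
    `already_present`, unless `is_dl_forced` is True.
    """
    targets = []
    for entry in files.values():
        if entry['file'] in already_present and not is_dl_forced:
            continue
        local_path = str(PurePosixPath(rawdir) / entry['file'])
        targets.append((entry['url'], local_path))
    return targets
```

For example, `plan_downloads(FILES, 'raw/ctd')` yields one target for the chemical-disease report, while passing the file name in `already_present` yields none unless `is_dl_forced=True`.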

files = {'chemical_disease_associations': {'columns': ['ChemicalName', 'ChemicalID', 'CasRN', 'DiseaseName', 'DiseaseID', 'DirectEvidence', 'InferenceGeneSymbol', 'InferenceScore', 'OmimIDs', 'PubMedIDs'], 'file': 'CTD_chemicals_diseases.tsv.gz', 'url': 'http://ctdbase.org/reports/CTD_chemicals_diseases.tsv.gz'}}

getTestSuite()

An abstract method that should be overridden with tests appropriate for the specific source.

:return:

parse(limit=None)

Override Source.parse()

Parses version and interaction information from CTD.

Args:
    :param limit (int, optional): limit the number of rows processed

Returns:
    :return None
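The effect of a row limit can be sketched with `itertools.islice`. This only illustrates the documented semantics (process at most `limit` rows, or all rows when `limit` is None); the helper name `take_rows` is hypothetical, and the real method may apply the limit differently.

```python
from itertools import islice


def take_rows(rows, limit=None):
    """Return an iterator over at most `limit` rows.

    Hypothetical helper: with limit=None every row is passed through,
    mirroring the default of parse(limit=None).
    """
    return islice(rows, limit)
```

Because `islice` is lazy, a small limit stops reading early, which is convenient when testing against a large compressed report.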