dipper.sources.GWASCatalog module

class dipper.sources.GWASCatalog.GWASCatalog(graph_type, are_bnodes_skolemized, data_release_version=None)

Bases: dipper.sources.Source.Source

The NHGRI-EBI Catalog of published genome-wide association studies.

We link the variants recorded here to the curated EFO-classes using a “contributes to” linkage because the only thing we know is that the SNPs are associated with the trait/disease, but we don’t know if it is actually causative.

Description of the GWAS catalog is here: http://www.ebi.ac.uk/gwas/docs/fileheaders#_file_headers_for_catalog_version_1_0_1

GWAS also pulishes Owl files described here http://www.ebi.ac.uk/gwas/docs/ontology

Status: IN PROGRESS

GWASFILE = 'gwas-catalog-associations_ontology-annotated.tsv'
GWASFTP = 'ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/latest/'
fetch(is_dl_forced=False)
Parameters:is_dl_forced
Returns:
files = {'catalog': {'columns': ['DATE ADDED TO CATALOG', 'PUBMEDID', 'FIRST AUTHOR', 'DATE', 'JOURNAL', 'LINK', 'STUDY', 'DISEASE/TRAIT', 'INITIAL SAMPLE SIZE', 'REPLICATION SAMPLE SIZE', 'REGION', 'CHR_ID', 'CHR_POS', 'REPORTED GENE(S)', 'MAPPED_GENE', 'UPSTREAM_GENE_ID', 'DOWNSTREAM_GENE_ID', 'SNP_GENE_IDS', 'UPSTREAM_GENE_DISTANCE', 'DOWNSTREAM_GENE_DISTANCE', 'STRONGEST SNP-RISK ALLELE', 'SNPS', 'MERGED', 'SNP_ID_CURRENT', 'CONTEXT', 'INTERGENIC', 'RISK ALLELE FREQUENCY', 'P-VALUE', 'PVALUE_MLOG', 'P-VALUE (TEXT)', 'OR or BETA', '95% CI (TEXT)', 'PLATFORM [SNPS PASSING QC]', 'CNV', 'MAPPED_TRAIT', 'MAPPED_TRAIT_URI', 'STUDY ACCESSION', 'GENOTYPING TECHNOLOGY'], 'file': 'gwas-catalog-associations_ontology-annotated.tsv', 'url': 'ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/latest/gwas-catalog-associations_ontology-annotated.tsv'}, 'mondo': {'file': 'mondo.json', 'url': 'https://github.com/monarch-initiative/mondo/releases/download/2019-04-06/mondo-minimal.json'}, 'so': {'file': 'so.owl', 'url': 'http://purl.obolibrary.org/obo/so.owl'}}
parse(limit=None)

abstract method to parse all data from an external resource, that was fetched in fetch() this should be overridden by subclasses :return: None

process_catalog(limit=None)
Parameters:limit
Returns: