dipper.sources.UDP module

class dipper.sources.UDP.UDP(graph_type, are_bnodes_skolemized, data_release_version=None)

Bases: dipper.sources.Source.Source

The National Institutes of Health (NIH) Undiagnosed Diseases Program (UDP) is part of the Undiagnosed Diseases Network (UDN), an NIH Common Fund initiative that focuses on the most puzzling medical cases referred to the NIH Clinical Center in Bethesda, Maryland. Source: https://www.genome.gov/27544402/the-undiagnosed-diseases-program/

Access to the data is available by request via the NHGRI collaboration server: https://udplims-collab.nhgri.nih.gov/api

Note: the fetcher requires credentials for the UDP collaboration server. Credentials are added via a config file, config.json, in the following format:

    {
        "dbauth": {
            "udp": {
                "user": "foo",
                "password": "bar"
            }
        }
    }

See dipper/config.py for more information.
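As a minimal sketch of how credentials in this layout could be read, the snippet below parses config.json-style text with Python's standard json module. The helper name read_udp_credentials is hypothetical and is not part of dipper; dipper's own loading logic lives in dipper/config.py and may differ.

```python
import json

# Example config text in the format described above.
# "foo" and "bar" are the placeholder values from the documentation.
CONFIG_TEXT = """
{
    "dbauth": {
        "udp": {
            "user": "foo",
            "password": "bar"
        }
    }
}
"""


def read_udp_credentials(config_text):
    """Hypothetical helper: return (user, password) from config.json text."""
    config = json.loads(config_text)
    try:
        udp_auth = config["dbauth"]["udp"]
        return udp_auth["user"], udp_auth["password"]
    except KeyError as missing:
        # Report which key of the expected structure is absent.
        raise ValueError(f"config is missing expected key: {missing}") from missing


user, password = read_udp_credentials(CONFIG_TEXT)
print(user)
```

Raising a ValueError that names the missing key makes an incomplete config file easier to diagnose than a bare KeyError.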

Output of fetcher:

udp_variants.tsv
    'Patient', 'Family', 'Chr', 'Build', 'Chromosome Position', 'Reference Allele', 'Variant Allele', 'Parent of origin', 'Allele Type', 'Mutation Type', 'Gene', 'Transcript', 'Original Amino Acid', 'Variant Amino Acid', 'Amino Acid Change', 'Segregates with', 'Position', 'Exon', 'Inheritance model', 'Zygosity', 'dbSNP ID', '1K Frequency', 'Number of Alleles'

udp_phenotypes.tsv
    'Patient', 'HPID', 'Present'
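A minimal sketch of consuming the phenotype output follows. It assumes that udp_phenotypes.tsv is tab-delimited and begins with a header row using the column names listed above; neither assumption is confirmed by this documentation. The sample rows, the patient identifiers, and the 'Present' values are invented for illustration, and group_phenotypes_by_patient is a hypothetical helper, not a dipper function.

```python
import csv
import io
from collections import defaultdict

# Invented sample data; real files are produced by the fetcher.
SAMPLE_TSV = (
    "Patient\tHPID\tPresent\n"
    "patient_1\tHP:0000001\tyes\n"
    "patient_1\tHP:0000002\tno\n"
    "patient_2\tHP:0000001\tyes\n"
)


def group_phenotypes_by_patient(handle):
    """Hypothetical helper: map each patient to a list of (HPID, Present) pairs."""
    reader = csv.DictReader(handle, delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        # 'Present' is kept as the raw string; its encoding is not documented here.
        grouped[row["Patient"]].append((row["HPID"], row["Present"]))
    return dict(grouped)


phenotypes = group_phenotypes_by_patient(io.StringIO(SAMPLE_TSV))
print(sorted(phenotypes))
```

For a file on disk, the same function can be given an open file handle, for example one opened with newline="" as the csv module recommends.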

The script also uses two mapping files:

udp_gene_map.tsv - generated from scripts/fetch-gene-ids.py, gene symbols from udp_variants
udp_chr_rs.tsv - rsid(s) per coordinate grepped from hg19 dbsnp file, then disambiguated with eutils, see scripts/dbsnp/dbsnp.py

UDP_SERVER = 'https://udplims-collab.nhgri.nih.gov/api'
fetch(is_dl_forced=True)

Fetches data from the UDP collaboration server; see the class-level documentation above for more information.

files = {'patient_phenotypes': {'file': 'udp_phenotypes.tsv'}, 'patient_variants': {'file': 'udp_variants.tsv'}}
map_files = {'dbsnp_map': '../../resources/udp/udp_chr_rs.tsv', 'gene_coord_map': '../../resources/udp/gene_coordinates.tsv', 'patient_ids': '../../resources/udp/patient_ids.yaml'}
parse(limit=None)

Override Source.parse()

Args:
    limit (int, optional): limit the number of rows processed

Returns:
    None