dipper.sources.UDP module
-
class dipper.sources.UDP.UDP(graph_type, are_bnodes_skolemized, data_release_version=None)
Bases: dipper.sources.Source.Source
The National Institutes of Health (NIH) Undiagnosed Diseases Program (UDP) is part of the Undiagnosed Diseases Network (UDN), an NIH Common Fund initiative that focuses on the most puzzling medical cases referred to the NIH Clinical Center in Bethesda, Maryland. Source: https://www.genome.gov/27544402/the-undiagnosed-diseases-program/
Data is available on request through the NHGRI collaboration server: https://udplims-collab.nhgri.nih.gov/api
Note: the fetcher requires credentials for the UDP collaboration server. Credentials are added via a config file, config.json, in the following format:

    {
        "dbauth": {
            "udp": {
                "user": "foo",
                "password": "bar"
            }
        }
    }

See dipper/config.py for more information.
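As a hedged illustration of the format above, the sketch below parses a config.json-style document and extracts the UDP credentials. The helper name `read_udp_credentials` is hypothetical and is not part of dipper; dipper's own loading logic lives in dipper/config.py and may differ.

```python
import json


def read_udp_credentials(config_text):
    """Return (user, password) from a config.json-style string.

    Hypothetical helper for illustration; not part of dipper.
    """
    config = json.loads(config_text)
    udp_auth = config["dbauth"]["udp"]
    return udp_auth["user"], udp_auth["password"]


# Placeholder values copied from the documented format.
example_config = """
{
    "dbauth": {
        "udp": {
            "user": "foo",
            "password": "bar"
        }
    }
}
"""

user, password = read_udp_credentials(example_config)
```

A missing "dbauth" or "udp" key raises KeyError here, which makes a misconfigured file fail early rather than at request time.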
Output of fetcher:
- udp_variants.tsv, with columns: 'Patient', 'Family', 'Chr', 'Build', 'Chromosome Position', 'Reference Allele', 'Variant Allele', 'Parent of origin', 'Allele Type', 'Mutation Type', 'Gene', 'Transcript', 'Original Amino Acid', 'Variant Amino Acid', 'Amino Acid Change', 'Segregates with', 'Position', 'Exon', 'Inheritance model', 'Zygosity', 'dbSNP ID', '1K Frequency', 'Number of Alleles'
- udp_phenotypes.tsv, with columns: 'Patient', 'HPID', 'Present'
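The phenotype file can be read with Python's standard csv module. This is a minimal sketch that assumes the file is tab-delimited with a header row matching the 'Patient', 'HPID', 'Present' columns; the sample rows, the 'yes'/'no' coding of the Present column, and the function name are invented for illustration and are not taken from dipper.

```python
import csv
import io
from collections import defaultdict


def phenotypes_by_patient(handle):
    """Group HPO IDs by patient, keeping only rows flagged as present.

    Assumes a tab-delimited file with a 'Patient', 'HPID', 'Present' header.
    The 'yes' value for 'Present' is an assumption, not taken from dipper.
    """
    grouped = defaultdict(list)
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if row["Present"] == "yes":
            grouped[row["Patient"]].append(row["HPID"])
    return dict(grouped)


# Invented sample data standing in for udp_phenotypes.tsv.
sample = io.StringIO(
    "Patient\tHPID\tPresent\n"
    "patient_1\tHP:0001250\tyes\n"
    "patient_1\tHP:0001263\tno\n"
    "patient_2\tHP:0001263\tyes\n"
)
result = phenotypes_by_patient(sample)
```

To read a real file, pass an open file handle (opened with newline="") in place of the StringIO object.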
The script also uses two mapping files:
- udp_gene_map.tsv: generated by scripts/fetch-gene-ids.py from the gene symbols in udp_variants
- udp_chr_rs.tsv: rsID(s) per coordinate, grepped from the hg19 dbSNP file and then disambiguated with eutils; see scripts/dbsnp/dbsnp.py
-
UDP_SERVER = 'https://udplims-collab.nhgri.nih.gov/api'
-
fetch(is_dl_forced=True)
Fetches data from the UDP collaboration server; see the class description above for more information.
-
files = {'patient_phenotypes': {'file': 'udp_phenotypes.tsv'}, 'patient_variants': {'file': 'udp_variants.tsv'}}
-
map_files = {'dbsnp_map': '../../resources/udp/udp_chr_rs.tsv', 'gene_coord_map': '../../resources/udp/gene_coordinates.tsv', 'patient_ids': '../../resources/udp/patient_ids.yaml'}
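The relative paths in map_files have to be resolved against a base directory before the files can be opened. The sketch below assumes they are relative to the directory that contains the module; that assumption, the base directory used in the example, and the helper name `resolve_map_files` are illustrative and not confirmed by this page.

```python
import posixpath

MAP_FILES = {
    "dbsnp_map": "../../resources/udp/udp_chr_rs.tsv",
    "gene_coord_map": "../../resources/udp/gene_coordinates.tsv",
    "patient_ids": "../../resources/udp/patient_ids.yaml",
}


def resolve_map_files(base_dir, map_files):
    """Join each relative path onto base_dir and collapse '..' segments."""
    return {
        key: posixpath.normpath(posixpath.join(base_dir, rel_path))
        for key, rel_path in map_files.items()
    }


# '/opt/dipper/dipper/sources' is a made-up location for the module directory.
resolved = resolve_map_files("/opt/dipper/dipper/sources", MAP_FILES)
```

posixpath is used so the result is the same on every platform; os.path would be the natural choice when the paths are actually opened on the local file system.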
-
parse(limit=None)
Override Source.parse().
Args:
- limit (int, optional): limit the number of rows processed
Returns:
- None