dipper.sources.GeneReviews module

class dipper.sources.GeneReviews.GeneReviews(graph_type, are_bnodes_skolemized)

Bases: dipper.sources.Source.Source

This source processes the GeneReviews mappings to OMIM and inspects the GeneReviews (HTML) books to pull the clinical descriptions, which populate the definitions of the terms in the ontology. GeneReviews items are defined as classes that are either grouping classes over OMIM disease ids (gene ids are filtered out) or subclasses of DOID:4 (generic disease).
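The class-definition rule above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the function name `plan_class`, the returned dictionary shape, and the reading that an item falls back to DOID:4 only when no OMIM disease ids remain after filtering are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set, Union

GENERIC_DISEASE = "DOID:4"  # generic disease, as named in the description above


def plan_class(
    nbk_id: str, omim_ids: Iterable[str], gene_omim_ids: Set[str]
) -> Dict[str, Union[str, List[str]]]:
    """Decide how a GeneReviews item would be modeled (hypothetical helper).

    Gene ids are filtered out first. If any OMIM disease ids remain, the item
    is treated as a grouping class over them; otherwise (assumed fallback) it
    is made a subclass of the generic disease class.
    """
    disease_ids = [omim for omim in omim_ids if omim not in gene_omim_ids]
    if disease_ids:
        return {"class": nbk_id, "groups": disease_ids}
    return {"class": nbk_id, "subclass_of": GENERIC_DISEASE}
```

The identifiers passed in are plain strings here; how the real source distinguishes gene ids from disease ids is not shown in this documentation.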

Note that the GeneReviews [copyright policy](http://www.ncbi.nlm.nih.gov/books/NBK138602/) (as of 2015.11.20) says:

GeneReviews® chapters are owned by the University of Washington, Seattle, © 1993-2015. Permission is hereby granted to reproduce, distribute, and translate copies of content materials provided that (i) credit for source (www.ncbi.nlm.nih.gov/books/NBK1116/) and copyright (University of Washington, Seattle) are included with each copy; (ii) a link to the original material is provided whenever the material is published elsewhere on the Web; and (iii) reproducers, distributors, and/or translators comply with this copyright notice and the GeneReviews Usage Disclaimer.

This script doesn’t pull the GeneReviews books from the NCBI Bookshelf directly; scripting this task is expressly prohibited by [NCBI Bookshelf policy](http://www.ncbi.nlm.nih.gov/books/NBK45311/). However, assuming you have acquired the books (in HTML format) via permissible means, a parser for those books is provided here to extract the clinical descriptions that define the NBK-identified classes.

create_books()
fetch(is_dl_forced=False)

We fetch the GeneReviews id-label map and id-OMIM mapping files from NCBI.

Returns: None

files = {'idmap': {'url': 'http://ftp.ncbi.nih.gov/pub/GeneReviews/NBKid_shortname_OMIM.txt', 'file': 'NBKid_shortname_OMIM.txt'}, 'titles': {'url': 'http://ftp.ncbi.nih.gov/pub/GeneReviews/GRtitle_shortname_NBKid.txt', 'file': 'GRtitle_shortname_NBKid.txt'}}
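As a rough illustration of how a fetched id-OMIM mapping file might be read, the sketch below collects OMIM ids per NBK id. It assumes a tab-delimited layout with three columns (NBK id, short name, OMIM id) and a header line starting with `#`; that layout, the function name `read_idmap`, and the sample rows (invented placeholders, not real GeneReviews data) are assumptions for the example rather than a description of the actual file or of this module's parser.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, TextIO

# Invented placeholder rows in the assumed three-column, tab-delimited layout.
SAMPLE = (
    "#NBK_id\tGR_shortname\tOMIM\n"
    "NBK0001\texample-a\t100001\n"
    "NBK0001\texample-a\t100002\n"
    "NBK0002\texample-b\t100003\n"
)


def read_idmap(handle: TextIO) -> Dict[str, List[str]]:
    """Map each NBK id to the OMIM ids listed for it (hypothetical helper)."""
    mapping: Dict[str, List[str]] = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, the '#' header line, and short or malformed rows.
        if not row or row[0].startswith("#") or len(row) < 3:
            continue
        nbk_id, _shortname, omim_id = row[0], row[1], row[2]
        mapping[nbk_id].append(omim_id)
    return dict(mapping)


idmap = read_idmap(io.StringIO(SAMPLE))
```

Reading from a file handle rather than a path keeps the sketch independent of where the downloaded file is stored.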
getTestSuite()

An abstract method that should be overridden with tests appropriate for the specific source.

parse(limit=None)
Returns: None
process_nbk_html(limit)

Here we process the GeneReviews books to fetch the clinical descriptions to include in the ontology. We only use books that have been acquired manually, as NCBI Bookshelf does not permit automated downloads. This parser processes only the books found in the `raw/genereviews/books` directory, so it can run on a partial collection.

Parameters: limit
Returns:
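The "only process what is present locally" behavior described for `process_nbk_html` could look roughly like the sketch below. It is not the method's implementation: the helper name `iter_local_books`, the `*.html` filename pattern, and the interpretation of `limit` as a cap on the number of books are assumptions for illustration. Nothing is downloaded; only files already on disk are visited.

```python
from pathlib import Path
from typing import Iterator, Optional


def iter_local_books(books_dir: Path, limit: Optional[int] = None) -> Iterator[Path]:
    """Yield manually acquired book files in name order (hypothetical helper).

    A missing directory yields nothing, so a partial or empty collection
    is handled without error.
    """
    if not books_dir.is_dir():
        return
    for count, path in enumerate(sorted(books_dir.glob("*.html"))):
        # Assumed meaning of `limit`: stop after this many books.
        if limit is not None and count >= limit:
            break
        yield path


# Directory name taken from the description above; it may not exist locally.
for book in iter_local_books(Path("raw/genereviews/books"), limit=5):
    print(book.name)
```

Because the helper is a generator over whatever files exist, adding more books to the directory later only requires rerunning the parser.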