dipper.models.GenomicFeature module

class dipper.models.GenomicFeature.Feature(graph, feature_id=None, label=None, feature_type=None, description=None, feature_category=None)

Bases: object

Models genomic features. By default, every feature is a faldo:Region, and features are typed with SO (Sequence Ontology) terms. RO:has_subsequence is currently the default relationship between regions, but this should be tested/verified.

TODO: the graph additions are in the addXToFeature functions, but should be separated.

TODO: this will need to be extended to properly deal with fuzzy positions in faldo.

addFeatureEndLocation(coordinate, reference_id, strand=None, position_types=None)

Adds the coordinate details for the end of this feature.

Parameters:
  • coordinate
  • reference_id
  • strand

addFeatureProperty(property_type, feature_property)
addFeatureStartLocation(coordinate, reference_id, strand=None, position_types=None)

Adds the coordinate details for the start of this feature.

Parameters:
  • coordinate
  • reference_id
  • strand
  • position_types

addFeatureToGraph(add_region=True, region_id=None, feature_as_class=False, feature_category=None)

We assume here that all features are instances. Each feature is located on a region, which begins and ends with a faldo:Position. The feature locations use the FALDO model, which has this general structure:

Triples:

    feature_id a feature_type (individual)
        faldo:location region_id

    region_id a faldo:Region
        faldo:begin start_position
        faldo:end end_position

    start_position a (any of: faldo:(((Both|Plus|Minus)Strand)|Exact)Position)
        faldo:position Integer(numeric position)
        faldo:reference reference_id

    end_position a (any of: faldo:(((Both|Plus|Minus)Strand)|Exact)Position)
        faldo:position Integer(numeric position)
        faldo:reference reference_id

Parameters:
  • add_region – defaults to True
  • region_id – defaults to None
  • feature_as_class – defaults to False
  • feature_category – a biolink category CURIE for the feature
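The feature-to-region part of the triple layout described for addFeatureToGraph can be sketched with plain tuples. This is only an illustration of the shape of the data, not dipper's implementation: the function name `feature_location_triples` and the use of string CURIEs in place of real graph nodes are assumptions made for the example.

```python
def feature_location_triples(feature_id, feature_type, region_id,
                             start_id, end_id):
    """Return (subject, predicate, object) tuples for one located feature.

    Hypothetical helper: it mirrors the documented layout only and does
    not touch an RDF graph.
    """
    return [
        # The feature is an individual of its (SO) type ...
        (feature_id, "rdf:type", feature_type),
        # ... and is located on a region.
        (feature_id, "faldo:location", region_id),
        # The region is bounded by a begin and an end position.
        (region_id, "rdf:type", "faldo:Region"),
        (region_id, "faldo:begin", start_id),
        (region_id, "faldo:end", end_id),
    ]


if __name__ == "__main__":
    for triple in feature_location_triples(
            "ex:feature1", "SO:0000704", "ex:region1",
            "ex:start1", "ex:end1"):
        print(triple)
```

The start and end positions carry their own triples (type, numeric position, reference), which are left out here to keep the sketch short.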

addPositionToGraph(reference_id, position, position_types=None, strand=None)

Add the positional information to the graph, following the faldo model. We assume that if the strand is None, we give it a generic “Position” only.

Triples:

    my_position a (any of: faldo:(((Both|Plus|Minus)Strand)|Exact)Position)
        faldo:position Integer(numeric position)
        faldo:reference reference_id

Parameters:
  • graph
  • reference_id
  • position
  • position_types
  • strand
Returns:

Identifier of the position created
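As a rough sketch of the position triples described for addPositionToGraph, the example below builds them as plain tuples. The strand symbols and the `STRAND_TYPES` mapping are hypothetical; the type names simply follow the pattern given in the text and may not match the exact FALDO term names or the lookup dipper uses internally.

```python
# Hypothetical mapping from a strand symbol to a position type.
# Names follow the pattern in the text, not a verified FALDO term list.
STRAND_TYPES = {
    "+": "faldo:PlusStrandPosition",
    "-": "faldo:MinusStrandPosition",
    "both": "faldo:BothStrandPosition",
}


def position_triples(position_id, reference_id, position, strand=None):
    """Return (subject, predicate, object) tuples for one position.

    With no strand, only the generic faldo:Position type is used,
    as the description above states.
    """
    if strand is None:
        position_type = "faldo:Position"
    else:
        position_type = STRAND_TYPES[strand]
    return [
        (position_id, "rdf:type", position_type),
        (position_id, "faldo:position", int(position)),
        (position_id, "faldo:reference", reference_id),
    ]


if __name__ == "__main__":
    for triple in position_triples("ex:pos1", "ex:chr1", 12345, "+"):
        print(triple)
```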

addRegionPositionToGraph(region_id, begin_position_id, end_position_id)
addSubsequenceOfFeature(parentid, subject_category=None, object_category=None)

This will add reciprocal triples like:

    feature <is subsequence of> parent
    parent has_subsequence feature

Parameters:
  • graph
  • parentid

Returns:
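The reciprocal pair that addSubsequenceOfFeature describes can be illustrated as follows. The helper name and the predicate strings are placeholders chosen for the example; the identifiers dipper actually emits for these relations are not stated in this section.

```python
def subsequence_triples(feature_id, parent_id):
    """Return the two reciprocal triples linking a feature to its parent.

    Hypothetical helper; predicate labels are placeholders.
    """
    return [
        (feature_id, "is_subsequence_of", parent_id),
        (parent_id, "has_subsequence", feature_id),
    ]


if __name__ == "__main__":
    for triple in subsequence_triples("ex:exon1", "ex:gene1"):
        print(triple)
```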
addTaxonToFeature(taxonid)

Given the taxon id, this will add the following triple:

    feature in_taxon taxonid

Parameters:
  • graph
  • taxonid
Returns:

dipper.models.GenomicFeature.makeChromID(chrom, reference=None, prefix=None)

Takes a chromosome number and an NCBI taxon number and creates a unique identifier for the chromosome. These identifiers are made in the @base space, like:

    Homo sapiens (9606) chr1 ==> :9606chr1
    Mus musculus (10090) chrX ==> :10090chrX

Parameters:
  • chrom – the chromosome (preferably without any chr prefix)
  • reference – the numeric portion of the taxon id
Returns:
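The identifier pattern shown for makeChromID can be reproduced with a small stand-alone function. This is a sketch of the documented examples only: `make_chrom_id_sketch` is a hypothetical name, and how the real function treats `reference=None` or the `prefix` argument is not covered here.

```python
def make_chrom_id_sketch(chrom, taxon_num):
    """Build an id such as ':9606chr1' from a chromosome and a taxon number.

    Hypothetical re-creation of the documented pattern, not dipper's code.
    """
    chrom = str(chrom)
    # The docs prefer the chromosome without a 'chr' prefix, so strip one
    # if the caller passed it anyway.
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return ":{}chr{}".format(taxon_num, chrom)


if __name__ == "__main__":
    print(make_chrom_id_sketch("1", 9606))      # :9606chr1
    print(make_chrom_id_sketch("chrX", 10090))  # :10090chrX
```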

dipper.models.GenomicFeature.makeChromLabel(chrom, reference=None)