From ZENBU documentation wiki

The OverlapAnnotate processing module takes a side-stream of features and compares them for overlap against features on the primary data stream. When an overlap occurs, the metadata of the side-stream feature is copied onto the primary-stream feature to annotate it. Parameters of the module control the overlap behavior and which metadata is transferred.


  • <side_stream> : data source definition for the features used as templates for overlap comparison. This is most often a Proxy for another set of defined annotations, but it can also be a processed stream of features. A Proxy datastream should be configured with an output of full_feature to ensure that metadata is present on the features.
  • <overlap_mode> : defines how overlap is calculated between features on the primary stream and features on the side stream. Possible values are:
    • area : the full length of both features is used in the comparison.
    • 5end : the primary stream feature is compressed to the 5' end and overlap is compared against that single base location.
    • 3end : the primary stream feature is compressed to the 3' end and overlap is compared against that single base location.
  • <ignore_strand> : ignore strand specificity when comparing features between the primary stream and the side (template) stream. Enable by setting to true.
  • <distance> : features may be up to distance basepairs apart and still be considered to overlap.
  • <overlap_subfeatures> : if features contain subfeatures (e.g. transcript gene models), setting this option to true requires that the subfeatures overlap each other. If one of the features does not have subfeatures, then the genomic bounds of that feature are used in the overlap calculation. If both features have subfeatures, then the overlap must be subfeature to subfeature.
  • <mdata_mode> : defines which set of metadata to transfer. Possible values are:
    • feature : only transfers metadata directly attached to the side-stream feature.
    • featuresource : only transfers metadata from the FeatureSource of overlapping side-stream features.
    • experiment : only transfers metadata from the Experiments expressed by overlapping side-stream features. All Experiments are processed.
    • all : transfers metadata from all of the above sources.
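
As a minimal sketch of how the parameters above fit together, the fragment below annotates primary-stream features by their 5' end. The parameter element names are taken from the list above. The surrounding spstream wrapper, the form of the Proxy reference, and the datastream name "gene_annotations" are assumptions based on the general layout of ZENBU scripts and are not confirmed on this page, so check them against the scripting documentation before use.

```xml
<!-- Sketch only: the spstream wrapper and the "gene_annotations" name are assumed -->
<spstream module="OverlapAnnotate">
  <!-- compare the 5' end of each primary-stream feature against the side stream -->
  <overlap_mode>5end</overlap_mode>
  <!-- match features regardless of strand -->
  <ignore_strand>true</ignore_strand>
  <!-- features up to 100 basepairs apart still count as overlapping -->
  <distance>100</distance>
  <!-- copy only metadata attached directly to the side-stream feature -->
  <mdata_mode>feature</mdata_mode>
  <side_stream>
    <!-- assumed: Proxy to a datastream defined elsewhere with an output of full_feature -->
    <spstream module="Proxy" name="gene_annotations"/>
  </side_stream>
</spstream>
```

With overlap_mode set to 5end, a long primary-stream feature is annotated only by side-stream features near its 5' end, which may suit start-site style data better than area.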

To restrict the transfer to certain types of metadata, one or more of the following can be added to the module's XML specification.

  • <mdata type="some-type">some-value</mdata> : specifies that metadata matching both the type and the value will be transferred if present on the side-stream feature.
  • <mdata type="some-type"></mdata> : specifies that any metadata matching the type will be transferred.
  • <mdata>some-value</mdata> : specifies that any metadata matching the value will be transferred.

If no <mdata> tags are specified, then all metadata will be transferred.
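
The three filter forms can be illustrated with a short sketch. The metadata types and values used here ("gene_name", "gene_category", "coding", "HOXA1") are hypothetical placeholders, and the spstream wrapper and Proxy reference are assumed rather than taken from this page. The page also does not state how multiple filters combine; the sketch assumes that metadata matching any one of them is transferred.

```xml
<!-- Sketch only: wrapper, Proxy name, and all metadata types/values are hypothetical -->
<spstream module="OverlapAnnotate">
  <overlap_mode>area</overlap_mode>
  <mdata_mode>feature</mdata_mode>
  <!-- type and value: transfer only gene_category metadata whose value is "coding" -->
  <mdata type="gene_category">coding</mdata>
  <!-- type only: transfer every gene_name metadata entry, whatever its value -->
  <mdata type="gene_name"></mdata>
  <!-- value only: transfer any metadata whose value is "HOXA1", whatever its type -->
  <mdata>HOXA1</mdata>
  <side_stream>
    <spstream module="Proxy" name="gene_annotations"/>
  </side_stream>
</spstream>
```

Removing all three mdata elements from this sketch would fall back to the default of transferring all metadata selected by mdata_mode.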


TODO ...