Configuring Tracks

From ZENBU documentation wiki

New tracks can be created in the ZENBU system either through the genome browser's create-new-track control (Create-new-track.jpg) or through the Data Explorer interfaces.

In all cases the interfaces are similar and are built around the same three steps:

  1. choosing a collection of one or more primary data sources to form the dynamically pooled virtual DataSource of the track
  2. configuring the optional data processing for the track
  3. configuring the visualization options

Here are the new-track configuration panel from the genome browser and the reconfigure-track panel. The main difference is that the new-track configuration panel also allows the track data sources to be configured.

Create new track panel.jpg


In the Data Explorer interface, data sources are selected before launching the Configure Track interface panel.

Selecting track data sources

When configuring a track, the first step is to select the data you wish to work with. This can be one or more data sources, which together form the track's virtual pooled Data Source. Since ZENBU is a dynamic system with many users uploading and sharing data, data sources are found through a Google-style interface that searches the metadata of the Data Sources in the system.

New track data search.jpg
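As a rough illustration of what "pooling" several data sources into one virtual stream can mean, here is a minimal Python sketch. It is not ZENBU code: the function name `pool_sources`, the tuple layout `(chrom, start, name)`, and the assumption that each source is already sorted by coordinate are all hypothetical choices made for the example.

```python
import heapq

def pool_sources(*sources):
    """Merge several coordinate-sorted feature lists into one sorted stream.

    Each feature is a hypothetical (chrom, start, name) tuple; every input
    list is assumed to be sorted by (chrom, start) already.
    """
    return heapq.merge(*sources, key=lambda f: (f[0], f[1]))

# Two made-up sources standing in for the Data Sources checked in the panel.
source_a = [("chr1", 100, "a1"), ("chr1", 500, "a2")]
source_b = [("chr1", 250, "b1"), ("chr2", 50, "b2")]

pooled = list(pool_sources(source_a, source_b))
for feature in pooled:
    print(feature)
```

The point of the sketch is only that downstream processing and visualization see a single ordered stream, regardless of how many sources were selected.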

To select Data Sources for a new track, enter searches and check the Data Sources you wish to use. Several options help with searching:

  • restrict search to current genome/assembly : unless you are planning an advanced scripted data processing operation, only data mapped to the current genome is generally useful.
  • data source type : by default the search covers both annotation FeatureSources and expression Experiments. This control allows the user to restrict the search to only one of the two types of Data Sources.
  • collaborative project : selects collaboration-based filtering to help narrow the search. Shows only data which has been shared with the selected collaboration.
  • refresh : clears from the search panel all sources which have not been checked
  • clear : clears the search and unselects all sources so that the user can start over
  • search : performs the keyword logic search

After selecting the Data Sources which will be pooled into the virtual merged Data Source for the track, the details of the data stream need to be configured:

  • feature mode: this defines the level of data extracted from the ZENBU database systems. The more data extracted, the larger the features and the slower the database extraction process. Please select an appropriate level for performance tuning. If uncertain, leave the selection at full_feature, since this ensures that all data of the Features are available for data processing and visualization. Options include:
    • full_feature : all data associated with the Feature : genomic coordinates, subfeatures, expression and metadata
    • simple_feature : only the primary genomic coordinates of the Feature are extracted
    • subfeature : primary genomic coordinates and subfeatures. no expression or metadata are extracted for use on the data stream
    • expression : primary genomic coordinates and expression. no subfeatures or metadata are extracted for use on the data stream
    • skip_metadata : primary genomic coordinates, subfeatures, and expression. no metadata is extracted for use on the data stream
    • skip_expression : primary genomic coordinates, subfeatures, and metadata. no expression is extracted for use on the data stream
  • data source type: all signal-based data is tagged with a datatype and within an Experiment there can be multiple different datatypes. This pulldown will display the available datatypes within the pool of selected DataSources. Please select the appropriate datatype for your track.
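The feature modes listed above can be summarized as a lookup table. The following Python sketch is illustrative only: the mode names come from the list above, but the dictionary `FEATURE_MODE_COMPONENTS`, the component labels, and the helper `lightest_mode` are hypothetical names, not part of ZENBU.

```python
# Components each feature mode extracts, transcribed from the list above.
# The names of this table and of the helper below are illustrative only.
FEATURE_MODE_COMPONENTS = {
    "full_feature": {"coordinates", "subfeatures", "expression", "metadata"},
    "simple_feature": {"coordinates"},
    "subfeature": {"coordinates", "subfeatures"},
    "expression": {"coordinates", "expression"},
    "skip_metadata": {"coordinates", "subfeatures", "expression"},
    "skip_expression": {"coordinates", "subfeatures", "metadata"},
}

def lightest_mode(required):
    """Return the mode that extracts the fewest components while still
    covering everything in `required` (coordinates are always included)."""
    required = set(required) | {"coordinates"}
    candidates = [
        (len(components), name)
        for name, components in FEATURE_MODE_COMPONENTS.items()
        if required <= components
    ]
    return min(candidates)[1]

print(lightest_mode({"expression"}))
print(lightest_mode({"subfeatures", "metadata"}))
```

For example, a track that only needs expression values could use the expression mode, while a track whose processing needs both subfeatures and metadata could use skip_expression. When in doubt, full_feature remains the safe choice.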

Configure data processing

The configuration of track data processing falls into four categories:

  • none : no additional data processing is needed; the virtual pooled data stream is left unchanged and simply streamed into the visualization.
  • predefined script: allows the user to select a previously saved and shared data processing script made by another user. This option enables novice users to perform complex data processing with simple push button ease.
  • expression binning gui: this is a simplified user interface which allows users to convert their expression data into expression signal data. It performs genomic segmentation binning and collates expression into the genomic bins to give an expression signal at the genome level.
  • custom XML scripting: the most advanced option where users write their own data processing scripts using the ZENBU data processing XML language by chaining together modules and datastreams.

Selecting a predefined script

Search the existing predefined scripts which have been saved and shared by other users and choose one by clicking on its script name. Searching uses the metadata search system to help narrow down the options. The user also has the option to show all and scroll through the list of predefined scripts currently in the system.

Configure predefined script.png

After a script has been selected and loaded into the track, the name and description are displayed along with the unique UUID of the saved script configuration.

Configure predefined script selected.png

If the script includes track_defaults, loading the script also sets those options on the panel to their new default state. To change the predefined script, push the replace button, which brings back the search interface. To see the details of the script XML or modify it, push the edit script button, which copies the contents of the predefined script into the custom XML scripting editing interface.

Expression binning GUI

In order to use the visualization styles express, spectrum, or experiment-heatmap, expression data needs to be binned and collated into a genomic segmentation grid. This data processing interface provides a simplified user interface to perform this expression binning.

Configure expression binning gui.png

For details on the options in this panel and how they affect expression visualization, please refer to the Expression Tracks section of the Track Visualization section of the documentation.
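To make the idea of binning and collating expression into a genomic segmentation grid more concrete, here is a minimal Python sketch. It is not the ZENBU implementation: the function name `bin_expression`, the `(start, strand, value)` tuple layout, assigning each feature to a bin by its start position, keeping strands separate, and collating by sum are all assumptions made for the example. The options actually offered by the panel are described in the Expression Tracks documentation.

```python
from collections import defaultdict

def bin_expression(features, bin_size):
    """Collate expression values into fixed-width genomic bins.

    `features` is an iterable of hypothetical (start, strand, value) tuples.
    Each feature is assigned to the bin containing its start position, and
    values falling in the same (strand, bin) are summed.
    """
    bins = defaultdict(float)
    for start, strand, value in features:
        bin_start = (start // bin_size) * bin_size
        bins[(strand, bin_start)] += value
    return dict(sorted(bins.items()))

# Made-up features on one chromosome.
features = [(105, "+", 2.0), (130, "+", 1.0), (210, "+", 4.0), (120, "-", 3.0)]

binned = bin_expression(features, 100)
for (strand, bin_start), total in binned.items():
    print(strand, bin_start, bin_start + 99, total)
```

In this toy input, the two plus-strand features starting at 105 and 130 land in the same 100 bp bin and their values are added together, which is the kind of per-bin signal that the expression visualization styles draw.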

Custom XML scripting

The most advanced option for configuring data processing in a track is to write directly in the ZENBU script XML language. This example shows the script XML from the "GencodeV10 expression - stranded protocols, RPKM normalized" predefined script.

Configure custom XML script.png

For details on how to write ZENBU script XML data processing and all the available processing modules, please refer to the Data processing user guide section of the documentation.

Configure track visualization

The tracks in the ZENBU genome browser fall into three main categories of visualization style:

  • Annotation tracks: where the data sources only contain genomic information and no expression
  • Expression tracks: where the expression level is displayed without feature boundaries, in a style similar to the UCSC genome browser 'wiggle' but in a user-interactive tool.
    LongRNASeq CSH exonic expression.png
  • Hybrid tracks: a ZENBU-enhanced visualization which allows processed data to contain both genomic features and multi-experiment expression data
    Hybrid tracks longRNAseq.png

For details of the different visualization styles and configuration options, please refer to the Track visualization styles section of the documentation.

Track Set mode

Track Set mode - an easy interface to configure multiple tracks from same data

Since ZENBU has the ability to apply different processing to the same source data and generate different visualization tracks, we have created an "easy" interface for new users to quickly get started with ZENBU. This feature is called Track-Set mode.

To switch into Track-Set mode, start to configure a new track as above, but click the track-set mode submenu. You will be presented with an alternate, simplified interface.

Trackset config panel.png

In this interface only two simple steps are needed:

  1. choose a collection of one or more primary data sources to form the dynamically pooled virtual DataSource of the track
  2. choose the track-set type

We currently have predefined track-sets for: RNAseq, CAGE, and shortRNA, but will add more in the future.

Accepting the configuration will generate several tracks using the same source data. In the example above with 63 different Encode shortRNA experiments pooled together and the ShortRNA track-set mode selected, the tool will generate these 8 different tracks.

Trackset config encode shortRNA.png