ZENBU genome browser

The ZENBU genome browser is designed not only for visualizing genomics data but also for processing it interactively. While most genome browsers provide static visualization images, data in the ZENBU system can be manipulated dynamically by its users.

ZENBU genome browser CSHL RNAseq fig1.png

Views are collections of tracks

The ZENBU genome browser is launched with a previously saved view configuration. A view is simply a collection of tracks with a title and description, and can be thought of as an interactive figure in a paper. Any user who is logged into the system can save and share views with other users. This is done with the "save view" button (Save view button.jpg) at the top of the genome browser.

Users can manipulate the order of tracks in their views through "click and drag" of the side-bar or title-bar.

Each view receives a unique URL for sharing with your collaborators via email or for bookmarking. For example, this URL links to the ZENBU genome browser view shown above.

The system will also automatically save temporary views as the user makes changes. This autosave enables the use of the web browser "back arrow" to undo changes. Although these temporary views are stored in the system with unique URLs, they are not permanent and will be deleted after a period of time. To ensure a view is permanent and can be accessed again in the future, users should log in and explicitly save the view.

Multiple views on the same data

Unlike other genome browsers, ZENBU does not need to link the track visualization to specific upload file formats. This means with ZENBU, the user can upload their data once and then manipulate it in ZENBU to create many views with the same data. Where other genome browsers like UCSC or IGV require many different visualization file formats to support each different visualization track style, ZENBU can focus on a few common data interchange file formats (BAM, BED, GTF/GFF, OSCTable). This makes ZENBU more like a bioinformatics processing tool rather than just a visualization tool (genome browser).

Because the data is loaded once and visualization/processing is performed by ZENBU, the data content of all tracks is guaranteed to be synchronized, and one can see exactly how each track was made by looking at the ZENBU data processing script in each track. This makes the tracks of ZENBU "data transparent".

In this example, 47 RNAseq experiments were loaded once from 47 BAM files and then dynamically pooled and processed by ZENBU to create 7 different visualization tracks:

  1. RNAseq expression collated into Gencode genes
  2. RNAseq exonic-only expression signal strength
  3. alignments processed into "observed intron" evidence
  4. intron evidence processed into splice acceptor sites
  5. intron evidence processed into splice donor sites
  6. RNAseq exonic signal as a single spectrum
  7. RNAseq exonic signal as an experiment heatmap showing both genome position and experimental differential expression

Wold RNAseq many views same data.png

For many more examples of how ZENBU can process and visualize the same uploaded data into many different visualizations, please check out both the data processing sections and the experimental case studies sections of the documentation.

Tracks for data visualization and processing

The central concept of the ZENBU genome browser is the track. Tracks in ZENBU are completely user-definable and encapsulate the concepts of

  1. dynamic data source pooling to create a track virtual DataSource by choosing a collection (one or more) of primary data sources
  2. data processing
  3. data visualization

Data pooling

The ZENBU system allows for the dynamic creation of merged data sets referred to as data stream pools. This provides a great deal of flexibility when configuring tracks. It is no longer necessary to pre-merge your datasets prior to upload. Data can be loaded in an atomic, singular manner, and ZENBU can then mix and match data sources when users configure their tracks.

A Data Stream Pool can also be thought of as the virtual DataSource for a ZENBU track configuration.
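The pooling idea can be illustrated with a short Python sketch. This is not ZENBU code: the `pool_sources` function and the tuple layout of a feature are hypothetical, chosen only to show how separately loaded sources might be merged on demand into a single position-ordered stream.

```python
import heapq

# Hypothetical feature record: (chrom, start, end, experiment_name, expression)

def pool_sources(*sources):
    """Merge several position-sorted sources into one sorted stream.

    Each source is an iterable of feature tuples already sorted by
    (chrom, start). Nothing is pre-merged; the merge is performed
    lazily when the pool is iterated.
    """
    return heapq.merge(*sources, key=lambda f: (f[0], f[1]))

# Two independently "uploaded" sources (toy data).
exp_a = [("chr1", 100, 150, "expA", 3.0), ("chr1", 400, 450, "expA", 1.0)]
exp_b = [("chr1", 120, 170, "expB", 2.0), ("chr2", 10, 60, "expB", 5.0)]

# The pool acts as a single virtual data source for a track.
pooled = list(pool_sources(exp_a, exp_b))
```

Because the sources stay separate until a track asks for them, the same uploads can be combined differently in each track.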

Dynamic track based data processing

One of the unique features of the ZENBU system is the ability to apply data processing and analysis on demand at query time, as part of the visualization process. This means that raw or unprocessed data can be loaded into the ZENBU system, which translates it into the internal Data Model, and ZENBU can then perform many of the data manipulations and analyses that previously required bioinformatics experts with knowledge of the unix command line and a collection of bioinformatics tools.

The data processing system is applied at the track level at query time. This means that no intermediate result needs to be stored in a database or on disk, so the user can modify processing parameters and immediately see the effect of the change in the visualization. It also keeps the system fast, since data is processed in memory and there is no overhead of reading and writing intermediate results to slow disks.

Because data processing is applied on each track, and tracks are loaded independently, there is a level of parallelism inherent in the design of the system. The processed data result generated by ZENBU on-demand can also be downloaded into data files for further analysis by external systems like R, BioConductor, or BioPython.

Data processing is controlled through a Scripting system based on chaining Processing modules together in a manner similar to digital signal processing [1].
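As a rough analogy for module chaining, the following Python sketch composes small stream-processing functions so that the output of one feeds the next. The module names (`min_expression`, `scale`) and the `chain` helper are hypothetical and do not correspond to actual ZENBU Processing modules or to its scripting syntax.

```python
from functools import reduce

def min_expression(threshold):
    """Module: drop features whose expression is below the threshold."""
    def module(stream):
        return (f for f in stream if f["expression"] >= threshold)
    return module

def scale(factor):
    """Module: multiply every expression value by a constant factor."""
    def module(stream):
        return ({**f, "expression": f["expression"] * factor} for f in stream)
    return module

def chain(*modules):
    """Compose modules so the output stream of one feeds the next."""
    def pipeline(stream):
        return reduce(lambda s, m: m(s), modules, stream)
    return pipeline

# Toy input stream of features.
features = [
    {"start": 100, "end": 150, "expression": 0.5},
    {"start": 200, "end": 260, "expression": 4.0},
    {"start": 300, "end": 340, "expression": 2.0},
]

# Nothing is stored between stages; results are produced on demand.
pipeline = chain(min_expression(1.0), scale(10))
result = list(pipeline(features))
```

Changing a parameter such as the threshold only requires rebuilding the chain and re-running it on the same input, which mirrors the idea of adjusting a track's processing and seeing the result immediately.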


The tracks in the ZENBU genome browser fall into three main categories of visualization styles:

  • Annotation tracks: where the data sources only contain genomic information and no expression
  • Expression tracks: where expression level is displayed without feature boundaries in a style similar to the UCSC genome browser 'wiggle' but in a user-interactive tool.
    LongRNASeq CSH exonic expression.png
  • Hybrid tracks: ZENBU enhanced visualization which allows for processed data to contain both genomic features and multi-experiment expression data
    Hybrid tracks longRNAseq.png

For details on all the different visualization styles and configuration options, please refer to the Track Visualization Styles section of the documentation.

Downloading processed track data

Since tracks contain dynamic pooling of data and data processing, the data output of a ZENBU track may be useful to bioinformaticians as part of external data analysis. To enable this, each track can export its processed data into local files via the "download data" control (Track controls-download.jpg).

For details please refer to the Data Download section of the documentation.

Experiment expression data graph

The "Experiment expression data graph" is a track-linked display for ZENBU expression and hybrid tracks which shows the differential expression between experiments. It works in coordination with the "selected track"; clicking on the title bar or inside a track selects it. ZENBU track data can be thought of as three-dimensional, with the dimensions experiment, genomic coordinate, and expression value. The track compresses the experiment dimension, while the Experiment-expression graph compresses the spatial dimension.
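The two complementary projections can be sketched with a small toy matrix in Python. The data layout (`expression[experiment][genomic bin]`) and the use of summation for both projections are assumptions made for illustration only.

```python
# Hypothetical toy data: expression[experiment] is a list of values,
# one per genomic bin in the visible region.
expression = {
    "exp1": [0, 2, 5, 1],
    "exp2": [1, 0, 3, 0],
    "exp3": [0, 4, 0, 2],
}

# Track view: compress the experiment dimension, keep genomic position.
track_signal = [sum(column) for column in zip(*expression.values())]

# Experiment graph view: compress the spatial dimension, keep experiments.
experiment_graph = {name: sum(values) for name, values in expression.items()}
```

Both views are computed from the same underlying values, which is why selecting a track is enough to drive the linked graph.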

Visible region collation

By default, the experiment-graph uses the entire visible region when compressing spatial information to calculate the expression value for each experiment. The calculation across the spatial region is a summation by default, but can be min or max depending on how the track was configured.
Experiment-graph wholeregion.png

Selected region collation

It is also possible for the user to select specific regions within the track and only show the experiment expression under the selection. Selecting a region in a track is performed with simple click-and-drag like selecting text. This interaction with the view and data can be very useful for focusing on specific regions of interest. The experiment-graph updates as the user selects so there is instant feedback for the user.
Experiment-graph selection.png
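Visible-region and selected-region collation differ only in the coordinate range that is collapsed. A minimal Python sketch, assuming a hypothetical feature layout of `(start, end, {experiment: expression})` and a simple overlap test, might look like this; the function name `collate_region` is not a ZENBU API.

```python
# Collation modes named in the text: summation by default, or min/max.
COLLATORS = {"sum": sum, "min": min, "max": max}

def collate_region(features, region_start, region_end, mode="sum"):
    """Collapse features overlapping a region into one value per experiment.

    `features` is a list of (start, end, {experiment: expression}) tuples.
    """
    per_experiment = {}
    for start, end, values in features:
        # Keep only features that overlap the requested region.
        if start <= region_end and end >= region_start:
            for experiment, value in values.items():
                per_experiment.setdefault(experiment, []).append(value)
    collate = COLLATORS[mode]
    return {exp: collate(vals) for exp, vals in per_experiment.items()}

features = [
    (100, 150, {"exp1": 3.0, "exp2": 1.0}),
    (200, 260, {"exp1": 2.0, "exp2": 6.0}),
    (900, 950, {"exp1": 9.0, "exp2": 9.0}),
]

visible = collate_region(features, 0, 1000)            # whole visible region
selected = collate_region(features, 180, 300)          # user-selected region
selected_max = collate_region(features, 0, 300, "max") # max instead of sum
```

Re-running the collation each time the selection changes is what would give the instant feedback described above.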

Hybrid feature selection

For hybrid tracks with genomic features which have collated expression, selecting an individual feature shows the experiment-expression of only that feature.
Experiment-graph feature selection.png

Panel controls

The Experiment-expression graph can be moved to a different location on the screen by clicking and dragging its title-bar. To reset it back to its dock at the bottom of the view, click the "reattach view" widget (Experiment-graph widget-reattach.png).

The data currently displayed in the panel can be exported into a tab-delimited table with the "export data" widget (Experiment-graph widget-exportdata.png). This will pop up a panel with the data, which the user can copy and paste into Excel or another application.

The Experiment-expression graph panel has several configuration options which can be accessed by clicking the "configure" widget (Experiment-graph widget-config.png).

Experiment-graph configpanel.png

The panel allows for altering the sort-order of the experiments in the view.

  • name : sort based on experiment name. Can also be activated by clicking the title "experiment name" in the view.
  • expression + strand : sort based on +strand expression from most expressed to least expressed. Also activated by clicking the title "forward strand ->" in the panel.
  • expression - strand : sort based on -strand expression. Also activated by clicking the title "<- reverse strand" in the panel.
  • expression both strands : sort based on combined expression on both strands. Also activated by clicking the title "< >" in the panel.
  • series/time point : if the experiment is loaded with metadata of tag "eedb:series_name" and "eedb:series_point" this will sort first by time_point and then by series_name within that time point. This can be very useful for time-course datasets such as those collected in FANTOM4.
  • series set : if the experiment is loaded with metadata of tag "eedb:series_name" this will sort first by series_name grouping related series together. If "eedb:series_point" is present it will then do a secondary sort on time-point.
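The difference between the two series-based sort orders can be shown with compound sort keys in Python. The metadata tags `eedb:series_name` and `eedb:series_point` come from the text above; the experiment records, the numeric time points, and the helper names are hypothetical.

```python
# Toy experiments; time points are assumed to be numeric for simplicity.
experiments = [
    {"name": "B_4h", "eedb:series_name": "B", "eedb:series_point": 4},
    {"name": "A_4h", "eedb:series_name": "A", "eedb:series_point": 4},
    {"name": "B_0h", "eedb:series_name": "B", "eedb:series_point": 0},
    {"name": "A_0h", "eedb:series_name": "A", "eedb:series_point": 0},
]

def by_time_point(exp):
    """series/time point: time point first, then series name."""
    return (exp["eedb:series_point"], exp["eedb:series_name"])

def by_series_set(exp):
    """series set: series name first, then time point."""
    return (exp["eedb:series_name"], exp["eedb:series_point"])

time_point_order = [e["name"] for e in sorted(experiments, key=by_time_point)]
series_set_order = [e["name"] for e in sorted(experiments, key=by_series_set)]
```

The first order places the same time point of different series side by side, while the second keeps each series contiguous along its time course.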

This config panel also allows for activating/deactivating individual experiments. This causes the linked expression track to recalculate its experiment-summed expression value. For expression signal tracks, the shape of the signal will change. This can be considered "soft filtering" of the data-source pool of the track. It is accomplished by clicking the activate/deactivate widget (Experiment-graph activate.png) next to the experiment name. If the panel option "hide deactivated experiments" is selected, then the deactivated experiments are not displayed in the Experiment-graph panel.

Sometimes it is useful to create a deep data-pooled track and then interactively filter it for experiments based on the experiments' metadata after loading. This can be useful when doing exploratory work on new datasets. The Experiment filter search allows for searching experiments and deactivating any experiment which does not match the search.

  • search : perform the metadata search to preview which experiments will remain.
  • apply filter : applies the metadata search logic to select active experiments. All experiments not matching the search will be deactivated.
  • clear filter : removes the filter and activates all experiments.
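The three filter actions can be sketched in Python as operations on an `active` flag per experiment. The matching rule used here (a case-insensitive substring match over metadata values) is an assumption for illustration and may differ from ZENBU's actual metadata search logic; all function names are hypothetical.

```python
def matches(experiment, query):
    """Case-insensitive substring match against all metadata values."""
    query = query.lower()
    return any(query in str(v).lower() for v in experiment["metadata"].values())

def search(experiments, query):
    """search: preview the names of experiments that would remain active."""
    return [e["name"] for e in experiments if matches(e, query)]

def apply_filter(experiments, query):
    """apply filter: deactivate every experiment that does not match."""
    for e in experiments:
        e["active"] = matches(e, query)

def clear_filter(experiments):
    """clear filter: reactivate all experiments."""
    for e in experiments:
        e["active"] = True

def pooled_expression(experiments):
    """Summed expression over active experiments only (soft filtering)."""
    return sum(e["expression"] for e in experiments if e["active"])

experiments = [
    {"name": "liver_rep1", "metadata": {"tissue": "Liver"}, "expression": 5.0, "active": True},
    {"name": "brain_rep1", "metadata": {"tissue": "Brain"}, "expression": 2.0, "active": True},
    {"name": "liver_rep2", "metadata": {"tissue": "liver"}, "expression": 3.0, "active": True},
]

preview = search(experiments, "liver")
apply_filter(experiments, "liver")
filtered_total = pooled_expression(experiments)
clear_filter(experiments)
full_total = pooled_expression(experiments)
```

Deactivated experiments remain in the pool and are only excluded from the calculation, so clearing the filter restores the original result without reloading data.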

Selecting regions in tracks

All tracks in the genome browser allow for selecting regions. For expression and hybrid tracks this has immediate effects on the Experiment-graph.

In this example we have zoomed out to a 1 megabase region of the genome and found some highly expressed "introns", so we selected the region around them.
Track selection.png

This selection also enables several options via widgets at the top of the selection.

  • Track selection-widgets-magnify.png : magnify zooms the view into the selected region.
  • Track selection-widgets-sequence.png : genome sequence returns the genome sequence under the selection. This is only available for genomes which have had their sequence loaded into ZENBU. It pops up a panel with the sequence, which the user can select and copy-paste into another program.

Adding tracks to the view

Creating new tracks

New tracks can be added to views through the "configure new track" control (Create-new-track.jpg), which brings up the "configure new track" interface.

Create new track panel.jpg

For details on creating new tracks, please see the Configuring Tracks section of the documentation.

Adding Predefined Tracks Into the View

Previously saved and shared tracks can be added into a view of the genome browser through the "add predefined tracks" control (Add-shared-track.jpg). Tracks are one of the types of configurations which can be saved and shared among users. This interface allows these previously saved and shared tracks to be added into one's view. Clicking the "add predefined tracks" button brings up this control panel:

Add predef tracks panel.png

It provides the user with options to search the available tracks, filter based on which collaboration the track was shared with, and then select one or more tracks to add to the view.

Saving and Sharing Views

Any user who is logged into the system can save and share view configurations with other users. This is done with the "save view" button (Save view button.jpg) at the top of the genome browser, which brings up the "save configuration panel".

Saveview panel.png

When saving a view, users can enter both a "configuration name" and a description of the view. All text entered will be searchable with the metadata search system at a later time in the data explorer.

The user must also select the user collaboration into which the view will be saved and shared. Views are only shared into a single collaboration, but can later be moved to a different collaboration through the editing panel, which can be accessed in the data explorer views section.

Saveview select collaboration.png

After saving, the top of the genome browser view will display the new "configuration name" and description, and a new unique URL is generated which allows direct access to this view.

Saveview new url title desc.png

Only users who have logged into the system can save and share views with others. Guests will be reminded to log in before trying to save.
Saveview user login.png

Exporting View as SVG Image

Export svg panel.png

ZENBU can export the current view visualization as a publication-grade SVG image. Once you are happy with your view, click the "export svg" button (Export svg button.png). This will bring up an interface panel which allows for refinement of how the SVG export is generated.

  • hide widgets : will not draw the widgets of the title bar
    Glyphs track titlebar.png
  • hide track sidebars : will not draw the left side dark gray side bar
  • hide title bar : will not draw the title bar colored background, but will still draw the track title
  • hide experiment/expression graph : will not export the experiment/expression graph part of the display
  • hide compacted tracks : will not export any track which has been compacted to help clean up the final image
  • save to file : enable the export into a file
  • cancel : will cancel the SVG export
  • export svg: will send the SVG either into another window of your web browser or to a file.

Track control widgets

The widgets and controls on each track enable the user to modify and move the track. The controls are located in the title-bar section of each track (Glyphs track titlebar.png).

  • close track (Track controls-closetrack.jpg): deletes the track from the view.
  • reconfigure track (Track controls-reconfigure track.jpg): brings up the Reconfigure Track interface, which allows users to reconfigure how the track is built (data processing and visualization).
  • copy track (Track controls-copytrack.jpg): duplicates a track in the view. This is very useful in combination with track reconfiguration, since it is sometimes easier to take an existing track and modify it to get the desired track configuration.
  • download processed data (Track controls-download.jpg): brings up the Download Track data interface. This enables users to download their processing results in several different file formats for use in external systems or for additional analysis. This panel also provides an interface for user control of the TrackCache building system.
  • activate/deactivate track: changes the active state of a track. The arrow shown as Glyphs track controls active.png means the track is in an active/expanded state; clicking on it causes the track to deactivate and not draw its content. The arrow shown as Glyphs track controls deactive.png means the track is in a deactivated/compacted state; clicking on it causes the track to become active.

Genome location navigation controls

There are several ways the user can change the genomic location of the genome browser.

Glyphs navigation buttons.png

For example, one can search for gene symbols. In this case, the "Entrez Gene" track has been loaded into the view and the search finds matching genes.

Glyphs feature-search-egr1.png

Or one can search the descriptive metadata; in this case, it again finds matches from the loaded "Entrez Gene" track.

Glyphs gene-search-metadata.png

  • direct entry of region location. A location can be typed directly into the search box, or one can click the current location text, which copies it into the search box where it can be edited to the new location. ZENBU supports many different location string formats.

Gylyphs direct coordinate entry.png

For details please refer to the Region Location section of the documentation.
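To give a feel for what parsing a directly entered location involves, here is a minimal Python sketch. It assumes only the common `chrom:start-end` form with optional thousands separators; the formats ZENBU actually accepts are described in the Region Location section, and the `parse_location` function is hypothetical.

```python
import re

# Assumed format: "<chrom>:<start>-<end>", digits may contain commas.
LOCATION_RE = re.compile(r"^\s*(\S+?):([\d,]+)-([\d,]+)\s*$")

def parse_location(text):
    """Parse a 'chrom:start-end' string into (chrom, start, end)."""
    match = LOCATION_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    chrom, start, end = match.groups()
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    # Tolerate reversed coordinates by swapping them.
    if start > end:
        start, end = end, start
    return chrom, start, end

region = parse_location("chr5:137,801,000-137,805,000")
```

A string that does not fit the assumed pattern raises `ValueError`, which a caller could use to fall back to a gene or metadata search instead.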