Track visualization styles
Revision as of 02:13, 31 October 2012
The tracks in the ZENBU gLyphs genome browser fall into three main categories of visualization style:
- Annotation tracks: the data sources contain only genomic information and no expression.
- Expression tracks: expression level is displayed without feature boundaries, in a style similar to the UCSC genome browser 'wiggle' but in a user-interactive tool.
- Hybrid tracks: ZENBU-enhanced visualizations which allow processed data to contain both genomic features and multi-experiment expression data.
Annotation Tracks
Annotation tracks are for the visualization of genomic positional data within the ZENBU genome browser.
The visualization style can be changed in the Visualization section of the track configuration panel.
Different visualization styles vary in the amount of information displayed and the amount of vertical screen space used. For dense data tracks, a more compact visualization may be better, depending on how one will use it. The strand of the annotation is color coded: green is the forward (+) strand and purple is the reverse (-) strand.
Expression Tracks
Expression tracks are for the visualization of numerical expression data from Experiment Data Sources in the ZENBU genome browser on a segmented genomic grid similar to the UCSC genome browser wiggle visualization.
Express signal visualization
In the express visualization, expression is visualized as a signal-height graph along genomic coordinate space. ZENBU can visualize both strandless and stranded expression signal.
Here is an example of FANTOM4 CAGE signal, which is stranded in nature.
Here is an example of FANTOM4 ChIP-chip signal, which is strandless in nature.
Here is an example of ENCODE strandless-protocol RNAseq signal configured to display expression signal only in areas of sequence alignment (skipping alignment gaps):
http://fantom.gsc.riken.jp/zenbu/gLyphs/#config=l_D-jGt1IlehEahizVAMeB;loc=hg19::chr8:128746973..128755020
Here is an example of ENCODE stranded-protocol RNAseq exonic expression signal
To create this style of visualization, the primary expression data must be processed into the dynamic genomic segmented grid, either with the expression binning script in the graphical interface (GUI) processing modules or with a custom data processing script.
The expression binning processing GUI parameters are as follows:
- overlap mode: since ZENBU can work directly with sequence alignment data (often uploaded from BAM files), the alignments must be modified in order to be visualized properly. The options are:
  - area under the curve: the expression is spread evenly along the length of the alignment so that the area under the curve represents the level of expression. This only affects alignments which overlap more than one of the genomic segmentation bins. If all alignments are shorter than the genomic segmentation, then area and height modes generate the same visualization.
  - height: the expression is collated so that the height of the curve represents the level of expression at the genomic segment. This only affects alignments which overlap more than one of the genomic segmentation bins. If all alignments are shorter than the genomic segmentation, then area and height modes generate the same visualization.
  - 5'end: the expression signal is concentrated at the 5'end of the sequence alignment prior to being collated into the genomic segmentation binning. This is primarily used for CAGE-based sequencing experiments.
  - 3'end: the expression is concentrated on the 3'end of the sequence alignment prior to being collated into the genomic segmentation binning. Currently there are few RNA sequencing technologies which can use this mode of processing, but it is included for new technology development.
- expression binning: the mathematical operation used when multiple expression values from the same experiment collate into the same genomic segmentation bin. Each Experiment is kept distinct, and this math is applied across different expression features within the same Experiment. The options are:
  - sum: sum the different expression values within each experiment
  - min: calculate the minimum of the different expression values of each experiment
  - max: calculate the maximum of the different expression values of each experiment
  - mean: calculate the mean of the different expression values of each experiment
  - count: simply report the count of different expression values within each experiment that collate into the genomic segmentation bin
- fixed bin size: by default the processing script creates dynamic bin sizes based on the zoom level of the genomic view and the width of the display, so that each segmentation bin maps approximately to a single pixel width on the screen. This ensures that a fine enough visualization resolution is preserved without creating unneeded sub-pixel resolution. If a finer or coarser segmentation binning is desired, it can be entered here. For example, the track above uses a 100 base pair fixed bin size.
- process ignoring strand: if the primary expression experiments use a strandless protocol, or one wishes to process stranded expression in a strandless manner, check this option; a strandless genomic segmentation binning grid will be used and the strand of the primary data will be ignored. If the data is processed as strandless, it is best to also select the strandless option within the visualization options.
- overlap via subfeatures: RNA sequencing experiments sometimes generate gapped sequence alignments when an RNA molecule spans an intronic splicing junction. This information is contained in BAM files and is preserved during ZENBU uploading. To get an accurate visualization of true RNA exonic signal, these intronic gaps should not be collated into the genomic segmentation bins. The ENCODE Wold lab RNAseq experiments in the example above contain such gapped alignments. Here is this BAM sequence alignment data processed without this option enabled, so that both RNA exon and intron signal is collated into the expression visualization.
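The binning parameters above can be illustrated with a small Python sketch. This is not ZENBU's implementation: the function names (`dynamic_bin_size`, `spread`, `bin_expression`), the tuple layout of an alignment, and the bin-numbering scheme are assumptions made for illustration. In particular, it assumes that "height" assigns the full expression value to every overlapped bin, while "area" splits the value in proportion to the bases covered.

```python
from collections import defaultdict
from statistics import mean


def dynamic_bin_size(region_start, region_end, display_width_px):
    """Bin size in bases so that one bin maps to roughly one screen pixel."""
    span = region_end - region_start + 1
    return max(1, span // display_width_px)


def spread(start, end, strand, value, bin_size, overlap_mode):
    """Yield (bin_index, value) pairs for one alignment.

    Coordinates are inclusive; bin b covers b*bin_size .. (b+1)*bin_size - 1.
    """
    if overlap_mode == "5end":
        pos = start if strand == "+" else end
        yield pos // bin_size, value
    elif overlap_mode == "3end":
        pos = end if strand == "+" else start
        yield pos // bin_size, value
    else:
        length = end - start + 1
        for b in range(start // bin_size, end // bin_size + 1):
            if overlap_mode == "height":
                # Assumption: every overlapped bin receives the full value.
                yield b, value
            else:
                # "area": share the value in proportion to the bases covered.
                bin_start = b * bin_size
                bin_end = bin_start + bin_size - 1
                covered = min(end, bin_end) - max(start, bin_start) + 1
                yield b, value * covered / length


# The "expression binning" operations applied within one experiment.
BINNING_OPS = {"sum": sum, "min": min, "max": max, "mean": mean, "count": len}


def bin_expression(alignments, bin_size, overlap_mode="area", binning="sum"):
    """Collate (start, end, strand, value) alignments of one experiment into bins."""
    per_bin = defaultdict(list)
    for start, end, strand, value in alignments:
        for b, v in spread(start, end, strand, value, bin_size, overlap_mode):
            per_bin[b].append(v)
    op = BINNING_OPS[binning]
    return {b: op(values) for b, values in sorted(per_bin.items())}
```

For an alignment that straddles a bin boundary, "area" and "height" give different results, while the end modes place the whole value in a single bin. Each experiment would be binned separately, matching the rule that experiments are kept distinct.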
Additional visualization options available for expression Experiment tracks (visualization style of express)
- hide empty experiment: this parameter affects the track-linked Experiment Expression panel. If selected, only those Experiments with a non-zero expression value are displayed.
- color expression: currently has no effect when the track is in express mode.
- display datatype: depending on how the track was configured and processed there may be more than one datatype available for visualization. If more than one is available, please select.
- background color: the background color can be altered to help visually group related tracks in very large views. The color can be specified using any of the HTML web color syntaxes (named colors, #FFFFFF style or rgb(255,255,255) style).
- track pixel height: adjusts the screen height of the track. This can also be adjusted by click-dragging the resize widget on the left side of the track.
- express scale: adjusts the numerical scale at which the expression values are displayed. By default this is auto, meaning that the expression track is visually rescaled to fit into the height of the track. If one desires to use a fixed scaling among several tracks, this can be set here. Tracks with more expression than this scale limit are clipped.
- log scale: for visualization the expression can be dynamically compressed onto a log scale. If the expression has huge dynamic range, this can be helpful to expand the low background signal and compress the higher peaks. For example here is the FANTOM4 CAGE expression track from above visualized on a log scale.
- strandless: this visualization option should be set in coordination with data processing which is also strandless.
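The effect of the express scale and log scale options can be sketched in Python as follows. This is only an illustration: the function names, the use of log(1 + x) as the compression, and the restriction to non-negative values are assumptions, and ZENBU may compute its scaling differently.

```python
import math


def auto_scale(values):
    """Assumed 'auto' behaviour: scale to the largest value in view."""
    return max(values, default=0.0) or 1.0


def signal_height_px(value, track_height_px, express_scale, log_scale=False):
    """Map one non-negative expression value to a bar height in pixels."""
    # Values above the scale limit are clipped.
    clipped = min(max(value, 0.0), express_scale)
    if log_scale:
        # Assumed compression: log(1 + x), normalised to the scale limit.
        fraction = math.log1p(clipped) / math.log1p(express_scale)
    else:
        fraction = clipped / express_scale
    return round(fraction * track_height_px)
```

With a scale limit of 99 and a 100 pixel track, a value of 9 draws as 9 pixels on the linear scale but 50 pixels on the log scale, which is the "expand the low background, compress the high peaks" effect described above.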
Expression spectrum visualization
The same expression-binning data processing can also be displayed in the spectrum visualization style. This is a visualization style for expression data, but it can also be applied to non-overlapping hybrid data. It draws the expression on a single layer of a track, using only the false-color-spectrum to visualize expression differences. It can be used in combination with normal "expression binning" processing or with more advanced scripts. The spectrum visualization does not display strand information, so processing should be done in a strandless manner or in a separated-strand manner using custom scripting.
This example RNAseq data is processed to display only exonic signal (no gaps/introns) and displayed with the "blue1" false-color-spectrum. This style of visualization gives a very compact track which allows people to use it in situations where they might need many separate expression tracks.
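As a rough illustration of false coloring, the Python sketch below maps an expression value onto a linear two-color gradient. The actual color stops of ZENBU spectra such as "blue1" are not given here, so the white-to-blue gradient and the `spectrum_color` function are purely hypothetical.

```python
def spectrum_color(value, max_value, low=(255, 255, 255), high=(0, 0, 255)):
    """Return an HTML '#RRGGBB' color for a value on a linear gradient.

    The default white-to-blue gradient is an assumed stand-in, not ZENBU's
    real 'blue1' spectrum.
    """
    if max_value <= 0:
        fraction = 0.0
    else:
        # Clamp to the 0..1 range so out-of-scale values saturate.
        fraction = min(max(value / max_value, 0.0), 1.0)
    r, g, b = (round(lo + (hi - lo) * fraction) for lo, hi in zip(low, high))
    return f"#{r:02X}{g:02X}{b:02X}"
```

Because each genomic segment needs only one colored cell rather than a bar of variable height, the whole track fits on a single layer, which is what makes this style compact.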
Hybrid Tracks
Hybrid tracks are ZENBU advanced visualization tracks which combine genomic annotation and expression Experiments, often together with ZENBU data processing, to create novel visualizations. Several different types of visualization can be categorized as hybrid tracks.
Expression false coloring of genomic features
In this style of hybrid visualization, genomic annotation Features have expression collated onto them. This can be generated inside ZENBU by a data processing script, or loaded from BED files with the score-as-expression loading option, or from OSCtable files with combined annotation and expression. This visualization is enabled by selecting one of the annotation visualization styles, checking the color expression box and selecting a false color spectrum.
For example, here is a track which uses ZENBU data processing of the ENCODE Wold lab RNAseq expression (loaded via BAM files) collated into Gencode gene models to give gene expression. The processed data is then visualized as a hybrid track with the transcript visualization style and the color expression option with the fire1 false-color-spectrum.
The top track is the hybrid track showing the processed gene expression. The second track is the RNAseq expression signal, which was collated into the Gencode gene models shown in the third track. For details on how to create tracks like this, please see the case study RNAseq_expression_collated_onto_gene_models.
Here is a variation on the previous collated-expression situation, but here we use advanced scripting to dynamically generate new genomic-features from the primary data and then use false-color-spectrum to show their abundance.
In this track RNAseq alignment gaps are extracted by ZENBU processing into new genomic-features and then "uniqued" and counted. In gapped RNAseq, long gaps mainly occur because of RNA spanning introns and these gaps represent evidence for introns. These "intron evidence" features are then filtered for length and minimum abundance before being displayed using "medium-exon" and a "fire1" spectrum.
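The gap extraction, "uniquing", counting and filtering steps described above can be sketched in Python as follows. The input layout (each alignment as a chromosome, a strand and a sorted list of aligned blocks), the function name `intron_evidence`, and the threshold defaults are assumptions for illustration; the actual ZENBU processing script is not shown here.

```python
from collections import Counter


def intron_evidence(alignments, min_length=50, min_count=2):
    """Count identical alignment gaps and keep the long, well-supported ones.

    alignments: iterable of (chrom, strand, blocks), where blocks is a
    sorted list of (start, end) aligned segments with inclusive coordinates.
    Returns {(chrom, strand, gap_start, gap_end): count}.
    """
    gaps = Counter()
    for chrom, strand, blocks in alignments:
        # Each pair of neighbouring blocks defines one gap between them.
        for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:]):
            gap_start, gap_end = prev_end + 1, next_start - 1
            if gap_end - gap_start + 1 >= min_length:
                # Identical gaps collapse onto one key ("uniqued") and are counted.
                gaps[(chrom, strand, gap_start, gap_end)] += 1
    return {gap: n for gap, n in gaps.items() if n >= min_count}
```

The resulting counts play the role of the expression value that the false-color-spectrum then displays on each "intron evidence" feature.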
Experiment Heatmap visualization
This is a visualization style for datasource pooled tracks with many experiments and expression. In this style of visualization each experiment is given a unique horizontal layer in the image, vertical slices represent genomic segments, and the false-color-spectrum is applied to the expression value at the intersection of genomic-position and experiment. This style of visualization simultaneously shows spatial variation in expression and differential expression between experiments.
In this example the RNAseq data is processed for exonic signal and binned into a genomic-segmentation grid, and the experiments are sorted by highest expression value.
Hovering over elements in the heatmap reveals the name of the experiment, the location and the expression value collated into that genomic segment.
The order of experiments matches the order in the linked Experiment-expression graph, and re-sorting in that panel changes the sort order of experiments in the heatmap. Here the sort order is changed to be by sample cell-type name.
Here is another example, showing more dramatic differential expression and spatial differences in RNAseq exonic signal among different ENCODE samples.
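The heatmap layout described in this section, with one row per experiment, one column per genomic segment, and a row order that follows the chosen sort, can be sketched in Python. The data layout and the `heatmap_rows` function are hypothetical, and the default sort (descending total expression) is only one assumed reading of sorting by expression value.

```python
def heatmap_rows(expression, bins, sort_key=None):
    """Build heatmap rows from per-experiment binned expression.

    expression: {experiment_name: {bin_index: value}}
    bins: ordered bin indexes forming the columns (genomic segments).
    Returns a list of (experiment_name, [value per bin]) in display order.
    """
    if sort_key is None:
        # Assumed default: experiments with the most total expression first.
        def sort_key(name):
            return -sum(expression[name].values())
    names = sorted(expression, key=sort_key)
    # Bins with no collated expression are shown as zero.
    return [(name, [expression[name].get(b, 0.0) for b in bins]) for name in names]
```

Passing `sort_key=str.lower` would reorder the rows by experiment name instead, analogous to re-sorting by sample cell-type name in the linked panel. Each cell value would then be converted to a color by the selected false-color-spectrum.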