RNAseq expression collated onto gene models

From ZENBU documentation wiki

In this example we show how to take RNAseq sequence alignments from BAM files, treat them as RNA expression, and collate that expression onto gene models to obtain gene expression for data download or visualization.

Here is the final set of tracks we will be building. The top track is our target track: it uses ZENBU data processing to collate the ENCODE Wold lab RNAseq expression (loaded via BAM files) onto Gencode gene models, giving gene expression. The result is visualized as a hybrid track with the transcript visualization style and the color expression option, using the fire1 false-color spectrum.
Hybrid track gencode gene RNAseq expression.jpg
The top track is the hybrid track showing the processed gene expression. The second track is the RNAseq expression signal, and the third track shows the Gencode gene models onto which that signal was collated.
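To make the idea of collation concrete, here is a minimal Python sketch that counts how many alignments overlap each gene model. It is not ZENBU's processing script: the Gene and Alignment classes, the toy coordinates, and the rule "any overlap on the same chromosome counts, strand ignored" are all assumptions made for illustration. Real gene models have exon structure, and a real pipeline may treat strand and exons differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    # Hypothetical, simplified gene model: a single span with no exon structure.
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass(frozen=True)
class Alignment:
    # Hypothetical stand-in for one aligned read taken from a BAM file.
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlaps(gene: Gene, aln: Alignment) -> bool:
    """True if the alignment shares at least one base with the gene span."""
    return (
        gene.chrom == aln.chrom
        and aln.start <= gene.end
        and aln.end >= gene.start
    )


def collate(genes, alignments):
    """Count the alignments overlapping each gene (assumed collation rule)."""
    counts = {gene.name: 0 for gene in genes}
    for aln in alignments:
        for gene in genes:
            if overlaps(gene, aln):
                counts[gene.name] += 1
    return counts


# Toy data, invented for this example.
genes = [
    Gene("geneA", "chr19", 100, 200),
    Gene("geneB", "chr19", 300, 400),
]
alignments = [
    Alignment("chr19", 90, 125),   # overlaps the start of geneA
    Alignment("chr19", 150, 185),  # inside geneA
    Alignment("chr19", 390, 425),  # overlaps the end of geneB
    Alignment("chr19", 500, 535),  # intergenic, not counted
    Alignment("chr1", 150, 185),   # different chromosome, not counted
]

gene_expression = collate(genes, alignments)
print(gene_expression)
```

The nested loop keeps the sketch short; with real data volumes an interval index or a sorted sweep would be the usual choice.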

Loading RNAseq data from BAM files

ZENBU can work directly with BAM sequence alignment files, so there is no need to post-process your alignments before loading them into ZENBU. You can align your RNAseq sequences to the genome using common alignment programs such as TopHat or BWA, or the sequence alignment pipelines provided with your sequencing instrument. If your RNAseq is sequenced by a sequencing service, they will most likely deliver your results in the form of BAM files. Simply upload these BAM files into ZENBU for use in creating tracks like this one. Your RNAseq BAM alignments can be reused for other ZENBU data processing and visualization, so you only need to upload them once.

Creating your track

The next step is to launch the genome browser with a starting view of the genome of interest. In our example we work with the human hg19 assembly. Here is a simple view to start with; it includes the "Gencode V10" gene model track onto which we will collate our RNAseq expression.
http://fantom.gsc.riken.jp/zenbu/gLyphs/#config=7x5Wj-0HHFQuqDbjStXITB;loc=hg19::chr19:50161252..50170707
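The link above carries the view state in its URL fragment: a config identifier and a loc region. The following Python sketch splits such a link into its parts. The layout it assumes (key=value pairs separated by ";", and loc written as assembly::chromosome:start..end) is inferred from this one example URL only, and parse_zenbu_view is a hypothetical helper name, not part of ZENBU.

```python
import re
from urllib.parse import urlsplit

# Pattern inferred from the single example link; treat it as an assumption.
LOC_PATTERN = re.compile(
    r"(?P<assembly>[^:]+)::(?P<chrom>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+)"
)


def parse_zenbu_view(url: str) -> dict:
    """Split a browser link into config id, assembly, chromosome and range."""
    fragment = urlsplit(url).fragment
    # Build a dict from the "key=value" pieces of the fragment.
    params = dict(
        part.split("=", 1) for part in fragment.split(";") if "=" in part
    )
    match = LOC_PATTERN.fullmatch(params.get("loc", ""))
    if match is None:
        raise ValueError(f"unrecognized loc in fragment: {fragment!r}")
    return {
        "config": params.get("config"),
        "assembly": match["assembly"],
        "chrom": match["chrom"],
        "start": int(match["start"]),
        "end": int(match["end"]),
    }


view = parse_zenbu_view(
    "http://fantom.gsc.riken.jp/zenbu/gLyphs/"
    "#config=7x5Wj-0HHFQuqDbjStXITB;loc=hg19::chr19:50161252..50170707"
)
print(view)
```

A helper like this can be handy for recording which assembly and region a shared view points at.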

From the genome browser, select the "configure new track" button (Add-track.jpg).

Selecting your RNAseq Experiments

In the Data Sources section of the new track panel, search for the ENCODE Wold lab RNAseq experiments. We have already loaded these, so for this example you do not need to reload them. The Wold lab ENCODE RNAseq data also contains some post-processed BAM files that contain only reads spanning splice junctions. ZENBU can work with the complete set of alignments, so we can skip these "spliced alignments". To find the experiments we want, search with the terms "encode wold rnaseq !splice". This should return 47 different experiments. Then click "select all".
Wold rnaseq track search.jpg
You can play around with different search phrases to see what other ENCODE data is in the system, but for our example let's work with these 47 experiments.
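The search phrase above combines required keywords with an excluded one. As a rough illustration, the Python sketch below filters a list of experiment descriptions the same way. It assumes that every plain term must appear, that a term prefixed with "!" must not appear, and that matching is case-insensitive substring matching. These rules are read off the example query and are not a description of ZENBU's actual search engine; the descriptions are invented.

```python
def matches(description: str, query: str) -> bool:
    """Return True if the description satisfies every term of the query.

    Assumed semantics: plain terms are required, "!term" is excluded,
    and both are compared as case-insensitive substrings.
    """
    text = description.lower()
    for term in query.lower().split():
        if term.startswith("!"):
            excluded = term[1:]
            if excluded and excluded in text:
                return False
        elif term not in text:
            return False
    return True


# Invented experiment descriptions, for illustration only.
experiments = [
    "ENCODE Wold lab RNAseq sample 1 alignments",
    "ENCODE Wold lab RNAseq sample 1 spliced alignments",
    "ENCODE other lab CAGE sample 2",
]

selected = [
    name for name in experiments
    if matches(name, "encode wold rnaseq !splice")
]
print(selected)
```

Under these assumptions only the first description is kept: the second is dropped by the "!splice" term and the third lacks "wold" and "rnaseq".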

Applying gene collation script

TODO

Visualization

In this style of hybrid visualization, genomic annotation Features have expression collated onto them. The expression can be generated inside ZENBU by a data processing script, loaded from a BED file using the score-as-expression loading option, or loaded from OSCtable files that combine annotation and expression. The visualization is enabled by selecting one of the annotation visualization styles, checking the color expression box, and selecting a false-color spectrum.
Hybrid track colored features config.jpg
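For the BED route mentioned above, the expression value travels in the score column (the fifth column) of each BED line. The Python sketch below writes such lines from a small table of per-gene values. The gene names, coordinates, and values are invented, and to_bed_line is a hypothetical helper. Note that the standard BED format expects an integer score from 0 to 1000; whether other values are accepted under the score-as-expression option is not covered here, so the example stays within that range.

```python
def to_bed_line(chrom, start_1based, end_1based, name, expression, strand):
    """Format one gene as a 6-column BED line with expression as the score.

    BED coordinates are 0-based and end-exclusive, so a 1-based inclusive
    start is shifted down by one and the end is kept unchanged.
    """
    fields = [
        chrom,
        str(start_1based - 1),
        str(end_1based),
        name,
        str(expression),
        strand,
    ]
    return "\t".join(fields)


# Invented per-gene expression values, for illustration only.
genes = [
    ("chr19", 100, 200, "geneA", 2, "+"),
    ("chr19", 300, 400, "geneB", 1, "-"),
]

bed_lines = [to_bed_line(*gene) for gene in genes]
print("\n".join(bed_lines))
```

Writing these lines to a file gives one annotation feature per gene with its expression attached, which is the shape of input the hybrid visualization colors.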