Data Download

From ZENBU documentation wiki

Data from any Track in the ZENBU system can be downloaded in several different output formats. Downloading is available for any track configuration, whether it is a simple single Data Source, a track with a virtual pool of data sources, or a complex track with processing.

To enable fast downloading for users, Tracks are built in parallel on a computing cluster using the TrackCaching_System. This provides several benefits. If a user has selected a predefined/shared track, the TrackCache will already be prebuilt and download is immediately available. If a user configures a new track but happens to define it with the same content (same data pool and script) as a previous track in the TrackCache, the new track will use that TrackCache and its already built segments. Another advantage of the TrackCache system is that the track is built in parallel segments, so data download becomes available as soon as the requested region segment is built; there is no need to wait for the entire track to be built.
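The reuse of a TrackCache by tracks with identical content can be pictured as a content-derived key. The following is only a minimal sketch of that idea, not ZENBU's implementation: the function name `track_cache_key`, the use of SHA-256, and the assumption that the order of sources in the data pool does not matter are all hypothetical.

```python
import hashlib
import json


def track_cache_key(data_sources, script):
    """Hypothetical helper: derive a cache key from track content only.

    Assumption: two tracks are "the same" when they have the same set of
    data sources and the same processing script, regardless of track name.
    """
    payload = json.dumps(
        {"sources": sorted(data_sources), "script": script},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two tracks configured separately, but with the same pool and script.
key_a = track_cache_key(["exp2", "exp1"], "<zenbu_script/>")
key_b = track_cache_key(["exp1", "exp2"], "<zenbu_script/>")
# A track with a different script gets a different key.
key_c = track_cache_key(["exp1", "exp2"], "<other_script/>")

print(key_a == key_b)
print(key_a == key_c)
```

Because the key depends only on content, a newly configured track that matches an existing one would resolve to the same cache entry and its already built segments.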

The interface for track data download is available through the ZENBU genome browser interface, via the "track download widget" (Track controls-download.jpg).
Clicking on this widget will bring up the "download track data" panel.

Download Track Data panel

Track download panel.jpg

The Track Data download panel provides both an interface to the Data Download system and the TrackCache building system. Data download only becomes available once the genome segments for the track have been built into the TrackCache.

Once the TrackCache data is available, the user can select the file format for download.
Track download file-options.jpg
Download file formats include: OSCTable, BED12, BED6, BED3, GFF, zenbu xml, and DAS xml. There is the option to "save to file" or to see the result in a web browser window.

The OSCTable file option has several additional controls for how the file is generated. OSCTable is a general-purpose tab-delimited text table with additional textual elements to help with loading and parsing. In its most basic form, OSCTable is no different from an Excel table with header names. If one chooses OSCTable and exports neither the oscheader metadata nor the experiment metadata, the output file is completely Excel and R friendly. OSCTable is the preferred method for data export, since it ensures that all the data of the track is exported, including multi-experiment expression on genomic features. For multi-experiment expression, the different experiments and datatypes are simply appended as additional columns. The genomic-coordinate data for OSCTable follows the BED6 and BED12 formats, so a ZENBU OSCTable export can be considered a BED6/BED12 file with additional columns for Feature metadata or multi-experiment expression, which makes it usable by many bioinformatics systems and pipelines.
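As an illustration of how such a table might be loaded outside of Excel or R, here is a minimal Python sketch. It assumes that any metadata lines start with `#` and that the first remaining line holds the column names; the column names and values in the sample (`chrom`, `expA`, `expB`, and so on) are hypothetical and are not taken from a real ZENBU export.

```python
import csv
import io


def read_table(handle, comment_prefix="#"):
    """Read a tab-delimited table with a header row into a list of dicts.

    Assumption: metadata lines begin with comment_prefix and can be skipped;
    the first non-comment line contains the column names.
    """
    rows = (
        line
        for line in handle
        if line.strip() and not line.startswith(comment_prefix)
    )
    return list(csv.DictReader(rows, delimiter="\t"))


# Hypothetical sample: BED6-like columns followed by two expression columns.
sample = (
    "# hypothetical metadata line\n"
    "chrom\tstart\tend\tname\tscore\tstrand\texpA\texpB\n"
    "chr13\t100\t200\tfeat1\t0\t+\t5\t7\n"
    "chr13\t300\t450\tfeat2\t0\t-\t2\t0\n"
)

records = read_table(io.StringIO(sample))
for rec in records:
    print(rec["chrom"], rec["start"], rec["end"], rec["expA"], rec["expB"])
```

For a downloaded file, the same function could be called with `open(path, newline="")` in place of the in-memory sample.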

Track building interfaces

Since the data download system relies on the TrackCache system, the Data download panel also serves as the interface to help prioritize track building. The TrackCache building system is automated and runs in the background. As new tracks are created by users, they are inserted into the TrackCache system. Any time a genome browser view tries to access a track for visualization, it first tries to get the data from the TrackCache. If the TrackCache segment is not built, the webservice logs an "anonymous build request" so that this region can be visualized faster the next time a user comes to the region. The long ID shown in gold is the unique hashkey of the track.
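The lookup-then-request behaviour described above can be sketched as follows. This is a toy model under stated assumptions, not the ZENBU webservice: the class `SegmentCache`, its methods, and the hashkey and segment strings are hypothetical names chosen for illustration.

```python
class SegmentCache:
    """Toy model of a per-segment track cache with a build-request log."""

    def __init__(self):
        self.built = {}      # (track_hashkey, segment) -> segment data
        self.requests = []   # build requests, in the order they were logged

    def store(self, track_hashkey, segment, data):
        """Record a segment as built (what a cluster worker would do)."""
        self.built[(track_hashkey, segment)] = data

    def fetch(self, track_hashkey, segment, user=None):
        """Return built data, or log a build request and return None."""
        key = (track_hashkey, segment)
        if key in self.built:
            return self.built[key]
        request = (track_hashkey, segment, user or "anonymous")
        if request not in self.requests:
            self.requests.append(request)
        return None


cache = SegmentCache()
cache.store("hypothetical-hashkey", "chr13:0-100000", ["feature1"])

hit = cache.fetch("hypothetical-hashkey", "chr13:0-100000")
miss = cache.fetch("hypothetical-hashkey", "chr13:100000-200000")

print(hit)
print(miss)
print(cache.requests)
```

In this sketch, a view of an unbuilt segment returns nothing immediately but leaves a request behind, which is the role the "anonymous build request" plays in prioritizing regions that users actually visit.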

Users can track their download requests

When users are logged into the ZENBU system, they can make personal requests to the track building system. These requests not only help prioritize their tracks of interest but are also logged on their personal Downloads page. Users can check the progress of their download builds from both this interface and their personal Downloads page.

For example, here is a new track which is still not completely built for chr13. The interface offers the user the option to make a "build request", which will then be logged on their Downloads page.
Track download unbuilt.jpg

After the request has been made, the user will see this in the track downloads panel.
Track download already requested.jpg

If users go to their User "download tab", they will see the status of their previous download/track-build requests. Tracks which are currently building display a percentage complete. Track-build requests which have completed offer a <download> button, which brings up the "Download Track data" panel.
User downloads panel.jpg

Anonymous TrackCache building requests

If users are not logged into the ZENBU system (guests), they can make anonymous requests to the track building system to help prioritize their regions of interest.
Track download guest request.jpg

If the TrackCache building system already has workers building the region, guests will simply see the following and will need to check back periodically to see whether the build has completed.
Track download build in progress.jpg