TrackCaching System

From ZENBU documentation wiki
Jump to: navigation, search

Track Caching system

To enable fast and reliable downloading of processed data and to speed up visualization of processed track we implemented a TrackCache system based around a new binary file format called ZDX (Zenbu Data eXchangeformat).

The TrackCache is based on the concept of unique track description. Scripts are parsed such that their sole effective content (that is regardless of their formatting, the unnecessary declaration of parameters with the default value, ...) matters. This means that different people starting from scratch building tracks can generate the same "track description" (similar datasources and processing but different track title, indentation of the script, addition of comments or annotation with the script) and behind the scenes use the same trackcache.

ZENBU system overview

ZENBU system flow.png

A flow diagram showing an overview of ZENBU server-side query flow, data streaming and stream processing and how the TrackCache and ZDX files are used by the system

ZDX (Zenbu Data eXchange) format binary file

To enable fast and reliable downloading of processed data and to speed up visualization of processed track we implemented a TrackCache system based around a new binary file format called ZDX (Zenbu Data eXchangeformat). The TrackCache system effectively acts as a "query cache" for retrieving processed data for tracks. ZDX was designed to provide fast random access read/write of ZENBU results into files to enable the TrackCache system to have not only fast read performance, but also fast write performance.

The ZDX file is based on the concepts of filesystems with File-allocation-tables and inodes and file-blocks. In ZDX data is stored into znodes, but unlike filesystems where every file-block on the disk is the same size, znodes have a flexible size. The ZDX file header allows for many different subsystems of data to be stored in the same ZDX file.
The main purpose of ZDX is (like a filesysstem) to allow not only fast random access, but also to allow random-access writing of blocks of data into the ZDX file just like one can add a new file into a directory of a filesystem without the filesystem having to reorganize other blocks on the disk. ZDX is a single file binary format and does not use directories or organize itself. Instead the ZDX file uses the concepts of filesystems like File-Allocation-Tables [1], inodes [2] and blocks [3] to allow for structuring data inside the ZDX file thus allowing for random-access writing of data into the ZDX file. This is in contrast to other binary file formats in genome science which are linear in nature and inserting data into such files requires the files to be re-sorted and re-indexed.

The ZDX design thus allows for the genomic data inside the ZDX file to be always sorted even when it is partially built.
Features with expression and metadata are stored in the ZDX segments as compressed ZENBU xml using the LZ4 compression algorithm. This provides very fast compress and decompress times with still excellent compression ratios. We do not want to waste too many CPU cycles on compression/decompression. This ensures very fast read/writing of data into the ZDX segments even though it it compressed.

The main sections of how we use ZDX files in our TrackCache is

  • a section dedicated to the DataSources of the track
  • a pre-segmented genome (TrackCache uses a 100kb non-overlapping segment)
  • a sorted Feature/Expression array attached to each segment.

Because the genome is presegmented, it is possible with ZDX to independently build each segment. As a segment is built, the data is written into a znode and appended onto the end of the ZDX file. Since everything is done with inode-like znode pointers, the actual location of the znode in the file is irrelevant. When building a TrackCache segment the TrackCacheBuilder will create a linked-list of znodes where each znode is kept around 200kb. Because the TrackCacheBuilder builds one-segment at a time and the features come out of the ZENBU streaming in sorted order, the writing into the segment is in sorted order. Therefore the ZDX file is always in sorted order, there is never a need to resort the entire file. The ZDX file built with very efficient locking so that 100s of TrackBuilders can be working on creating the same ZDX file at the same time and are only limited by the disk performance of the system.

For more technical details on the ZDX file design please check out the ZDX file wiki page.

ZDX enables Map/Reduce parallelization

Because of the file-system-like design approach in ZDX, we are able to randomly build different parts of the file at the same time.

Since TrackCache ZDX has a presegmented genome, it naturally enables MapReduce style building of the complete genome. And the order of segment building does not matter. The TrackCacheBuilders use the ZENBU API for data streaming and data processing so generate the same result as the webservices.

eHive based system of autonomous-agent and work-claim design

The TrackCacheBuilders follow an autonomous-agent and work-claim design originally developed in the eHive system. TrackCacheBuilders do not need to be told what to do, they check a black-board database (like ehive) for trackcache's which are unbuilt and for user requests for region building. Once they have initialized to a particular TrackCache, they can either build user-requested segments or randomly pick an unbuilt segment. Like eHive the autonomous-agent TrackCacheBuilder workers first lock-and-claim a segment (fast no-race condition) and then proceed to build the segment at whatever pace the dataprocessing allows. This enables 100s-1000s of workers to simultaneously work on the same ZDX TrackCache without colliding (each segment is built only once). This is very efficient and completely autonomous. Because the granularity of genome TrackCache building is on the 100kilobase segment size, the system has very good latency between when a user makes a request for a segment to be built and when a free TrackCacheBuilder worker can finish it's current segment and pick up the job request to build another segment.

Just like eHive workers are given a limited lifespan before they "die" and are reborn. This also gives a layer of fault tollerence to the system (like eHive). If worker dies mid-build the segment is labeled as still mid-build with the workers process-ID, it is possible to identify the failure and reset the segment so that another worker can build it. The workers have failure code built-in so many fail-states are caught and the worker can reset the segment before needing to die. In addition all sorts of building stats are recorded for each segment. number of features, build time, worker processID, host machine name... Because of the design of the system it is very easy to have a cluster of computers running 100s of TrackCacheBuilder workers to enable high-degree of parallelization for TrackCacheBuilding. DataDownload is enabled once the requested segments have been built. This means that download can be enabled without requiring the track for the whole genome to have been build. This enhances the user-response experience.

This provides the complete flexibility of the ZENBU data processing and data pooling system.

Because the result of processing is stored in the ZDX TrackCache as ZENBU datamodel XML, it is read back out of the cache fully intact and this able to be reused for further ZENBU data processing and output. This enables the same track cache to download data into many different export formats (bed, gff, osctable). This allows for the web interface to provide much flexibility on data download and still use the same TrackCache/ZDX. And the same TrackCache/ZDX can be used for fast data query for the track visualization system and really enhances the user experience of using the genome browser.