Data Explorer Interface
The Data Explorer Interface (or DEX), available at http://fantom.gsc.riken.jp/zenbu/dex/, is used for searching the data content of the ZENBU system.
It is divided into four tabbed subsections.
Browsing available Views, Tracks, Annotations and Experiments
DEX and Metadata searching
Common to all subsections of DEX is the ability to search and refine the listings using the metadata search system. In addition, it is possible to filter results by the specific collaboration into which the configuration was saved or the Data Source was shared.
One of the primary systems in ZENBU is that of metadata and metadata searching.
Search is modeled on the Google/Yahoo approach of prefix-based multiple-keyword searching. ZENBU also provides additional logic elements to fine-tune one's queries.
- and : by default, a space separating keywords in a search is interpreted as an and operation. This operates in the same way as set intersection [1]
- or : combines queries in the same way as set union [2]
- not <keyword or (phrase)> : excludes any items which match the keyword or phrase from the results. For example, "not spliced" will return experiments which do not have the keyword spliced.
- ! : shorthand for not
- ( ) : nested parentheses are supported.
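The set semantics of these operators can be illustrated with a minimal Python sketch. The keyword index, the item ids and the `lookup` helper are hypothetical and are only meant to show how and, or and not map onto set operations with prefix matching; they do not reflect how ZENBU is implemented.

```python
# Toy keyword index (hypothetical data): keyword -> ids of matching items.
INDEX = {
    "rnaseq": {1, 2, 3},
    "hepg2": {2, 3, 4},
    "spliced": {3},
    "dnase": {5},
}
ALL_ITEMS = {1, 2, 3, 4, 5}


def lookup(prefix):
    """Return ids of items having at least one keyword starting with prefix."""
    hits = set()
    for keyword, ids in INDEX.items():
        if keyword.startswith(prefix.lower()):
            hits |= ids
    return hits


# "rnaseq hepg2": a space means "and", i.e. set intersection.
both = lookup("rnaseq") & lookup("hepg2")
# "rnaseq or dnase": "or" is set union.
either = lookup("rnaseq") | lookup("dnase")
# "rnaseq !spliced": "not" / "!" keeps only items outside the excluded set.
unspliced = lookup("rnaseq") & (ALL_ITEMS - lookup("spliced"))

print(both, either, unspliced)
```

Because matching is prefix-based in this sketch, `lookup("hep")` returns the same items as `lookup("hepg2")`.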
It is good practice to always give a good description when saving configurations (view/track/script) or uploading data. ZENBU performs automatic keyword extraction from all metadata providing a wealth of ways to search the system.
Here is a complex example that searches the ENCODE hg19 datasets in the experiment section of the Data Explorer interface.
encode hg19 and ((rnaseq hepg2 !spliced) or (dnase monocyte))
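To make the structure of such a query concrete, here is a small Python sketch of a recursive-descent evaluator that decides whether a single item, described by its set of keywords, satisfies a query. It is an illustration only: the function names are hypothetical, the precedence (and binding tighter than or) is an assumption, and malformed queries are not handled.

```python
import re


def tokenize(query):
    """Split a query into parentheses, '!' and lower-cased words."""
    return re.findall(r"[()!]|[^\s()!]+", query.lower())


def matches(query, keywords):
    """Return True if an item with the given keywords satisfies the query."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        result = parse_and()
        while peek() == "or":
            advance()
            right = parse_and()  # always parse, so tokens are consumed
            result = result or right
        return result

    def parse_and():
        result = parse_unary()
        # Anything other than "or", ")" or end of input continues the "and".
        while peek() not in (None, "or", ")"):
            if peek() == "and":
                advance()
            right = parse_unary()
            result = result and right
        return result

    def parse_unary():
        token = advance()
        if token in ("not", "!"):
            return not parse_unary()
        if token == "(":
            result = parse_or()
            advance()  # consume the closing ")"
            return result
        # Prefix match: the term hits any keyword that starts with it.
        return any(k.startswith(token) for k in keywords)

    return parse_or()


QUERY = "encode hg19 and ((rnaseq hepg2 !spliced) or (dnase monocyte))"
print(matches(QUERY, {"encode", "hg19", "rnaseq", "hepg2"}))
print(matches(QUERY, {"encode", "hg19", "rnaseq", "hepg2", "spliced"}))
```

Under these assumptions the first item satisfies the query through the RNAseq branch, while the second is rejected because it carries the keyword spliced and does not match the DNase branch.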
Other general comments to help users with searching:
- If a search is performed with too many terms, it may fail to return any results. This is the same behavior as Google or Yahoo.
- Keywords are generally extracted from free-form metadata such as names and descriptions, but also from controlled metadata such as the genome assembly, controlled vocabulary, and ontology metadata.
- For OSCtable files, keywords are extracted from the ParameterValue and ExperimentMetadata sections.
- Any metadata added into the system via the Metadata editing system becomes immediately available for searching.
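As a rough illustration of keyword extraction from free-form and controlled metadata, the following Python sketch lower-cases every metadata value and splits it into alphanumeric tokens. The tokenization rule, the function name and the sample record are assumptions made for this example; the actual extraction rules used by ZENBU are not described here.

```python
import re


def extract_keywords(metadata):
    """Collect lower-cased alphanumeric keywords from every metadata value."""
    keywords = set()
    for value in metadata.values():
        keywords.update(re.findall(r"[a-z0-9]+", value.lower()))
    return keywords


# Hypothetical data source record.
source = {
    "name": "HepG2 RNAseq rep1",
    "description": "Unspliced reads from HepG2 cells",
    "assembly": "hg19",
}

keywords = extract_keywords(source)
print(sorted(keywords))
```

In this sketch, adding a new entry to `source` and calling `extract_keywords` again makes its terms searchable right away, which mirrors the behavior described for metadata editing.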
Refining searches by Collaboration
The data loaded in ZENBU can be shared in the context of collaborations.
By default, all views, tracks and primary data sources shared in collaborations of which the user is a member are searched.
A dropdown menu located on the right side of the keyword search box allows searches to be refined further or, in the absence of search keywords, displays the entire content related to a specific collaboration.
DEX is also the interface with which the owner of a View, Track or Script can change its associated collaboration (for example, moving it from "private" to sharing it with a collaboration).
Metadata facet browsing of Data Sources
Within the Data Source subsection of DEX, there is an additional metadata search capability provided by a "facet browser" similar to that provided by modENCODE (http://data.modencode.org). The main advantage of the ZENBU metadata facet browser is that all user-defined metadata is available and the system reconfigures itself automatically when users add new metadata types or values.
When the user first opens the DEX Data Source subsection, the metadata facet browser is not active. It can be activated either by making a search query or by clicking the "load metadata facet browser" button.
When one makes a search query, the facet browser refines itself to show only the metadata related to that query, allowing the user to narrow the search to the specific metadata of interest. For example, searching for H3K27AC returns ENCODE data specific to that antibody as well as some data sources which refer to this antibody in their description. In this case the general metadata search finds 98 data sources with "H3K27AC" within their metadata.
With the facet browser, however, one can refine the results to specifically those data sources tagged with enc:antibody=h3k27ac. Clicking on the tag "enc:antibody" or the value "h3k27ac" in the facet browser refines the search to those specific 34 data sources. Following the refinement, the user can perform additional search queries or use the facet browser to refine the results even further.
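The difference between a general keyword search and a facet refinement can be sketched in Python as follows. The three data sources, the "enc:cell" tag and all function names are hypothetical; only the "enc:antibody" tag and its "h3k27ac" value are taken from the example above.

```python
from collections import Counter

# Hypothetical data sources with tag/value metadata.
SOURCES = [
    {"name": "src1", "metadata": {"enc:antibody": "h3k27ac", "enc:cell": "hepg2"}},
    {"name": "src2", "metadata": {"enc:antibody": "h3k4me3", "enc:cell": "hepg2"}},
    {"name": "src3", "metadata": {"description": "input control for h3k27ac chip"}},
]


def search(sources, keyword):
    """General metadata search: keyword appears in any metadata value."""
    keyword = keyword.lower()
    return [s for s in sources
            if any(keyword in v.lower() for v in s["metadata"].values())]


def facets(sources):
    """Count how many sources carry each (tag, value) pair."""
    counts = Counter()
    for source in sources:
        for pair in source["metadata"].items():
            counts[pair] += 1
    return counts


def refine(sources, tag, value):
    """Keep only sources explicitly tagged with tag=value."""
    return [s for s in sources if s["metadata"].get(tag) == value]


hits = search(SOURCES, "H3K27AC")           # matches the tag and descriptions
refined = refine(hits, "enc:antibody", "h3k27ac")
print(len(hits), len(refined), facets(hits))
```

Recomputing `facets` on the current hits rather than on all sources is what lets the facet list follow the search, and new tags appear in the counts without any configuration.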
Finding Views through the View tab
A View is an ordered collection of Tracks that can be associated with metadata to facilitate its retrieval, shared in the context of a collaboration, linked to, and so on. In essence, Views can be thought of as interactive figures of a publication.
Because users can save many views at many locations, there are many views in the system. And since new views can constantly be added, there is a need for a dynamic, searchable interface to these views. All the user-adjustable parameters of the ZENBU genome browser (such as the hidden/shown status of tracks, the width of the displayed glyph, ...) are also preserved in the View configuration. Views can also override any of the default visualization parameters of the Tracks they contain.
This section of the Data Explorer allows users to search for views and to page through the results of those searches.
Once the correct view is found, it can then be launched into the ZENBU genome browser by clicking the view button.
When a View configuration is saved, its Title and Description are searchable in the Data Explorer via the metadata searching system.
The metadata associated with a view can be displayed by clicking on the view name.
Creating Views from selected Tracks
Views are ordered collections of Tracks with a Title, a Description and all the other user-adjustable parameters of the ZENBU genome browser.
They can be created by selecting from the list of Tracks already available in ZENBU.
Tracks' Titles and Descriptions are searchable in the Data Explorer (DEX) via the metadata searching system.
DEX allows the search to be narrowed by collaboration.
Note that the order in which tracks are displayed mirrors the order in which they were selected.
Creating Tracks from primary data sources
Tracks are composed of:
- Data sources: a collection of one or more data sources which are pooled together to form a dynamic merged data source.
- Data processing : an optional script of data signal processing modules which manipulates and analyzes the data. This processing can be either for data visualization purposes or for data exporting and offline analysis.
- Visualization parameters : a set of default visualization parameters which can be used by visualization systems like our ZENBU genome browser when the track is loaded into a View.
Tracks' Titles and Descriptions are searchable in the Data Explorer (DEX) via the metadata searching system. Track configurations can be saved and shared in a collaboration or kept private. DEX allows the search to be narrowed by collaboration.
The ZENBU system allows for the dynamic creation of merged virtual Data Sources referred to as "data stream pools". This provides a great deal of flexibility in terms of both data loading and data processing. With data pooling, there is no need to load new data every time a different "mix" is needed when configuring ZENBU tracks. One can simply use the data already loaded in the ZENBU system and create a new virtual DataSource mix.
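The idea of a virtual merged source can be sketched in Python with a lazy merge over already loaded sources. This is only an analogy: the `pool` function, the feature tuples and the assumption that each source is sorted by position are hypothetical and do not describe ZENBU internals.

```python
import heapq


def pool(*sources):
    """Lazily merge position-sorted sources into one sorted stream."""
    return heapq.merge(*sources, key=lambda feature: feature[0])


# Hypothetical loaded sources: (position, source name), sorted by position.
rep1 = [(100, "rep1"), (250, "rep1")]
rep2 = [(120, "rep2"), (300, "rep2")]
rep3 = [(90, "rep3")]

# Two different "mixes" built from the same loaded data, with no reloading.
mix_a = list(pool(rep1, rep2))
mix_b = list(pool(rep1, rep3))
print(mix_a)
print(mix_b)
```

Each mix is just a new view over existing data, so creating another combination costs nothing in terms of data loading.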
DataSource tab
The DataSource tab lists the following useful information about primary data which has been loaded into the ZENBU system. This section also allows control of DataSource selection to create virtual data pools and to allow the user to build custom tracks with processing and visualization.
- row : row numbering is particularly convenient when paging through large collections of experiments
- select : tick box to select/unselect annotation(s) from the current data collection to be pooled in a single track
- source name : the name of the data source. The name is automatically parsed to allow for searches in DEX or the ZENBU genome browser
- genome : the genome the annotations / data derive from
- platform : the experimental platform used to provide quantitative measures if provided in the DataSource metadata
- cell line / tissue : the cell line or tissue the experiment / quantitative measures derive from when available / relevant. Keywords are automatically extracted to allow for searches in DEX or the ZENBU genome browser
- timepoint : the timepoint the experiment / quantitative measures derive from when part of a time series analysis. Stored as "eedb:series_name" and "eedb:series_point" metadata on the DataSource.
- treatment : the experimental treatment the cell, cell line or tissue underwent when available / relevant. Stored as "experimental_condition" or "eedb:treatment" metadata on the DataSource.
- description : a description of the data. Keywords are automatically extracted to allow for searches in DEX or the ZENBU genome browser
- source type : class of the DataSource: either Experiment or FeatureSource.
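The way the timepoint and treatment columns relate to the metadata keys listed above can be sketched in Python. The function names, the order in which the two treatment keys are tried, the way the two series fields are joined and the sample values are all assumptions made for illustration.

```python
def timepoint_column(metadata):
    """Join the series name and series point when present, else None."""
    parts = [metadata[key]
             for key in ("eedb:series_name", "eedb:series_point")
             if key in metadata]
    return " ".join(parts) if parts else None


def treatment_column(metadata):
    """Return the first treatment value found, or None."""
    for key in ("experimental_condition", "eedb:treatment"):
        if key in metadata:
            return metadata[key]
    return None


# Hypothetical DataSource metadata.
meta = {
    "eedb:series_name": "LPS response",
    "eedb:series_point": "2h",
    "eedb:treatment": "LPS",
}

print(timepoint_column(meta), "|", treatment_column(meta))
```

A DataSource without any of these keys simply yields empty columns in this sketch, which matches the "when available / relevant" wording of the list.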
Shopping cart control of primary data pooling
In the upper right corner is a "shopping cart" style control panel that displays the number of currently selected sources being pooled into the track in the making and the tracks that have already been made, and allows a view to be created from the collection of tracks.
Views and Tracks are generated by moving through the following sequence of states:
- No DataSources are selected and the 'cart' is empty. This is the starting point.
- Search within the DataSources and select the ones you are interested in. Once DataSources have been selected, the user is ready to configure a new track.
- All tracks have been built or selected and the user is ready to make the view.
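The sequence of cart states described above can be modeled with a small Python class. The class, its method names and the rule that the source selection is cleared once a track is built are hypothetical and only serve to make the workflow explicit.

```python
class Cart:
    """Hypothetical model of the DEX cart; not ZENBU's actual code."""

    def __init__(self):
        self.sources = []  # DataSources selected for the track in the making
        self.tracks = []   # tracks built or selected so far, in order

    @property
    def state(self):
        if self.sources:
            return "sources selected"
        if self.tracks:
            return "tracks ready"
        return "empty"

    def select_source(self, name):
        if name not in self.sources:
            self.sources.append(name)

    def build_track(self, title):
        if not self.sources:
            raise ValueError("select at least one DataSource first")
        self.tracks.append({"title": title, "sources": list(self.sources)})
        self.sources.clear()  # assumption: the selection resets after a build

    def make_view(self, title):
        if not self.tracks:
            raise ValueError("build or select at least one track first")
        return {"title": title, "tracks": [t["title"] for t in self.tracks]}


cart = Cart()
states = [cart.state]
cart.select_source("HepG2 RNAseq rep1")
cart.select_source("HepG2 RNAseq rep2")
states.append(cart.state)
cart.build_track("HepG2 RNAseq pooled")
states.append(cart.state)
view = cart.make_view("HepG2 overview")
print(states, view)
```

Keeping the tracks in a list preserves the order in which they were built, so the resulting view lists them in selection order.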
The interface for creating a new custom track is identical to the New Track Configuration panel in the Genome Browser interface with the exception that the DataSource selection section does not allow for editing.
Building tracks from available Scripts in DEX
DEX also enables browsing through available scripts and building tracks with a single button click.
Script searching can be performed using the search box and/or refined by collaboration.
The DEX script tab lists for each script:
- [build track] button which becomes active once there are data sources added into the cart
- A concise title
- A description, detailing the purpose, intended input and output of the script
- The date it was created and its owner
- The number of times the script has been used
Once the desired script is found, the data sources in the cart can be built into the track specified by the processing and visualization parameters of the script. Simply click the build track button of the script and ZENBU will handle all the track building steps for you.
This view also enables users to modify the metadata or shared collaboration of the scripts they have created via DEX by clicking on the script's name or the edit button.
Script editing can be done either from scratch or by modifying an existing script.
Changing collaborations and editing metadata from DEX
Details about how to edit metadata can be found in the metadata editing section.
In short, metadata editing from DEX is available either:
By clicking on the name/title of the View, Track, Annotation, Experiment or Script
- Views' metadata lookup example
(in this example, editing is not allowed since we are not its creator)
Or
Using the dedicated edit button located in the column listing the data ownership & date created/uploaded
Note that only the owner of a View, Track, Annotation, Experiment or Script can freely edit its metadata.
- Scripts' metadata editing example
(in this example, being the creator of the script allows us to edit its metadata)
DEX is also the interface with which the owner of a View, Track or Script can alter the associated collaboration (for example, moving it from "private" to sharing it with a collaboration).
Edit your own copy of a View, Track or Script
It is always possible, and very easy, to make a copy of any View, Track or Script via the ZENBU genome browser interface.
Since this copy is under your name, you can then freely edit its metadata as you see fit.