DemultiplexSource

From ZENBU documentation wiki

Description

The DemultiplexSource processing module de-multiplexes barcoded data such as 10x Genomics single-cell data. The demultiplex key can be a cellID, a unique-molecule-ID, or any future multiplexing barcode. At runtime the module creates sub-datasources within the primary DataSource, effectively demultiplexing the barcodes from either BAM files or custom processed files (such as single-cell CAGE CTSS BED files). This makes loading single-cell data much easier and more efficient, because the cells do not have to be demultiplexed before loading, a step that would generate tens of thousands of extra files.

Parameters

  • <source_mode> : defines which data-source the demultiplexing is applied to. Possible values are:
    • featuresource : create sub-sources on the primary FeatureSource
    • experiment : create demux sub-sources on the corresponding Experiment
  • <demux_mdkeys> : the metadata key in the Feature metadata that encodes the demultiplexing ID.
  • <side_linking_mdkey> : when merging demux metadata from the side_stream, the metadata column (type) in the side stream that is used for linking to demux_mdkeys.
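
As an illustration of the parameters above, the fragment below sketches a minimal configuration that uses the featuresource mode. It assumes that the side_stream and side_linking_mdkey are optional when no external metadata needs to be merged, and it reuses eedb:name as the demux key only because the data in the example on this page stores the cellID in the name column. Treat it as a sketch under those assumptions, not a verified script.

```xml
<zenbu_script>
	<stream_processing>
		<!-- Assumption: with no side_stream, sub-sources are created without merged metadata -->
		<spstream module="DemultiplexSource">
			<!-- Create the demux sub-sources on the primary FeatureSource -->
			<source_mode>featuresource</source_mode>
			<!-- Metadata key that carries the demultiplexing ID (here the feature name) -->
			<demux_mdkeys>eedb:name</demux_mdkeys>
		</spstream>
	</stream_processing>
</zenbu_script>
```

Switching source_mode to experiment would instead create the sub-sources on the corresponding Experiment, as in the full example below.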


Example

This script combines DemultiplexSource with FeatureEmitter / TemplateCluster to process single-cell CAGE data as a signal-histogram visualization. In this example the single-cell CAGE data was loaded as a BED6 CTSS file in which the bed.name column holds the cellID and the score column holds the CTSS count. Dynamic metadata is linked to the newly demuxed sub-sources through a side_stream that reads an uploaded metadata file: the file's CellID column (side_linking_mdkey) is matched against the demux_mdkeys value eedb:name.

<zenbu_script>
	<datastream name="cell_mdata" output="full_feature">
		<source id="B9DECA55-F95C-447E-818E-BB2B25291DBF::1:::FeatureSource"/>
	</datastream>
	<stream_processing>
		<spstream module="DemultiplexSource">
			<source_mode>experiment</source_mode>
			<demux_mdkeys>eedb:name</demux_mdkeys>
			<side_linking_mdkey>CellID</side_linking_mdkey>
			<side_stream>
				<spstream module="Proxy" name="cell_mdata"/>
			</side_stream>
		</spstream>
		<spstream module="MetadataFilter">
			<mdata_mode>experiment</mdata_mode>
			<inverse>false</inverse>
			<mdata type="Cluster"/>
		</spstream>
		<spstream module="TemplateCluster">
			<overlap_mode>5end</overlap_mode>
			<expression_mode>sum</expression_mode>
			<ignore_strand>false</ignore_strand>
			<overlap_subfeatures>false</overlap_subfeatures>
			<side_stream>
				<spstream module="FeatureEmitter">
					<fixed_grid>true</fixed_grid>
					<both_strands>true</both_strands>
				</spstream>
			</side_stream>
		</spstream>
	</stream_processing>
</zenbu_script>