PICASSO 3-color exampleο
This tutorial documents the interactive script
user_scripts/unmix_picasso_3color_example.py.
It is the most direct comparison point between:
the original 3-channel MATLAB PICASSO workflow,
the explicit N-channel generalization,
and the source-sink formulation.
The PICASSO-family methods documented here are motivated by the original PICASSO paper:
Seo, J., Sim, Y., Kim, J. et al. PICASSO allows ultra-multiplexed fluorescence imaging of spatially overlapping proteins without reference spectra measurements. Nature Communications 13, 2475 (2022). https://doi.org/10.1038/s41467-022-30168-z
How to use this tutorialο
The script is designed for interactive execution.
The recommended workflow is:
open
user_scripts/unmix_picasso_3color_example.py,inspect the measured channels first,
compare the three different blind-unmixing implementations on the same stack.
The subsections below follow the same order as the script.
Importsο
from __future__ import annotations
from pathlib import Path
PROJECT_ROOT = Path(__file__).resolve().parents[1]
from spectral_unmixing import report_path_from_output_path, unmix_picasso
from spectral_unmixing.viewer import show_all_channels_in_napari
Define input and output pathsο
INPUT_PATH = (PROJECT_ROOT / "example_data" / "PICASSO_examples" / "3_color_data.tif")
#INPUT_PATH = (PROJECT_ROOT / "example_data" / "PICASSO_examples" / "m1_e0_GFAPgreenDRAQmagenta.tif")
#INPUT_PATH = (PROJECT_ROOT / "example_data" / "PICASSO_examples" / "picasso_mouse_spinalcord_small.tif")
#INPUT_PATH = (PROJECT_ROOT / "example_data" / "PICASSO_examples" / "GFAP_sink_LMNB1_source.tif")
INPUT_NAME = INPUT_PATH.stem
OUTPUT_DIR = INPUT_PATH.parent / "unmixed"
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
Inspect the measured channelsο
show_all_channels_in_napari(INPUT_PATH, layer_prefix="3-color simulation")
Before any unmixing is applied, the script opens the measured stack in napari. This is useful for confirming channel order and for building intuition about which channels appear to contaminate which others.
matlab_n blind unmixingο
OUTPUT_PICASSO_MATLAB_N = OUTPUT_DIR / f"{INPUT_NAME}_picasso_matlab_n.tif"
picasso_matlab_n_output = unmix_picasso(
input_path=INPUT_PATH,
output_path=OUTPUT_PICASSO_MATLAB_N,
channels=[0, 1, 2],
# method="picasso", # default
implementation="matlab_n", # "matlab_3c" or "matlab_n" or "source_sink_n"
# alpha_mode="reference_t",
# alpha_reference_t=0,
background_percentile=1.0,
# preprocess_alpha_inputs=True, # recorded for compatibility
mi_bins=64,
alpha_max=1.0,
max_iter=50,
tolerance=1e-4,
max_alpha_voxels=250_000,
step_size=0.2,
qn=100,
pixel_bin_size=16,
alpha_clip=0.5,
# negativity_threshold=0.0009,
# clip_every_n_iterations=50,
random_state=42,
clip_negative=True,
output_dtype="float32",
verbose=True,
)
print(picasso_matlab_n_output)
print(report_path_from_output_path(picasso_matlab_n_output).read_text(encoding="utf-8"))
show_all_channels_in_napari(picasso_matlab_n_output, layer_prefix="PICASSO MATLAB-N unmixed 3-color simulation")
This method keeps the MATLAB-style iterative logic, but uses the generalized
matlab_n implementation rather than the strict 3-channel port.
The settings that matter most are:
channels: selects which measured channels participate in the blind-unmixing run.implementation="matlab_n": chooses the generalized MATLAB-style path rather than the strict three-channel port.background_percentile: low-percentile background estimate used during preprocessing before the update sequence is estimated.preprocess_alpha_inputs: enables or disables that shared preprocessing step.mi_bins: retained in the shared API and JSON report for compatibility, but not used by the MATLAB-like implementation itself.alpha_max: likewise retained in the shared API and JSON report, but not used directly by the MATLAB-like implementation.max_iter: number of iterative updates. More iterations can unmix more strongly but may also become less stable.tolerance: convergence threshold for the iterative update sequence.max_alpha_voxels: optional voxel cap for large stacks. Lower values speed up the run; higher values use more of the measured image data.step_size: strength of each update step. Larger values are more aggressive; smaller values are more conservative.qN: quantization parameter for the MATLAB-style mutual-information estimate.pixel_bin_size: spatial binning factor before the mutual-information calculation. Larger values smooth more strongly; smaller values preserve more detail.alpha_clip: clipping bound for pairwise coefficients. Larger values allow stronger pairwise subtraction.optional
negativity_thresholdandclip_every_n_iterations: control how intermediate negativity is monitored and how often positivity enforcement is applied.random_state: random seed used whenever voxel subsampling is needed.clip_negative: clips final negative values in the written output to zero.output_dtype: controls the saved data type of the unmixed stack.verbose: enables terminal progress output during the run.alpha_modeandalpha_reference_t: become relevant for real multi-time-point stacks.reference_testimates one update sequence from one chosen time point;per_testimates one sequence per time point.
This is the best choice when you want MATLAB-like behavior but also want the same conceptual workflow to scale to larger channel counts.
Composite view of the unmixed three-channel stack after the MATLAB-N blind-unmixing run (top). The MATLAB-N implementation has successfully removed most of the cross-talk between the channels, which are now cleanly separated.ο
matlab_3c blind unmixingο
OUTPUT_PICASSO_MATLAB_3C = OUTPUT_DIR / f"{INPUT_NAME}_picasso_matlab_3c.tif"
picasso_matlab_3c_output = unmix_picasso(
input_path=INPUT_PATH,
output_path=OUTPUT_PICASSO_MATLAB_3C,
channels=[0, 1, 2],
# method="picasso", # default
implementation="matlab_3c", # "matlab_3c" or "matlab_n" or "source_sink_n"
# alpha_mode="reference_t",
# alpha_reference_t=0,
background_percentile=1.0,
# preprocess_alpha_inputs=True, # recorded for compatibility
mi_bins=64,
alpha_max=1.0,
max_iter=50,
tolerance=1e-4,
max_alpha_voxels=250_000,
step_size=0.2,
qn=100,
pixel_bin_size=16,
alpha_clip=0.5,
# negativity_threshold=0.0009,
# clip_every_n_iterations=50,
random_state=42,
clip_negative=True,
output_dtype="float32",
verbose=True,
)
print(picasso_matlab_3c_output)
print(report_path_from_output_path(picasso_matlab_3c_output).read_text(encoding="utf-8"))
show_all_channels_in_napari(picasso_matlab_3c_output, layer_prefix="PICASSO MATLAB-3C unmixed 3-color simulation")
This is the closest available Python analogue of the original MATLAB three-channel workflow.
Use this mode when exact three-channel comparison to the classic PICASSO logic is more important than having a unified N-channel code path.
The main settings remain the MATLAB-style iteration parameters:
channels: same role as above, but here exactly three selected channels are required.implementation="matlab_3c"background_percentilepreprocess_alpha_inputsmi_binsandalpha_max: same shared API parameters as above; recorded but not used directly by the MATLAB-like implementationmax_itertolerancemax_alpha_voxelsstep_sizeqNpixel_bin_sizealpha_clipoptional
negativity_thresholdandclip_every_n_iterationsrandom_stateclip_negativeoutput_dtypeverbosealpha_modeandalpha_reference_tfor multi-time-point stacks: same behavior as in thematlab_nexample above
Also the MATLAB-3C implementation has successfully removed most of the cross-talk between the channels, which are now cleanly separated. The results are very similar to the MATLAB-N run above. In fact, the latter is a generalized version of the former, so the results are expected to be nearly identical.ο
source_sink_n blind unmixingο
OUTPUT_PICASSO_SOURCE_SINK = OUTPUT_DIR / f"{INPUT_NAME}_picasso_source_sink.tif"
sink_channels = [0,1,2]
neutral_channels = []
# If you want to ignore the possible ``channel 2 -> channel 1`` contribution
# and model only the clearly suspected ``channel 0 -> channel 1`` case, make
# channel 2 neutral instead:
#
# neutral_channels = [2]
picasso_source_sink_output = unmix_picasso(
input_path=INPUT_PATH,
output_path=OUTPUT_PICASSO_SOURCE_SINK,
channels=[0, 1, 2],
# method="picasso", # default
implementation="source_sink_n",
# alpha_mode="reference_t",
# alpha_reference_t=0,
sink_channels=sink_channels,
neutral_channels=neutral_channels,
# source_sink_matrix=[[1, 0, 0], [-1, 1, -1], [0, 0, 1]],
background_percentile=1.0,
#preprocess_alpha_inputs=True, # recorded for compatibility
mi_bins=64,
alpha_max=1.0,
max_iter=50,
tolerance=1e-4,
max_alpha_voxels=250_000,
source_sink_optimize_background=True,
source_sink_max_background=0.2,
source_sink_n_restarts=6,
source_sink_joint_optimization=True,
random_state=32,
clip_negative=True,
output_dtype="float32",
verbose=True,
)
print(picasso_source_sink_output)
print(report_path_from_output_path(picasso_source_sink_output).read_text(encoding="utf-8"))
show_all_channels_in_napari(picasso_source_sink_output, layer_prefix="PICASSO source-sink-N unmixed 3-color simulation")
This variant uses an explicit source-sink description of the expected bleed- through graph.
In the current implementation, all sources contributing to one sink are optimized jointly by default, and the workflow can additionally estimate a small background offset for each modeled source-sink relation. This brings the method closer to the source-sink formulation used by the napari PICASSO plugin, while still relying on histogram-based mutual information rather than the pluginβs neural MINE estimator.
The most relevant settings are:
channels: as above, selects which measured channels participate in the explicit source-sink run.implementation="source_sink_n": chooses the source-sink PICASSO-family workflow rather than the MATLAB-like implementations.sink_channels: defines which channels should be corrected as sinks.neutral_channels: defines which channels should remain untouched and not act as sinks.optional
source_sink_matrix: gives explicit manual control over the allowed bleed-through graph.background_percentile: same role as above, but now used in source-sink coefficient estimation.preprocess_alpha_inputs: same shared preprocessing switch as above.alpha_max: now actively used as the upper bound for source-to-sink coefficients.mi_bins: now actively used as the histogram resolution for the mutual-information objective.max_iter: controls the maximum number of optimizer iterations for each sink.tolerance: stopping tolerance for the numerical optimizer.max_alpha_voxels: same optional voxel cap as above.source_sink_optimize_background: if enabled, estimate one small background offsetbetaper modeled source-sink relation before subtracting the source contribution.source_sink_max_background: upper bound for these optimized background offsets on normalized intensities.source_sink_joint_optimization: if enabled, optimize all sources contributing to the same sink together instead of fitting them greedily one after another.source_sink_n_restarts: number of multi-start optimizer initializations used for each sink.random_state: same random seed used for optional voxel subsampling.clip_negative: same final clipping behavior as above.output_dtype: same output dtype control as above.verbose: same terminal verbosity control as above.alpha_modeandalpha_reference_t: same time-axis controls as above for real multi-time-point stacks.
This is often the easiest mode to reason about biologically, because the user
can describe which channels should actually be cleaned and which channels
should remain untouched or act only as sources. In the specific example script,
channel 1 is treated as the sink, while channel 0 and channel 2
are allowed to act as sources by default. If only the
channel 0 \rightarrow channel 1 relation should be modeled, channel 2
can be moved into neutral_channels.
By setting the sink-channel equal to all channels, the source-sink-N implementation has
successfully removed most of the cross-talk. This is just because we, by doing so, have allowed
the algorithm to model all possible pairwise relations. The results are very similar to the
MATLAB-N and MATLAB-3C runs above. However, the power of the source-sink-N implementation is
its ability to restrict the modeled relations to only those that are biologically relevant.
For highly multiplexed experiments, allowing all-to-all relations here may lead to
less satisfactory results and other methods such as matlab_n should be preferred.ο