Breast WTA HistoSeg + LazySlide image features#

Overview#

This tutorial documents the upgraded RNA + HistoSeg structure + H&E image workflow for the Atera breast WTA Xenium sample. Three packages contribute: HistoSeg provides tissue structure contours; LazySlide provides WSI tile embeddings and, when the selected model supports text embeddings, vision-language prompt-similarity scores; pyXenium.multimodal aggregates those outputs into structure-level image/RNA association tables.

The pyXenium side is intentionally a thin integration layer. It does not run HistoSeg segmentation and it does not vendor LazySlide image-model code.

Biological question#

For each HistoSeg structure in the breast WTA sample:

Which foundation-model image features distinguish this structure from the other structures?
Which PLIP/CONCH/OmiCLIP prompt terms are enriched in the tiles assigned to the structure?
Do structure-level H&E features align with Xenium RNA programs and boundary hypotheses?

Workflow boundary#

Package	Responsibility
HistoSeg	Structure segmentation, contour/ROI GeoJSON, mask QC
LazySlide	WSI opening, H&E tiling, pathology foundation model feature extraction, optional vision-language prompt scoring, spatial tile domains
pyXenium	Xenium/H&E alignment, tile-to-structure assignment, structure aggregation, RNA/image association
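
As a rough illustration of the tile-to-structure assignment step listed for pyXenium, the sketch below assigns each tile to the first contour polygon that contains the tile center, using a plain ray-casting test. The function names, the center-point rule, and the toy coordinates are assumptions made for illustration; pyXenium's actual assignment logic is not shown on this page and may differ (for example, it could use overlap area).

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the ring is closed implicitly.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_tiles(tile_centers, contours):
    """Map tile id -> label of the first contour containing the tile center.

    Tiles whose center falls outside every contour map to None.
    """
    assignments = {}
    for tile_id, (cx, cy) in tile_centers.items():
        assignments[tile_id] = None
        for label, polygon in contours:
            if point_in_polygon(cx, cy, polygon):
                assignments[tile_id] = label
                break
    return assignments


# Toy data (hypothetical): two square contours and three tile centers.
contours = [
    ("S1", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ("S2", [(20, 0), (30, 0), (30, 10), (20, 10)]),
]
tile_centers = {"t0": (5, 5), "t1": (25, 5), "t2": (15, 5)}
assignments = assign_tiles(tile_centers, contours)
```

In this toy case `t2` lies between the two squares and stays unassigned, which is the kind of tile that would account for a gap between the tile count and the assigned-tile count in a run summary.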

Python API#

from pyXenium.multimodal import run_histoseg_lazyslide_structure_workflow

result = run_histoseg_lazyslide_structure_workflow(
    "/path/to/WTA_Preview_FFPE_Breast_Cancer_outs",  # Xenium dataset directory
    contour_geojson="/path/to/xenium_explorer_annotations.s1_s5.generated.geojson",  # HistoSeg contours
    contour_key="histoseg_structures",
    output_dir="/path/to/a100/run",
    he_source_path="/path/to/WTA_Preview_FFPE_Breast_Cancer_he_image.tiffslide_pyramid.tif",  # prepared H&E pyramid
    wsi_reader="tiffslide",
    model="plip",       # LazySlide image model used for tile embeddings
    text_model="plip",  # used only for prompt scoring; "none" disables it
    tile_px=224,
    mpp=0.5,
    device="cuda",
    batch_size=64,
    table_format="parquet",
)

model is the LazySlide image/foundation model used for tile embeddings. It can be any model supported by the local LazySlide installation, for example plip, conch, uni, uni2, gigapath, virchow, or related model-zoo entries.

text_model is a separate setting that is used only for tile-level prompt scoring. It must share the same image-text latent space as the image model. In practice, text_model="plip" is valid with model="plip" and text_model="conch" is valid with model="conch". Use text_model="none" to disable prompt scoring explicitly.

Vision-only encoders such as UNI, Virchow, or GigaPath produce embeddings and spatial domains but do not automatically assign pathology names.
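
The pairing rule between model and text_model can be sketched as a small pre-flight check. This is a hypothetical helper, not part of pyXenium or LazySlide. It simplifies "shares the same image-text latent space" to "has the same name", and the set of vision-language models is an assumption limited to the two pairs named above.

```python
# Assumption: models treated as having a shared image-text space in this sketch.
VISION_LANGUAGE_MODELS = {"plip", "conch"}


def resolve_text_model(model, text_model):
    """Return the text model to use for prompt scoring, or None if disabled.

    Hypothetical helper for illustration only.
    """
    model = model.lower()
    text_model = text_model.lower()
    if text_model == "none":
        return None  # prompt scoring explicitly disabled
    if model not in VISION_LANGUAGE_MODELS:
        raise ValueError(
            f"{model!r} is treated as vision-only here; use text_model='none'"
        )
    if text_model != model:
        raise ValueError(
            f"text_model {text_model!r} does not share a latent space with {model!r}"
        )
    return text_model
```

Running a check like this before a long GPU job avoids discovering a mismatched pair only after tile embeddings have been computed.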

The workflow writes:

image_contours.parquet
tile_features.parquet
tile_assignments.parquet
structure_image_features.parquet
structure_differential_features.parquet
structure_rna_summary.parquet
structure_program_scores.parquet
contour_multimodal_summary.parquet
contour_image_molecular_associations.parquet
wta_pathway_partial_correlations.parquet
molecular_prediction_benchmark.parquet
morphomolecular_hero_targets.parquet
morphomolecular_hero_contours.parquet
rna_image_associations.parquet
program_image_associations.parquet
run_manifest.json
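
A quick way to confirm that a run finished is to check for the files listed above. The sketch below assumes the tables are written directly into output_dir and that table_format sets the file extension; both are assumptions about the layout, and missing_outputs is a hypothetical helper rather than a pyXenium function.

```python
from pathlib import Path

# Table stems copied from the output list above; run_manifest.json is added separately.
TABLE_STEMS = [
    "image_contours",
    "tile_features",
    "tile_assignments",
    "structure_image_features",
    "structure_differential_features",
    "structure_rna_summary",
    "structure_program_scores",
    "contour_multimodal_summary",
    "contour_image_molecular_associations",
    "wta_pathway_partial_correlations",
    "molecular_prediction_benchmark",
    "morphomolecular_hero_targets",
    "morphomolecular_hero_contours",
    "rna_image_associations",
    "program_image_associations",
]


def missing_outputs(output_dir, table_format="parquet"):
    """Return the expected output file names that are absent from output_dir."""
    out = Path(output_dir)
    expected = [f"{stem}.{table_format}" for stem in TABLE_STEMS]
    expected.append("run_manifest.json")
    return [name for name in expected if not (out / name).is_file()]
```

An empty list from `missing_outputs("/path/to/a100/run")` would indicate that every expected file is present under the assumed layout.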

A100 run#

Use the A100 runner in benchmarking/lazyslide_a100/:

export PYXENIUM_REPO=/path/to/pyXenium
export A100_ENV_DIR=/path/to/envs/pyxenium-lazyslide
bash benchmarking/lazyslide_a100/scripts/bootstrap_a100_env.sh

export PYXENIUM_ATERA_DATASET=/path/to/WTA_Preview_FFPE_Breast_Cancer_outs
export HISTOSEG_GEOJSON=/path/to/xenium_explorer_annotations.s1_s5.generated.geojson
export A100_OUTPUT_DIR=/path/to/runs/histoseg_lazyslide_breast_wta_plip
export LAZYSLIDE_MODEL=plip
export LAZYSLIDE_TEXT_MODEL=plip

bash benchmarking/lazyslide_a100/scripts/run_a100_histoseg_lazyslide.sh \
  --max-tiles 2000

WSI preparation#

The original Atera breast WTA OME-TIFF stores RGB as planar SYX. tifffile can see the internal pyramid levels, but WSIData/open_wsi defaults to OpenSlide, which recognizes only one level for this file layout. As a result, the first direct WSI attempt treated the file as a single-level 17 GB image.

The fix is to rewrite the H&E image once into a tiffslide-readable tiled pyramidal BigTIFF with interleaved YXS RGB:

PYTHONPATH=src \
/data/taobo.hu/pyxenium_lazyslide_breast_wta_20260507/envs/plip-patch/bin/python \
  benchmarking/lazyslide_a100/scripts/prepare_tiffslide_pyramid.py \
  --input /data/taobo.hu/pyxenium_lazyslide_breast_wta_20260507/data/WTA_Preview_FFPE_Breast_Cancer_he_image.ome.tif \
  --output /data/taobo.hu/pyxenium_lazyslide_breast_wta_20260507/data/WTA_Preview_FFPE_Breast_Cancer_he_image.tiffslide_pyramid.tif \
  --tile-px 512 \
  --jpeg-quality 90 \
  --verify

The prepared WSI validates with 10 pyramid levels and an MPP of 0.2738 µm/pixel. The validation result is recorded in prepared_wsi_manifest.json.
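
To relate the reported MPP to the tiling parameters in the Python API example (tile_px=224, mpp=0.5), the arithmetic can be sketched as follows. This assumes that mpp is the requested tile resolution in microns per pixel and that 0.2738 is the level-0 MPP of the prepared WSI; how LazySlide actually selects a level and resamples is not covered here.

```python
def tile_footprint(tile_px, target_mpp, base_mpp):
    """Return (tile side in microns, tile side in level-0 pixels)."""
    side_um = tile_px * target_mpp      # physical width covered by one tile
    side_base_px = side_um / base_mpp   # same width expressed in level-0 pixels
    return side_um, side_base_px


side_um, side_base_px = tile_footprint(tile_px=224, target_mpp=0.5, base_mpp=0.2738)
```

Under these assumptions each 224-pixel tile spans 112 µm of tissue, which corresponds to roughly 409 pixels at the base level.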

The full direct LazySlide PLIP command used for this RTD snapshot was:

CUDA_VISIBLE_DEVICES=7 \
/data/taobo.hu/pyxenium_lazyslide_breast_wta_20260507/envs/plip-patch/bin/python \
  benchmarking/lazyslide_a100/scripts/run_histoseg_lazyslide_workflow.py \
  --dataset-root /data/taobo.hu/pyxenium_lr_benchmark_2026-04/data/source_cache/breast/WTA_Preview_FFPE_Breast_Cancer_outs/spatialdata.zarr \
  --histoseg-geojson /data/taobo.hu/pyxenium_lazyslide_breast_wta_20260507/data/xenium_explorer_annotations.s1_s5.generated.geojson \
  --contour-id-key name \
  --he-source-path /data/taobo.hu/pyxenium_lazyslide_breast_wta_20260507/data/WTA_Preview_FFPE_Breast_Cancer_he_image.tiffslide_pyramid.tif \
  --wsi-reader tiffslide \
  --output-dir /data/taobo.hu/pyxenium_lazyslide_breast_wta_20260507/runs/direct_lazyslide_plip_full_text \
  --model plip \
  --text-model plip \
  --batch-size 64 \
  --table-format parquet

A100 PLIP result snapshot#

The committed RTD artifacts on this page come from a completed direct WSI LazySlide run on GPU 7 using PLIP.

Field	Value
Workflow	`histoseg_lazyslide_structure_workflow`
WSI reader	`tiffslide`
LazySlide	`0.10.1`
GPU	NVIDIA A100-SXM4-40GB
Torch	`2.6.0+cu124`
Embedding model	`plip`
Prompt scoring model	`plip`
HistoSeg contours	1,578
LazySlide tiles	3,115
Assigned tiles	3,114
HistoSeg structures	5
Embedding dimensions	512
Runtime	3,989.3 seconds (about 66 minutes)

The PLIP prompt terms used for zero-shot image-text scoring were:

ductal epithelium
invasive carcinoma
in situ carcinoma
fibrotic stroma
immune infiltrate
necrosis
adipose tissue
vascular stroma
lumen or secretion

These terms are a manually curated breast histology prompt set (breast_histology_v1). They are not HistoSeg structure names and they are not pathologist-confirmed diagnostic labels.
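
Zero-shot image-text scoring of this kind is commonly computed as the cosine similarity between a tile embedding and each prompt embedding. The sketch below shows that computation on toy vectors. It is an assumption that the LazySlide scoring reduces to plain cosine similarity (it may add scaling or a softmax), and the 3-dimensional vectors are hypothetical stand-ins for the 512-dimensional PLIP embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def score_tile(tile_embedding, prompt_embeddings):
    """Return {prompt: similarity} for one tile."""
    return {
        prompt: cosine_similarity(tile_embedding, vec)
        for prompt, vec in prompt_embeddings.items()
    }


# Toy vectors (hypothetical), not real PLIP embeddings.
prompt_embeddings = {
    "necrosis": [1.0, 0.0, 0.0],
    "fibrotic stroma": [0.0, 1.0, 0.0],
}
tile_embedding = [3.0, 4.0, 0.0]
scores = score_tile(tile_embedding, prompt_embeddings)
top_prompt = max(scores, key=scores.get)
```

Because every tile is scored against every prompt, a tile always has a top prompt even when none of the terms describes it well, which is one reason the terms should not be read as labels.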

Structure-level prompt scores#

HistoSeg structure	Tiles	Top mean PLIP prompt	Top tile prompt mode	Enriched prompt-similarity terms
S1	866	invasive carcinoma	immune infiltrate	invasive carcinoma, immune infiltrate, in situ carcinoma
S2	140	immune infiltrate	immune infiltrate	immune infiltrate, necrosis, invasive carcinoma
S3	797	necrosis	necrosis	adipose tissue, vascular stroma, fibrotic stroma
S4	1,184	necrosis	fibrotic stroma	fibrotic stroma, vascular stroma, adipose tissue
S5	127	immune infiltrate	immune infiltrate	invasive carcinoma, immune infiltrate, in situ carcinoma
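
The "Top mean PLIP prompt" and "Top tile prompt mode" columns can disagree, as they do for S1 and S4. One plausible reading, assumed here because the page does not define the columns, is that the first is the prompt with the highest mean score across tiles and the second is the most frequent per-tile top prompt. The sketch below uses hypothetical scores to show how the two can differ.

```python
from collections import Counter


def summarize_structure(tile_scores):
    """Return (top prompt by mean score, most common per-tile top prompt).

    tile_scores is a list of {prompt: score} dicts, one per tile.
    """
    prompts = list(tile_scores[0])
    mean_scores = {
        p: sum(t[p] for t in tile_scores) / len(tile_scores) for p in prompts
    }
    top_mean = max(mean_scores, key=mean_scores.get)
    per_tile_top = [max(t, key=t.get) for t in tile_scores]
    top_mode = Counter(per_tile_top).most_common(1)[0][0]
    return top_mean, top_mode


# Toy scores (hypothetical values, not taken from the snapshot).
tiles = [
    {"invasive carcinoma": 0.30, "immune infiltrate": 0.31},
    {"invasive carcinoma": 0.30, "immune infiltrate": 0.31},
    {"invasive carcinoma": 0.40, "immune infiltrate": 0.25},
]
top_mean, top_mode = summarize_structure(tiles)
```

Here two tiles narrowly favor one prompt while a third strongly favors the other, so the mode and the mean point to different terms.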

The enriched terms are one-vs-rest positive PLIP prompt-similarity features with FDR < 0.05 when available. They should be interpreted as image-language features, not as diagnostic labels. The score range is narrow, so the most useful signal is the structure-to-structure contrast rather than the absolute name of a single top prompt.
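
The FDR threshold above implies a multiple-testing correction across features. The page does not state which procedure is used; the sketch below assumes Benjamini-Hochberg, a common choice, and applies it to hypothetical p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical one-vs-rest p-values for four prompt features.
p_values = [0.01, 0.04, 0.03, 0.2]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
```

In this toy case only the first feature passes FDR < 0.05, even though three raw p-values are below 0.05.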

Contour-scale WTA morphomolecular programs#

The WTA downstream analysis reuses the completed direct PLIP tile run and scores breast/TME gene programs from the Xenium whole-transcriptome matrix at HistoSeg-contour scale. The key test is no longer whether H&E beats structure labels in cross-validated prediction, because in this single sample HistoSeg labels already explain much of the coarse molecular variance. The stronger result concerns the residual signal: after controlling for HistoSeg structure, contour centroid, and cell-to-boundary distance, prompt-independent PLIP embedding axes remain associated with WTA program activity.

Top residual WTA program associations:

Rank	WTA program	Best H&E axis	Partial Spearman rho	FDR
1	luminal_estrogen_response	`embedding__103__mean`	-0.597	8.64e-25
2	epithelial_identity	`embedding__103__mean`	-0.496	1.10e-16
3	basal_squamous_state	`embedding__480__mean`	0.492	1.89e-16
4	unfolded_protein_response	`embedding__29__mean`	0.491	2.30e-16
5	oxidative_phosphorylation	`embedding__371__mean`	0.485	6.58e-16
6	p53_apoptosis_stress	`embedding__246__mean`	-0.439	8.12e-13
7	tls_b_cell_plasma	`embedding__20__mean`	-0.409	4.57e-11
8	cell_cycle_proliferation	`embedding__342__mean`	0.398	1.97e-10
9	her2_amplicon_signaling	`embedding__426__mean`	0.389	5.73e-10
10	t_cell_exhaustion_checkpoint	`embedding__115__mean`	-0.377	2.52e-09

The complete table is available as wta_pathway_partial_correlations.csv. The hero contour tables are available as morphomolecular_hero_targets.csv and morphomolecular_hero_contours.csv.
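
For readers unfamiliar with partial Spearman correlation, the sketch below computes it for a single covariate using the first-order partial correlation formula on ranks. This is a simplification: the analysis described above controls for several covariates at once (structure, centroid, boundary distance), which would typically be handled by correlating regression residuals instead. All names and values here are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after controlling for one covariate z."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Toy contour-level values (hypothetical).
program_score = [1, 2, 3, 4, 5]
embedding_axis = [2, 1, 4, 3, 5]
boundary_distance = [1, 3, 2, 5, 4]
rho = partial_spearman(program_score, embedding_axis, boundary_distance)
```

In this toy example the plain Spearman correlation is 0.8 and the partial value is about 0.98, showing that adjusting for a covariate can strengthen as well as weaken an association.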

[Figure: WTA morphomolecular hero contour montage]

Visual outputs#

[Figure: PLIP tile embedding UMAP]

[Figure: Structure-level PLIP text similarity heatmap]

[Figure: Spatial tile map]

[Figure: Representative tile montage]

Artifact files#

The copied A100 snapshot is stored in:

docs/_static/tutorials/multimodal_histoseg_lazyslide_breast_wta/

Key files are:

Interpretation rules#

structure_image_features is the main table for asking what image signatures each HistoSeg structure carries.
structure_differential_features ranks one-vs-rest image features per structure.
The current RTD snapshot reports direct LazySlide WSI tiling, PLIP image embeddings, PLIP prompt-similarity scores, structure-level RNA summaries, contour-level WTA program scores, partial morphomolecular associations, and program/image associations.
In the Breast WTA single-sample snapshot, structure_only remains the strongest cross-validated baseline for many epithelial markers and programs. Treat the WTA result as evidence for within-structure residual molecular decoding, not as a claim that H&E embeddings improve every molecular prediction benchmark beyond HistoSeg labels.
PLIP is the first required A100 result. CONCH, OmiCLIP, UNI, Virchow, GigaPath, and other LazySlide foundation models can be run through the same model entry point when the model files, licenses, and credentials are available. Vision-only models contribute embeddings; vision-language models can also contribute prompt scores.
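
The one-vs-rest ranking described for structure_differential_features can be illustrated with a difference-in-means sketch. This shows only the comparison pattern, not the statistic the workflow actually uses, which is not specified on this page. The feature names follow the embedding__N__mean pattern seen in the tables above, and the values are hypothetical.

```python
def rank_one_vs_rest(rows, target):
    """Rank features by mean(target structure) - mean(all other structures).

    rows is a list of (structure_label, {feature_name: value}) pairs.
    """
    in_target = [feats for label, feats in rows if label == target]
    in_rest = [feats for label, feats in rows if label != target]
    effects = {}
    for name in in_target[0]:
        mean_target = sum(f[name] for f in in_target) / len(in_target)
        mean_rest = sum(f[name] for f in in_rest) / len(in_rest)
        effects[name] = mean_target - mean_rest
    # Largest positive difference first.
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)


# Toy rows (hypothetical values).
rows = [
    ("S1", {"embedding__103__mean": 0.9, "embedding__480__mean": 0.1}),
    ("S1", {"embedding__103__mean": 0.7, "embedding__480__mean": 0.3}),
    ("S2", {"embedding__103__mean": 0.2, "embedding__480__mean": 0.4}),
    ("S3", {"embedding__103__mean": 0.4, "embedding__480__mean": 0.2}),
]
ranking = rank_one_vs_rest(rows, "S1")
```

Pooling all other structures into a single "rest" group keeps the comparison simple, but it also means a feature can rank highly for a structure mainly because one large neighboring structure scores low on it.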

Current implementation status#

The pyXenium API, optional dependency boundary, A100 runner, WSI preparation script, direct LazySlide WSI backend, artifact schema, and full A100 PLIP image-feature snapshot are implemented. The earlier patch-corpus fallback remains useful for quick checks, but this page now reports the direct WSI LazySlide result.