pyXenium.multimodal.ProteinMicroEnv

class ProteinMicroEnv(adata, protein_obsm='protein', protein_norm_obsm='protein_norm', cluster_key='cluster', spatial_obsm='spatial', obs_xy=('x_centroid', 'y_centroid'), random_state=0)

Bases: object

Microenvironment analysis for protein heterogeneity within an RNA cluster.

Parameters:

adata (AnnData)
protein_obsm (str)
protein_norm_obsm (str)
cluster_key (str)
spatial_obsm (str)
obs_xy (Tuple[str, str])
random_state (int)

__init__(adata, protein_obsm='protein', protein_norm_obsm='protein_norm', cluster_key='cluster', spatial_obsm='spatial', obs_xy=('x_centroid', 'y_centroid'), random_state=0)

Return type:

None

Methods

`__init__`(adata[, protein_obsm, ...])
`analyze`(cluster_id, protein[, group_key, ...])	Run the full microenvironment pipeline.
`build_kdtree_full`([radius])	Build a KDTree on all cells and, if no radius is given, choose one heuristically.
`build_spatial_graph`(cluster_id[, radius])	CSR adjacency within the chosen cluster; used for Moran's I.
`de_within_cluster`(cluster_id, protein_status_col)	Differential expression (DE) of RNA between protein-high and protein-low cells within a cluster.
`microenv_predict_from_comp`(focus_index, ...)	Predict protein-high from neighborhood fractions; report AUC and feature coefficients.
`neighbor_composition_global`(focus_index, ...)	Compute neighborhood composition for focus_index cells against all cells (global KDTree), grouped by obs[group_key].
`neighbor_enrichment_from_comp`(focus_index, ...)	Permutation test comparing neighborhood fractions between protein-high and protein-low cells (within focus_index).
`normalize_protein`([method, cofactor])	Store normalized protein matrix in obsm[protein_norm_obsm].
`plot_spatial`(color[, title, s, alpha, cmap])	Robust spatial scatter for numeric or categorical obs/obsm fields (with alias/status auto-fix).
`protein_moransI`(names_sub, A_sub, protein[, ...])	Moran's I of protein expression within the cluster.
`split_high_low`(cluster_id, protein[, ...])	Split the cells of a cluster into protein_high and protein_low groups by thresholding.

Attributes

`cluster_key`
`obs_xy`
`protein_norm_obsm`
`protein_obsm`
`random_state`
`spatial_obsm`
`adata`

adata: AnnData

protein_obsm: str = 'protein'

protein_norm_obsm: str = 'protein_norm'

cluster_key: str = 'cluster'

spatial_obsm: str = 'spatial'

obs_xy: Tuple[str, str] = ('x_centroid', 'y_centroid')

random_state: int = 0

normalize_protein(method='clr', cofactor=5.0)

Store normalized protein matrix in obsm[protein_norm_obsm].

Parameters:

method (str)
cofactor (float)

Return type:

None
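
The default `method='clr'` and the `cofactor` parameter suggest a centered log-ratio transform and an arcsinh-style transform, although this page does not give the formulas. The following is a minimal sketch under those assumptions: the function names `clr_normalize` and `arcsinh_normalize`, the per-cell centering, and the pseudocount of 1 are all illustrative choices, not the library's confirmed behavior.

```python
import numpy as np

def clr_normalize(counts):
    """Centered log-ratio per cell (row): log1p(x) minus the row mean of log1p(x).

    Assumption: a pseudocount of 1 (via log1p) and per-cell centering.
    """
    logged = np.log1p(np.asarray(counts, dtype=float))
    return logged - logged.mean(axis=1, keepdims=True)

def arcsinh_normalize(counts, cofactor=5.0):
    """arcsinh(x / cofactor), a common transform for protein intensities."""
    return np.arcsinh(np.asarray(counts, dtype=float) / cofactor)

# Two cells (rows) by three proteins (columns); values are made up.
counts = np.array([[0.0, 10.0, 100.0],
                   [5.0, 5.0, 5.0]])
clr = clr_normalize(counts)
asinh = arcsinh_normalize(counts, cofactor=5.0)
```

After CLR each row sums to zero, so values describe a protein's abundance relative to the other proteins in the same cell rather than its absolute level.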

split_high_low(cluster_id, protein, method='gmm', quantile=0.5)

Split the cells of a cluster into protein_high and protein_low groups by thresholding. Returns (table, status_col, threshold).

Parameters:

cluster_id (str | int)
protein (str)
method (str)
quantile (float)

Return type:

Tuple[DataFrame, str, float]
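
To make the thresholding idea concrete, here is a small sketch of a quantile-based split. It is an assumption that the `quantile` parameter is used this way; the default `method='gmm'` is not shown, and the helper name `split_by_quantile` and the "strictly above the threshold is high" rule are hypothetical.

```python
import numpy as np

def split_by_quantile(values, quantile=0.5):
    """Return (labels, threshold); cells strictly above the quantile are 'high'."""
    values = np.asarray(values, dtype=float)
    threshold = float(np.quantile(values, quantile))
    labels = np.where(values > threshold, "high", "low")
    return labels, threshold

# Made-up normalized expression of one protein in six cells of a cluster.
expr = np.array([0.1, 0.4, 0.5, 2.3, 3.1, 2.8])
labels, threshold = split_by_quantile(expr, quantile=0.5)
```

With `quantile=0.5` the threshold is the median, which here falls midway between 0.5 and 2.3, so the three low-expressing and three high-expressing cells end up in separate groups.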

build_spatial_graph(cluster_id, radius=None)

CSR adjacency within the chosen cluster; used for Moran’s I.

Parameters:

cluster_id (str | int)
radius (float | None)

Return type:

Tuple[csr_matrix, Index, float]

build_kdtree_full(radius=None)

Build a KDTree on all cells and, if no radius is given, choose one heuristically. Returns (tree, coords_all, radius_used).

Parameters:

radius (float | None)

Return type:

Tuple[cKDTree, ndarray, float]
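
This page does not say how the heuristic radius is chosen. As an illustration only, the sketch below builds a SciPy `cKDTree` and, when no radius is given, falls back to a multiple of the median nearest-neighbor distance. The helper name `build_tree_with_radius` and the `scale` factor of 3 are assumptions, not the library's rule.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_tree_with_radius(coords, radius=None, scale=3.0):
    """Build a KD-tree; if radius is None, use scale * median nearest-neighbor distance."""
    coords = np.asarray(coords, dtype=float)
    tree = cKDTree(coords)
    if radius is None:
        # k=2 because the closest hit for each point is the point itself.
        dists, _ = tree.query(coords, k=2)
        radius = float(scale * np.median(dists[:, 1]))
    return tree, coords, radius

# Synthetic 5 x 5 grid of cell centroids with unit spacing.
xs, ys = np.meshgrid(np.arange(5), np.arange(5))
coords = np.column_stack([xs.ravel(), ys.ravel()])
tree, coords_all, radius_used = build_tree_with_radius(coords)
```

Tying the radius to the typical cell spacing keeps the expected neighborhood size roughly stable across tissues of different density, which is one plausible motivation for a data-driven default.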

neighbor_composition_global(focus_index, tree, coords_all, radius, group_key)

Compute neighborhood composition for focus_index cells against all cells (global KDTree), grouped by obs[group_key].

Parameters:

focus_index (Index)
tree (cKDTree)
coords_all (ndarray)
radius (float)
group_key (str)

Return type:

DataFrame
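
A neighborhood composition can be sketched as follows: for each focus cell, collect all cells within `radius` and compute the fraction that belongs to each group. The helper `neighbor_fractions` is hypothetical, it returns a plain array instead of a DataFrame, and excluding the focus cell from its own neighborhood is an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree

def neighbor_fractions(focus_idx, tree, coords_all, radius, groups):
    """Fraction of each group among the neighbors of every focus cell."""
    groups = np.asarray(groups)
    categories = np.unique(groups)
    fractions = np.zeros((len(focus_idx), len(categories)))
    for row, i in enumerate(focus_idx):
        hits = tree.query_ball_point(coords_all[i], r=radius)
        neighbors = [j for j in hits if j != i]  # assumption: drop the cell itself
        if not neighbors:
            continue  # isolated cell: leave the row as zeros
        labels = groups[neighbors]
        for col, cat in enumerate(categories):
            fractions[row, col] = np.mean(labels == cat)
    return categories, fractions

# Four made-up cells: three close together, one far away.
coords_all = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
groups = np.array(["T", "B", "T", "B"])
tree = cKDTree(coords_all)
categories, fractions = neighbor_fractions([0, 3], tree, coords_all,
                                           radius=1.5, groups=groups)
```

Because the tree covers all cells, the neighbors of a focus cell can belong to any group, not only to the focus cell's own cluster.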

neighbor_enrichment_from_comp(focus_index, comp, protein_status_col, permutations=999, random_state=0)

Permutation test comparing neighborhood fractions between protein-high and protein-low cells (within focus_index).

Parameters:

focus_index (Index)
comp (DataFrame)
protein_status_col (str)
permutations (int)
random_state (int)

Return type:

DataFrame
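
The permutation idea can be shown for a single neighborhood fraction. The sketch below compares the mean fraction between the two groups and shuffles the high/low labels to build a null distribution. The test statistic (difference of means), the two-sided comparison, and the add-one p-value are assumptions; the source does not state which the library uses, and `permutation_test_mean_diff` is a hypothetical name.

```python
import numpy as np

def permutation_test_mean_diff(values, is_high, permutations=999, random_state=0):
    """Two-sided permutation test on mean(high) - mean(low)."""
    values = np.asarray(values, dtype=float)
    is_high = np.asarray(is_high, dtype=bool)
    rng = np.random.default_rng(random_state)
    observed = values[is_high].mean() - values[~is_high].mean()
    exceed = 0
    for _ in range(permutations):
        shuffled = rng.permutation(is_high)  # relabel cells at random
        diff = values[shuffled].mean() - values[~shuffled].mean()
        if abs(diff) >= abs(observed):
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (permutations + 1)
    return observed, p_value

# Made-up fraction of one neighbor type around eight focus cells.
frac = np.array([0.8, 0.7, 0.9, 0.75, 0.2, 0.1, 0.3, 0.25])
is_high = np.array([True] * 4 + [False] * 4)
observed, p_value = permutation_test_mean_diff(frac, is_high)
```

Repeating this per column of the composition table would give one effect size and p-value per neighbor group.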

microenv_predict_from_comp(focus_index, comp, protein_status_col, C=1.0, max_iter=1000)

Predict protein-high from neighborhood fractions; report AUC and feature coefficients.

Parameters:

focus_index (Index)
comp (DataFrame)
protein_status_col (str)
C (float)
max_iter (int)

Return type:

Dict[str, float | DataFrame]
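
The `C` and `max_iter` parameters point to a regularized logistic regression, but the page does not name the estimator or the evaluation scheme. The sketch below is a stand-in written in plain NumPy: batch gradient descent replaces whatever solver the library uses, the AUC is computed on the training data for brevity, and the names `fit_logistic`, `roc_auc`, and the learning rate `lr` are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, C=1.0, max_iter=1000, lr=0.5):
    """L2-regularized logistic regression by batch gradient descent.

    C is the inverse regularization strength (larger C, weaker penalty).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y) / n + w / (C * n))
        b -= lr * np.mean(p - y)
    return w, b

def roc_auc(scores, y):
    """AUC as the share of (high, low) pairs ranked correctly; ties count 0.5."""
    scores = np.asarray(scores, dtype=float)
    y = np.asarray(y, dtype=bool)
    diff = scores[y][:, None] - scores[~y][None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)

# Made-up neighborhood fractions (two neighbor groups) for eight focus cells.
comp = np.array([[0.8, 0.2], [0.7, 0.3], [0.9, 0.1], [0.6, 0.4],
                 [0.2, 0.8], [0.3, 0.7], [0.1, 0.9], [0.4, 0.6]])
is_high = np.array([1, 1, 1, 1, 0, 0, 0, 0])
w, b = fit_logistic(comp, is_high)
auc = roc_auc(comp @ w + b, is_high)
```

The sign of each coefficient indicates whether a larger fraction of that neighbor group is associated with protein-high (positive) or protein-low (negative) status.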

protein_moransI(names_sub, A_sub, protein, use_norm=True, permutations=999)

Moran’s I of protein expression within the cluster.

Parameters:

names_sub (Index)
A_sub (csr_matrix)
protein (str)
use_norm (bool)
permutations (int)

Return type:

Dict[str, float]
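
For reference, Moran's I for values x with spatial weights A is I = (n / W) * (z^T A z) / (z^T z), where z = x - mean(x) and W is the sum of all weights. The sketch below evaluates it on a dense adjacency matrix and adds a simple permutation p-value. The dense matrix (instead of a csr_matrix), the one-sided p-value, and the dict keys `"I"` and `"p_value"` are illustrative assumptions.

```python
import numpy as np

def morans_i(x, A):
    """Moran's I = (n / W) * (z' A z) / (z' z), with z the mean-centered values."""
    x = np.asarray(x, dtype=float)
    A = np.asarray(A, dtype=float)
    z = x - x.mean()
    return float(len(x) / A.sum() * (z @ A @ z) / (z @ z))

def morans_i_permutation(x, A, permutations=999, random_state=0):
    """Observed I plus a one-sided p-value from shuffling values over cells."""
    rng = np.random.default_rng(random_state)
    x = np.asarray(x, dtype=float)
    observed = morans_i(x, A)
    null = np.array([morans_i(rng.permutation(x), A) for _ in range(permutations)])
    p_value = (np.sum(null >= observed) + 1) / (permutations + 1)
    return {"I": observed, "p_value": float(p_value)}

# Six cells on a line, each connected to its immediate neighbors.
A = np.zeros((6, 6))
for i in range(5):
    A[i, i + 1] = A[i + 1, i] = 1.0

x = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])  # spatially clustered values
result = morans_i_permutation(x, A)
```

Positive values indicate that neighboring cells have similar expression; an alternating pattern such as `[1, 5, 1, 5, 1, 5]` on the same graph gives a negative value.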

de_within_cluster(cluster_id, protein_status_col, layer='rna', method='wilcoxon', n_top_hvg=3000)

Differential expression (DE) of RNA between protein-high and protein-low cells within a cluster. Applies normalize_total and log1p automatically (to avoid the raw-count warning), with optional highly variable gene (HVG) selection for speed.

Parameters:

cluster_id (str | int)
protein_status_col (str)
layer (str)
method (str)
n_top_hvg (int)

Return type:

DataFrame

plot_spatial(color, title='', s=2.0, alpha=0.9, cmap='viridis')

Robust spatial scatter for numeric or categorical obs/obsm fields (with alias/status auto-fix).

Parameters:

color (str | ndarray)
title (str)
s (float)
alpha (float)
cmap (str)

Return type:

None

analyze(cluster_id, protein, group_key='cluster', radius=None, status_method='gmm', status_quantile=0.5, de_layer='rna', permutations=999, save_dir=None)

Run the full microenvironment pipeline.

Returns a dict with the following keys:

status_col, threshold
radius_cluster, radius_global
neighbor_enrichment (DataFrame)
moransI (dict)
de (DataFrame)
predict_auc (float), predict_coef (DataFrame)

Parameters:

cluster_id (str | int)
protein (str)
group_key (str)
radius (float | None)
status_method (str)
status_quantile (float)
de_layer (str)
permutations (int)
save_dir (str | None)

Return type:

Dict[str, object]
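
A typical call sequence, based only on the signatures documented on this page, might look like the sketch below. The wrapper `run_microenv_pipeline` is hypothetical, and the import path is inferred from the page title. Whether `normalize_protein` must be called before `analyze` is not stated here, so the sketch leaves it out.

```python
def run_microenv_pipeline(adata, cluster_id, protein, save_dir=None):
    """Construct ProteinMicroEnv with its documented defaults and run analyze()."""
    # Imported lazily so this sketch can be defined without pyXenium installed.
    from pyXenium.multimodal import ProteinMicroEnv

    env = ProteinMicroEnv(
        adata,
        protein_obsm="protein",
        protein_norm_obsm="protein_norm",
        cluster_key="cluster",
        spatial_obsm="spatial",
        obs_xy=("x_centroid", "y_centroid"),
        random_state=0,
    )
    result = env.analyze(
        cluster_id,
        protein,
        group_key="cluster",
        status_method="gmm",
        permutations=999,
        save_dir=save_dir,
    )
    # Pick a few of the documented result keys for a quick summary.
    wanted = ("status_col", "threshold", "predict_auc")
    summary = {key: result.get(key) for key in wanted}
    return result, summary
```

Here `adata` would be an AnnData object holding the protein matrix, cluster labels, and cell coordinates under the keys passed to the constructor.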