pyXenium.multimodal.ProteinMicroEnv

class ProteinMicroEnv(adata, protein_obsm='protein', protein_norm_obsm='protein_norm', cluster_key='cluster', spatial_obsm='spatial', obs_xy=('x_centroid', 'y_centroid'), random_state=0)

Bases: object

Microenvironment analysis for protein heterogeneity within an RNA cluster.

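Parameters:
  • adata (AnnData)

  • protein_obsm (str)

  • protein_norm_obsm (str)

  • cluster_key (str)

  • spatial_obsm (str)

  • obs_xy (Tuple[str, str])

  • random_state (int)

__init__(adata, protein_obsm='protein', protein_norm_obsm='protein_norm', cluster_key='cluster', spatial_obsm='spatial', obs_xy=('x_centroid', 'y_centroid'), random_state=0)

Return type: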

None

Methods

__init__(adata[, protein_obsm, ...])

analyze(cluster_id, protein[, group_key, ...])

Run the full microenvironment pipeline.

build_kdtree_full([radius])

Build a KDTree on all cells and decide a heuristic radius if not provided.

build_spatial_graph(cluster_id[, radius])

CSR adjacency within the chosen cluster; used for Moran's I.

de_within_cluster(cluster_id, protein_status_col)

RNA DE within a cluster between protein-high vs protein-low.

microenv_predict_from_comp(focus_index, ...)

Predict protein-high from neighborhood fractions; report AUC and feature coefficients.

neighbor_composition_global(focus_index, ...)

Compute neighborhood composition for focus_index cells against all cells (global KDTree), grouped by obs[group_key].

neighbor_enrichment_from_comp(focus_index, ...)

Permutation test on neighborhood fractions between protein-high vs protein-low (within focus_index).

normalize_protein([method, cofactor])

Store normalized protein matrix in obsm[protein_norm_obsm].

plot_spatial(color[, title, s, alpha, cmap])

Robust spatial scatter for numeric or categorical obs/obsm fields (with alias/status auto-fix).

protein_moransI(names_sub, A_sub, protein[, ...])

Moran's I of protein expression within the cluster.

split_high_low(cluster_id, protein[, ...])

Split within a cluster into protein_high vs protein_low by thresholding.

Attributes

adata: AnnData
protein_obsm: str = 'protein'
protein_norm_obsm: str = 'protein_norm'
cluster_key: str = 'cluster'
spatial_obsm: str = 'spatial'
obs_xy: Tuple[str, str] = ('x_centroid', 'y_centroid')
random_state: int = 0

normalize_protein(method='clr', cofactor=5.0)

Store normalized protein matrix in obsm[protein_norm_obsm].

Parameters:
  • method (str)

  • cofactor (float)

Return type:

None
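
The two normalizations hinted at by the signature can be sketched without any dependencies. This is an illustration under stated assumptions, not the library's code: it assumes that 'clr' means a centered log-ratio over each cell's proteins and that cofactor belongs to an arcsinh transform. The function names and the pseudocount are hypothetical.

```python
import math

def clr_row(counts, pseudocount=1.0):
    """Centered log-ratio for one cell: log(x + pseudocount) minus the mean of those logs."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def arcsinh_row(counts, cofactor=5.0):
    """Arcsinh transform with a cofactor: asinh(x / cofactor)."""
    return [math.asinh(c / cofactor) for c in counts]

# Made-up protein counts for a single cell (assumption, not real data).
cell = [0.0, 12.0, 250.0, 3.0]
clr_values = clr_row(cell)
asinh_values = arcsinh_row(cell)
```

CLR values sum to zero within a cell, which makes them relative abundances; the arcsinh transform keeps zeros at zero and compresses large counts.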

split_high_low(cluster_id, protein, method='gmm', quantile=0.5)

Split within a cluster into protein_high vs protein_low by thresholding. Returns (table, status_col, threshold).

Parameters:
  • cluster_id (str | int)

  • protein (str)

  • method (str)

  • quantile (float)

Return type:

Tuple[DataFrame, str, float]
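
The quantile parameter suggests a quantile-based split alongside the default 'gmm' method. The sketch below shows only that simpler variant, assuming linear interpolation between order statistics and a strict "greater than" rule for protein_high. The function names are hypothetical, and the Gaussian-mixture path is not reproduced here.

```python
def quantile_threshold(values, q):
    """Quantile with linear interpolation between neighboring order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac

def split_by_quantile(values, q=0.5):
    """Label each value protein_high or protein_low relative to the quantile threshold."""
    threshold = quantile_threshold(values, q)
    status = ["protein_high" if v > threshold else "protein_low" for v in values]
    return status, threshold

# Made-up normalized expression of one protein within a cluster.
expression = [0.1, 0.4, 0.5, 2.0, 2.5, 3.1]
status, threshold = split_by_quantile(expression, q=0.5)
```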

build_spatial_graph(cluster_id, radius=None)

CSR adjacency within the chosen cluster; used for Moran’s I.

Parameters:
  • cluster_id (str | int)

  • radius (float | None)

Return type:

Tuple[csr_matrix, Index, float]

build_kdtree_full(radius=None)

Build a KDTree on all cells and decide a heuristic radius if not provided. Returns (tree, coords_all, radius_used).

Parameters:

radius (float | None)

Return type:

Tuple[cKDTree, ndarray, float]
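
The documentation does not say how the heuristic radius is chosen. One common choice, shown here purely as an assumption, is a multiple of the median nearest-neighbor distance. The function name and the factor of 3 are hypothetical, and the brute-force search stands in for the KDTree query a real implementation would use.

```python
import math
import statistics

def heuristic_radius(coords, factor=3.0):
    """Return factor times the median nearest-neighbor distance (brute force, O(n^2))."""
    nn_dists = []
    for i, (xi, yi) in enumerate(coords):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        nn_dists.append(nearest)
    return factor * statistics.median(nn_dists)

# Made-up cell centroids: a unit square plus one distant cell.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
radius = heuristic_radius(coords)
```

Using the median rather than the mean keeps isolated cells, like the distant one above, from inflating the radius.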

neighbor_composition_global(focus_index, tree, coords_all, radius, group_key)

Compute neighborhood composition for focus_index cells against all cells (global KDTree), grouped by obs[group_key].

Parameters:
  • focus_index (Index)

  • tree (cKDTree)

  • coords_all (ndarray)

  • radius (float)

  • group_key (str)

Return type:

DataFrame
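
As a rough picture of what a neighborhood composition table contains, the sketch below computes, for each focus cell, the fraction of its neighbors within the radius that belong to each group. It is a dependency-free illustration: the function name is hypothetical, plain dicts replace the AnnData and KDTree inputs, and excluding the focus cell from its own neighborhood is an assumption.

```python
import math
from collections import Counter

def neighbor_composition(focus_ids, coords, groups, radius):
    """Per focus cell, the fraction of neighbors within radius that fall in each group."""
    categories = sorted(set(groups.values()))
    comp = {}
    for cell in focus_ids:
        x0, y0 = coords[cell]
        counts = Counter(
            groups[other]
            for other, (x, y) in coords.items()
            if other != cell and math.hypot(x - x0, y - y0) <= radius
        )
        total = sum(counts.values())
        # Cells with no neighbors get all-zero fractions instead of a division error.
        comp[cell] = {g: (counts[g] / total if total else 0.0) for g in categories}
    return comp

# Made-up coordinates and group labels.
coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0), "d": (10.0, 10.0)}
groups = {"a": "tumor", "b": "tcell", "c": "tumor", "d": "tcell"}
comp = neighbor_composition(["a", "d"], coords, groups, radius=1.5)
```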

neighbor_enrichment_from_comp(focus_index, comp, protein_status_col, permutations=999, random_state=0)

Permutation test on neighborhood fractions between protein-high vs protein-low (within focus_index).

Parameters:
  • focus_index (Index)

  • comp (DataFrame)

  • protein_status_col (str)

  • permutations (int)

  • random_state (int)

Return type:

DataFrame
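
A permutation test of this kind can be sketched for a single neighborhood fraction as follows. The test statistic (difference in means), the two-sided comparison, and the add-one p-value correction are assumptions; the source does not state which variants the method uses. The function name is hypothetical.

```python
import random

def permutation_pvalue(high, low, permutations=999, random_state=0):
    """Two-sided permutation test on the difference in mean fraction (high minus low)."""
    rng = random.Random(random_state)
    observed = sum(high) / len(high) - sum(low) / len(low)
    pooled = list(high) + list(low)
    n_high = len(high)
    extreme = 0
    for _ in range(permutations):
        rng.shuffle(pooled)  # reassign high/low labels at random
        diff = sum(pooled[:n_high]) / n_high - sum(pooled[n_high:]) / (len(pooled) - n_high)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (extreme + 1) / (permutations + 1)

# Made-up T-cell neighbor fractions around protein-high and protein-low cells.
high = [0.60, 0.55, 0.70, 0.65, 0.62]
low = [0.20, 0.25, 0.15, 0.30, 0.22]
observed, p_value = permutation_pvalue(high, low)
```

With permutations=999 the smallest reportable p-value is 0.001, which is worth keeping in mind when correcting across many neighbor groups.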

microenv_predict_from_comp(focus_index, comp, protein_status_col, C=1.0, max_iter=1000)

Predict protein-high from neighborhood fractions; report AUC and feature coefficients.

Parameters:
  • focus_index (Index)

  • comp (DataFrame)

  • protein_status_col (str)

  • C (float)

  • max_iter (int)

Return type:

Dict[str, float | DataFrame]
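
The C and max_iter parameters resemble those of a regularized logistic regression, though the source does not name the classifier. The sketch below illustrates only the reported metric: AUC computed as the probability that a randomly chosen protein-high cell receives a higher score than a randomly chosen protein-low cell. The function name and the scores are made up.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 1 = protein-high, 0 = protein-low; scores are hypothetical predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC near 0.5 would mean that neighborhood composition carries little information about protein status; values near 1.0 indicate strong separation.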

protein_moransI(names_sub, A_sub, protein, use_norm=True, permutations=999)

Moran’s I of protein expression within the cluster.

Parameters:
  • names_sub (Index)

  • A_sub (csr_matrix)

  • protein (str)

  • use_norm (bool)

  • permutations (int)

Return type:

Dict[str, float]
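
For orientation, Moran's I with binary weights is I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2, where m is the mean and W the sum of all weights. The sketch below computes it with a permutation p-value. The function names, the neighbor-list representation (instead of a csr_matrix), the one-sided test, and the result keys "I" and "p_value" are all assumptions, since the source does not list the keys of the returned dict.

```python
import random

def morans_i(values, neighbors):
    """Moran's I with binary weights; neighbors maps a cell index to its neighbor indices."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(len(nbrs) for nbrs in neighbors.values())
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbors.items() for j in nbrs)
    return (n / weight_sum) * cross / sum(d * d for d in dev)

def morans_i_permutation(values, neighbors, permutations=999, random_state=0):
    """One-sided permutation p-value: how often shuffled values reach the observed I."""
    rng = random.Random(random_state)
    observed = morans_i(values, neighbors)
    shuffled = list(values)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if morans_i(shuffled, neighbors) >= observed:
            hits += 1
    return {"I": observed, "p_value": (hits + 1) / (permutations + 1)}

# Six cells along a chain: low expression on one end, high on the other (made-up data).
values = [1.0, 1.2, 1.1, 5.0, 5.2, 5.1]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
result = morans_i_permutation(values, neighbors)
```

Positive I indicates that neighboring cells have similar expression, as in this spatially segregated toy example.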

de_within_cluster(cluster_id, protein_status_col, layer='rna', method='wilcoxon', n_top_hvg=3000)

RNA differential expression (DE) within a cluster between protein-high and protein-low cells. Automatically applies normalize_total and log1p (which avoids the raw-count warning), with optional selection of highly variable genes (HVGs) for speed.

Parameters:
  • cluster_id (str | int)

  • protein_status_col (str)

  • layer (str)

  • method (str)

  • n_top_hvg (int)

Return type:

DataFrame
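
The preprocessing named above, normalize_total followed by log1p, can be sketched in plain Python. The sketch assumes each cell is scaled to the median total count across cells, which is what Scanpy's normalize_total does when no target sum is given; whether this method uses that default is not stated in the source. The function name is hypothetical.

```python
import math
import statistics

def normalize_total_log1p(counts):
    """Scale each cell (row) to the median library size, then apply log1p."""
    totals = [sum(row) for row in counts]
    target = statistics.median(totals)
    return [
        [math.log1p(v * target / total) if total else 0.0 for v in row]
        for row, total in zip(counts, totals)
    ]

# Made-up raw counts: three cells by three genes.
counts = [[10, 0, 30], [1, 1, 2], [5, 5, 10]]
normalized = normalize_total_log1p(counts)
```

After this step every cell has the same total on the linear scale, so a rank-based test such as Wilcoxon compares relative expression rather than sequencing depth.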

plot_spatial(color, title='', s=2.0, alpha=0.9, cmap='viridis')

Robust spatial scatter for numeric or categorical obs/obsm fields (with alias/status auto-fix).

Parameters:
  • color (str)

  • title (str)

  • s (float)

  • alpha (float)

  • cmap (str)

Return type:

None

analyze(cluster_id, protein, group_key='cluster', radius=None, status_method='gmm', status_quantile=0.5, de_layer='rna', permutations=999, save_dir=None)

Run the full microenvironment pipeline.

Returns a dict with keys:
  • status_col, threshold

  • radius_cluster, radius_global

  • neighbor_enrichment (DataFrame)

  • moransI (dict)

  • de (DataFrame)

  • predict_auc (float), predict_coef (DataFrame)

Parameters:
  • cluster_id (str | int)

  • protein (str)

  • group_key (str)

  • radius (float | None)

  • status_method (str)

  • status_quantile (float)

  • de_layer (str)

  • permutations (int)

  • save_dir (str | None)

Return type:

Dict[str, object]
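
A typical call might look like the sketch below, which uses only the signatures and result keys documented on this page. The wrapper name run_microenv is hypothetical, and calling normalize_protein before analyze is an assumption; analyze may already handle normalization internally. The import is placed inside the function so the sketch can be defined without pyXenium installed.

```python
def run_microenv(adata, cluster_id, protein):
    """Run the pipeline on an AnnData object and return a few documented result keys."""
    # Lazy import: only needed when the function is actually called.
    from pyXenium.multimodal import ProteinMicroEnv

    env = ProteinMicroEnv(adata, cluster_key="cluster", spatial_obsm="spatial")
    env.normalize_protein(method="clr")  # assumption: normalize before analysis
    results = env.analyze(cluster_id, protein, group_key="cluster", permutations=999)
    return {
        "threshold": results["threshold"],
        "auc": results["predict_auc"],
        "enrichment": results["neighbor_enrichment"],
    }
```

Leaving radius as None lets the class pick its heuristic radius; the values actually used are reported under radius_cluster and radius_global.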