NicheSphere differential co-localization tutorial

NicheSphere is an sc-verse compatible Python library which allows the user to find differentially co-localized cellular niches and biological processes involved in their interactions based on cell type pairs co-localization probabilities in different conditions. Cell type pair co-localization probabilities are obtained in different ways: from deconvoluted Visium 10x / PIC-seq data (probabilities of finding each cell type in each spot / multiplet), or counting cell boundaries overlaps for each cell type pair in single cell spatial data (MERFISH , CODEX …). This tutorial focuses on defining groups of cells that converge or split in disease (Ischemia) based on differential co-localization.

NicheSphere also offers the possibility to look at localized differential cell - cell communication based on Ligand-Receptor pairs expression data. This is addressed in the localized differential communication tutorial.

1. Libraries and functions

[1]:
import pandas as pd
import scipy
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import networkx as nx
import warnings
import scanpy as sc
import mudata as md
import numpy as np
from community_layout.layout_class import CommunityLayout
warnings.filterwarnings("ignore")

import nichesphere
COMMUNITY LAYOUT: Datashader not found, edge bundling not available

2. Data at first glance

In this example we will use data from the Myocardial Infarction atlas from Kuppe, C. et. Al., 2022

[2]:
mudata=md.read('heart_MI_ST_SC_23samples.h5mu')
mudata
[2]:
MuData object with n_obs × n_vars = 206792 × 39120
  2 modalities
    visium: 73904 x 11704
      obs:  'n_counts', 'n_genes', 'percent.mt', 'Adipocyte', 'Cardiomyocyte', 'Endothelial', 'Fibroblast', 'Lymphoid', 'Mast', 'Myeloid', 'Neuronal', 'Pericyte', 'Cycling.cells', 'vSMCs', 'cell_type_original', 'assay_ontology_term_id', 'cell_type_ontology_term_id', 'development_stage_ontology_term_id', 'disease_ontology_term_id', 'ethnicity_ontology_term_id', 'is_primary_data', 'organism_ontology_term_id', 'sex_ontology_term_id', 'tissue_ontology_term_id', 'patient_region_id', 'orig_ident', 'batch'
      var:  'features'
      obsm: 'X_pca', 'X_spatial', 'X_umap'
    sc:     132888 x 27416
      obs:  'orig_ident', 'nCount_RNA', 'nFeature_RNA', 'percent_mt', 'doublet_score', 'doublet', 'dissociation_s1', 'opt_clust', 'patient', 'batch', 'opt_clust_integrated', 'cell_type', 'ident', 'nFeaturess_RNA', 'cell_subtype', 'cell_subtype_available', 'cell_subtype2', 'patient_region_id', 'patient_group', 'sampleType'
      var:  'n_counts'
      obsm: 'X_harmony', 'X_pca', 'X_umap_harmony'

This is a subset with 23 samples (samples with less than 1500 cells in the snRNA-seq data were filtered out), and 33 different cell subtypes

[3]:
mudata['sc'].obsm['umap']=mudata['sc'].obsm['X_umap_harmony']
sc.pl.umap(mudata['sc'],
           color=['cell_type', 'cell_subtype2'], wspace=0.3)
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_7_0.png

In this case, we will get cell type co-localization probabilities from deconvoluted Visium data (Cell type probabilities per spot):

In a previous step, we used MOSCOT(Klein et. al., 2025) to deconvolute cell subtypes in visium slices from the same 23 samples , getting matrices of probabilities of each cell being in each spot. Then we got cell type probabilities per spot summing the probabilities of cells of the same kind in each spot; thus getting cell type probability matrices for all samples.

(you can have a closer look at these steps in the preprocessing tutorial)

[4]:
CTprops=pd.read_csv('CTprops.csv', index_col=0)
CTprops.head()
[4]:
Fib1_SCARA5 damaged_CM Capillary_Endo LYVE_FOLR_Macrophages Fib3_C7 healthy_CM Fib2_Myofib Endocardial_Endo Arterial_Endo Neuronal ... CCL18_Macrophages perivascular_fibroblasts CD_4 vSMCs_2 Lymphatic_Endo NK CD_8 Purkinje_fibers Adipo NK_T
AAACAAGTATCTCCCA-1-1-0-0-0 0.000000e+00 0.000000e+00 0.0 8.333133e-16 0.000000 0.000000e+00 0.428865 0.000000 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 NaN NaN
AAACAATCTACTAGCA-1-1-0-0-0 0.000000e+00 2.691729e-21 0.0 0.000000e+00 0.445912 5.540884e-01 0.000000 0.000000 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 NaN NaN
AAACACCAATAACTGC-1-1-0-0-0 0.000000e+00 0.000000e+00 0.0 0.000000e+00 0.000000 0.000000e+00 0.000000 0.000030 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.499508 0.0 0.0 NaN NaN
AAACAGAGCGACTCCT-1-1-0-0-0 1.373226e-25 0.000000e+00 0.0 0.000000e+00 0.499762 3.111796e-13 0.500238 0.000000 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 NaN NaN
AAACAGCTTTCAGAAG-1-1-0-0-0 0.000000e+00 0.000000e+00 0.0 0.000000e+00 0.000000 0.000000e+00 0.443230 0.113118 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.000000 0.0 0.0 NaN NaN

5 rows × 33 columns

From these deconvolution results, we can look at cell type proportions per sample. For this we will need the spot ID and sample correspondence:

[5]:
spotSamples=mudata['visium'].obs.patient_region_id
spotSamples.reset_index().head()
[5]:
index patient_region_id
0 AAACAAGTATCTCCCA-1-0-0-0-0-0-0-0-0-0-0-0-0-0-0... control_P17
1 AAACACCAATAACTGC-1-0-0-0-0-0-0-0-0-0-0-0-0-0-0... control_P17
2 AAACAGCTTTCAGAAG-1-0-0-0-0-0-0-0-0-0-0-0-0-0-0... control_P17
3 AAACAGGGTCTATATT-1-0-0-0-0-0-0-0-0-0-0-0-0-0-0... control_P17
4 AAACCGGGTAGGTACC-1-0-0-0-0-0-0-0-0-0-0-0-0-0-0... control_P17

A way to check the deconvolution proportions is using a clustermap

[6]:
CTprops_sample=CTprops.copy()
CTprops_sample['sample']=mudata['visium'].obs.patient_region_id
sns.clustermap(CTprops_sample.groupby('sample').sum().T/CTprops_sample.groupby('sample').sum().sum(axis=1) ,
               cmap='Blues', method='ward')
[6]:
<seaborn.matrix.ClusterGrid at 0x15536a4c0590>
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_14_1.png

Alternativelly, we can check the deconvolution proportions using barplots

[7]:
t1=pd.DataFrame(CTprops[spotSamples=='control_P7'].sum(), columns=['control_P7'])
t2=pd.DataFrame(CTprops[spotSamples=='IZ_P15'].sum(), columns=['IZ_P15'])
[8]:
sns.set(font_scale=1)
sns.set_style(rc = {'axes.facecolor': 'white'})

fig, axes = plt.subplots(1, 2, figsize=(20, 7))

sns.barplot(ax=axes[0], y=t1.sort_values('control_P7', ascending=False).index, x='control_P7',
            data=t1.sort_values('control_P7', ascending=False), color='darkblue')
axes[0].set_title('control_P7')

sns.barplot(ax=axes[1], y=t2.sort_values('IZ_P15', ascending=False).index, x='IZ_P15',
            data=t2.sort_values('IZ_P15', ascending=False), color='darkred')
axes[1].set_title('IZ_P15')
[8]:
Text(0.5, 1.0, 'IZ_P15')
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_17_1.png

We can visualize cell type deconvolution results in slices (spots are colored by the the cell type with highest proportion). For this we will need the spatial coordinates of the spots in the Visium slices need to be in the slot uns[‘spatial’] of the Visium anndata object:

[9]:
mudata['visium'].uns['spatial']=mudata['visium'].obsm['X_spatial']
[10]:
idPat = 'GT_IZ_P9'
nichesphere.coloc.spatialCTPlot(adata=mudata['visium'][mudata['visium'].obs.patient_region_id==idPat].copy(),
                                CTprobs=CTprops.loc[spotSamples.index[spotSamples==idPat]],
                                cell_types=mudata['sc'].obs.cell_subtype2, spot_size=0.015,
                                legend_fontsize=7)
WARNING: saving figure to file figures/showtest.pdf
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_20_1.png
[ ]:

3. Co-localization

We computed then co-localization probabilities from the cell type probability matrices. Here we got concatenated co-localization sample matrices of cell type x cell type.

Then we reshaped the co-localization data into a matrix of cell type pairs x samples.

(you can have a closer look at these steps in the preprocessing tutorial)

[11]:
colocPerSample=pd.read_csv('colocPerSample.csv', index_col=0)
colocPerSample.head()
[11]:
Fib1_SCARA5-Fib1_SCARA5 Fib1_SCARA5-damaged_CM Fib1_SCARA5-Capillary_Endo Fib1_SCARA5-LYVE_FOLR_Macrophages Fib1_SCARA5-Fib3_C7 Fib1_SCARA5-healthy_CM Fib1_SCARA5-Fib2_Myofib Fib1_SCARA5-Endocardial_Endo Fib1_SCARA5-Arterial_Endo Fib1_SCARA5-Neuronal ... NK_T-CCL18_Macrophages NK_T-perivascular_fibroblasts NK_T-CD_4 NK_T-vSMCs_2 NK_T-Lymphatic_Endo NK_T-NK NK_T-CD_8 NK_T-Purkinje_fibers NK_T-Adipo NK_T-NK_T
control_P17 0.017603 0.000308 0.000992 0.000251 0.007062 0.002586 0.004724 0.000943 0.000412 0.000351 ... 2.290066e-15 0.0 3.915381e-05 0.0 0.0 4.538656e-08 4.556003e-08 0.0 0.000000e+00 0.000268
RZ_P9 0.009307 0.000429 0.000738 0.000003 0.005204 0.001439 0.001625 0.000065 0.000168 0.000046 ... 0.000000e+00 0.0 4.640548e-05 0.0 0.0 9.954633e-05 1.643486e-05 0.0 0.000000e+00 0.000784
IZ_P15 0.030351 0.000000 0.000027 0.000186 0.001200 0.000000 0.003112 0.000072 0.000062 0.000000 ... 0.000000e+00 0.0 0.000000e+00 0.0 0.0 0.000000e+00 0.000000e+00 0.0 0.000000e+00 0.000000
RZ_P6 0.040470 0.000441 0.002752 0.000361 0.008687 0.002928 0.007878 0.000176 0.001022 0.001170 ... 0.000000e+00 0.0 7.998369e-25 0.0 0.0 8.593925e-28 0.000000e+00 0.0 0.000000e+00 0.000438
RZ_BZ_P3 0.021508 0.000292 0.000567 0.000057 0.002408 0.000483 0.006635 0.000123 0.000052 0.000052 ... 0.000000e+00 0.0 0.000000e+00 0.0 0.0 8.585311e-06 0.000000e+00 0.0 7.294563e-35 0.000897

5 rows × 1089 columns

The sum of the probabilities of every cell type pair in a sample must be = 1

[12]:
colocPerSample.sum(axis=1)
[12]:
control_P17    1.0
RZ_P9          1.0
IZ_P15         1.0
RZ_P6          1.0
RZ_BZ_P3       1.0
FZ_P14         1.0
RZ_BZ_P12      1.0
FZ_GT_P4       1.0
GT_IZ_P13      1.0
GT_IZ_P15      1.0
FZ_P20         1.0
RZ_FZ_P5       1.0
GT_IZ_P9       1.0
RZ_P3          1.0
FZ_GT_P19      1.0
FZ_P18         1.0
IZ_P10         1.0
control_P7     1.0
RZ_P11         1.0
control_P1     1.0
RZ_BZ_P2       1.0
control_P8     1.0
RZ_GT_P2       1.0
dtype: float64

Same cell type interactions will be excluded later on, so we’ll have a list of same cell type interaction pairs in order to subset the co-localization table we’ll generate in the next step.

[13]:
cell_types=CTprops.columns
oneCTints=cell_types+'-'+cell_types

Conditions

We will need the following metadata to subset the samples in control (myogenic) and disease (ischemic):

[14]:
sampleTypesDF=pd.read_csv('MI_sampleTypesDF.csv')
sampleTypesDF.head()
[14]:
Unnamed: 0 sample sampleType
0 0 control_P17 myogenic
1 1 RZ_P9 remote
2 2 IZ_P15 ischemic
3 3 RZ_P6 remote
4 4 RZ_BZ_P3 border

4. Differential co-localization analysis

We will test differential co-localization between myogenic and ischemic samples using Wilcoxon rank sums tests:

Null Hypothesis (H0): The median of the population of differences between co-localization probabilities of cell types a and b in myogenic and ischemic samples is zero.

Alternative Hypothesis (H1): The median of the population of differences between co-localization probabilities of cell types a and b in myogenic and ischemic samples is not equal to zero.

[15]:
myo_iscDF=nichesphere.coloc.diffColoc_test(coloc_pair_sample=colocPerSample,
                                           sampleTypes=sampleTypesDF,
                                           exp_condition='ischemic',
                                           ctrl_condition='myogenic')
myo_iscDF.head()
[15]:
pairs statistic p-value
pairs
Fib1_SCARA5-Fib1_SCARA5 Fib1_SCARA5-Fib1_SCARA5 0.489898 0.624206
Fib1_SCARA5-damaged_CM Fib1_SCARA5-damaged_CM -2.44949 0.014306
Fib1_SCARA5-Capillary_Endo Fib1_SCARA5-Capillary_Endo -2.204541 0.027486
Fib1_SCARA5-LYVE_FOLR_Macrophages Fib1_SCARA5-LYVE_FOLR_Macrophages -0.489898 0.624206
Fib1_SCARA5-Fib3_C7 Fib1_SCARA5-Fib3_C7 -2.44949 0.014306

Then we will reshape the data to visualize the Wilcoxon test scores in a heatmap and filter non significant co-localization differences using the parameter p (in this case, scores with p-values > 0.1 are filtered out)

[16]:
myo_isc_HM=nichesphere.tl.pval_filtered_HMdf(testDF=myo_iscDF,
                                             oneCTinteractions=oneCTints,
                                             p=0.1,                             #threshold p-value to filter
                                             cell_types=cell_types)
myo_isc_HM.head()
[16]:
Fib1_SCARA5 damaged_CM Capillary_Endo LYVE_FOLR_Macrophages Fib3_C7 healthy_CM Fib2_Myofib Endocardial_Endo Arterial_Endo Neuronal ... CCL18_Macrophages perivascular_fibroblasts CD_4 vSMCs_2 Lymphatic_Endo NK CD_8 Purkinje_fibers Adipo NK_T
Fib1_SCARA5 0.000000 -2.449490 -2.204541 0.000000 -2.449490 -2.449490 -1.959592 0.000000 0.000000 -2.449490 ... 0.000000 0.0 0.00000 0.000000 0.0 0.0 0.000000 0 0 0.000000
damaged_CM -2.449490 0.000000 -2.204541 -1.959592 -2.449490 -2.449490 -2.204541 -2.449490 -2.204541 -2.449490 ... 0.000000 0.0 -2.44949 -1.959592 0.0 0.0 -2.204541 0 0 -1.837117
Capillary_Endo -2.204541 -2.204541 0.000000 0.000000 -2.204541 -2.449490 0.000000 -2.204541 -2.449490 -2.449490 ... 0.000000 0.0 -2.44949 0.000000 0.0 0.0 -2.449490 0 0 0.000000
LYVE_FOLR_Macrophages 0.000000 -1.959592 0.000000 0.000000 0.000000 -1.959592 0.000000 0.000000 0.000000 -1.837117 ... 0.000000 0.0 0.00000 0.000000 0.0 0.0 0.000000 0 0 0.000000
Fib3_C7 -2.449490 -2.449490 -2.204541 0.000000 0.000000 -2.449490 0.000000 0.000000 -1.714643 -2.449490 ... 1.714643 0.0 0.00000 0.000000 0.0 0.0 0.000000 0 0 0.000000

5 rows × 33 columns

As the cells classified as proliferating cells (prolif) are many different cell types and thus hard to interpret, we’ll remove them for further analysis.

[17]:
myo_isc_HM=myo_isc_HM.loc[myo_isc_HM.columns.str.contains('prolif')==False,myo_isc_HM.index.str.contains('prolif')==False]

We will also remove cells with no significant co-localization differences

[18]:
myo_isc_HM=myo_isc_HM.loc[myo_isc_HM.sum()!=0,myo_isc_HM.sum()!=0]

Now we can plot the differential co-localization scores heatmap

[19]:
sns.set(font_scale=1)
plot=sns.clustermap(myo_isc_HM, cmap='vlag', center=0, method='ward', cbar_kws={'label': 'diffColoc. Score'})
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_40_0.png

Differential co-localization network

To build the differential co-localization network, we will get an adjacency matrix (adj) based on the euclidean distances among the distributions of significant differential co-localization scores for the different cell types

[20]:
HMdist=pd.DataFrame(scipy.spatial.distance.squareform(scipy.spatial.distance.pdist(myo_isc_HM)),
                    columns=myo_isc_HM.columns, index=myo_isc_HM.index)

HMsimm=1-HMdist/HMdist.max().max()
##Cell pairs with not significant differential co-localization get 0
HMsimm[myo_isc_HM==0]=0

A cell group dictionary should be used here to visualize different cell groups in different colors. As we don’t have cell groups yet, we’ll have a dictionary of all cells in one group and a list of one color

[21]:
niches_dict={'1_': list(myo_isc_HM.index)}
clist=['#4daf4a']

Now we can plot the differential co-localization network using the colocNW function from NicheSphere. This function has many parameters that can be tuned:

nodeSize for example, defines how the size of the nodes will be calculated. Options are ‘betweeness’, ‘pagerank’ (both network statistics) and None (all nodes have the same size). alpha indicates the transparency of the edges and in goes from 0 (completely transparent) to 1 (opaque) fsize is the size of the figure (x,y)

This function returns the network with the edge weights corresponding to the diff. coloc. scores (positive and negative)

[22]:
plt.rcParams['axes.facecolor'] = "None"
nichesphere.coloc.colocNW(x_diff=myo_isc_HM,            #differential co-localization matrix
                          adj=HMsimm,                   #adjacency matrix
                          cell_group=niches_dict,
                          clist=clist,
                          nodeSize='betweeness',
                          lab_spacing=9,                #space between node and label
                          alpha=0.4,                    #edges transparency
                          fsize=(12,12))                #figure size
[22]:
<networkx.classes.graph.Graph at 0x15536a1aa190>
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_46_1.png

Now we’ll do community detection using Louvain. First we will get the network from the adjacency matrix as we won’t use the signed weights for this

[23]:
gCol_unsigned=nx.from_pandas_adjacency(HMsimm, create_using=nx.Graph)

We will use the community-layout library function CommunityLayout to show the communities in a layout suited for this. This function is compatible with networkx (Hagberg et. al., 2008) community detection functions, which will be used internally as indicated by the parameters community_algorithm and community_kwargs

[24]:
## Calculate community layout
cl=CommunityLayout(gCol_unsigned,
        community_compression = 0.4,
        layout_algorithm = nx.spring_layout,
        layout_kwargs = {"k":75, "iterations":1000},
        community_algorithm = nx.algorithms.community.louvain_communities,
        community_kwargs = {"resolution":1.1,  'seed':12, 'weight':'weight'})
Building meta-graph
Metagraph is a Graph with 4 nodes and 6 edges
100%|██████████| 4/4 [00:00<00:00, 250.61it/s]

We can extract the communities (niches) as follows:

[25]:
d = {index: list(value) for index, value in enumerate(cl.communities())}
print(pd.DataFrame.from_dict(d, orient='index').T.to_string(index=False))
           0                     1             2                        3
 Fib1_SCARA5              Neuronal    healthy_CM perivascular_fibroblasts
  Pericyte_2       intermediate_CM Arterial_Endo        CCL18_Macrophages
       vCM_4                  CD_8       vSMCs_1           Lymphatic_Endo
  Pericyte_1                  NK_T            NK    LYVE_PLTP_Macrophages
Fib4_COL15A1        Capillary_Endo       vSMCs_2              Fib2_Myofib
     Fib3_C7      Endocardial_Endo     Monocytes         SPP1_Macrophages
        None                 vCM_3          None              Venous_Endo
        None LYVE_FOLR_Macrophages          None                     None
        None            damaged_CM          None                     None
        None                  Mast          None                     None
        None                  CD_4          None                     None

And then name them

[26]:
niche_names=['1_Stromal', '2_Stressed_CM', '3_Healthy_CM', '4_Fibrotic']
niches_dict=dict(zip(niche_names,list(d.values())))
print(pd.DataFrame.from_dict(niches_dict, orient='index').T.to_string(index=False))
   1_Stromal         2_Stressed_CM  3_Healthy_CM               4_Fibrotic
 Fib1_SCARA5              Neuronal    healthy_CM perivascular_fibroblasts
  Pericyte_2       intermediate_CM Arterial_Endo        CCL18_Macrophages
       vCM_4                  CD_8       vSMCs_1           Lymphatic_Endo
  Pericyte_1                  NK_T            NK    LYVE_PLTP_Macrophages
Fib4_COL15A1        Capillary_Endo       vSMCs_2              Fib2_Myofib
     Fib3_C7      Endocardial_Endo     Monocytes         SPP1_Macrophages
        None                 vCM_3          None              Venous_Endo
        None LYVE_FOLR_Macrophages          None                     None
        None            damaged_CM          None                     None
        None                  Mast          None                     None
        None                  CD_4          None                     None

And assign them colors to color the network nodes according to their niche

[27]:
clist=['#4daf4a', '#0072B5', '#BC3C29', '#ffff33']
niche_cols=pd.Series(clist, index=list(niches_dict.keys()))
niches_df=nichesphere.tl.cells_niche_colors(CTs=CTprops.columns,
                                            niche_colors=niche_cols,
                                            niche_dict=niches_dict)
niches_df.head()
[27]:
cell niche color
cell
Fib1_SCARA5 Fib1_SCARA5 1_Stromal #4daf4a
damaged_CM damaged_CM 2_Stressed_CM #0072B5
Capillary_Endo Capillary_Endo 2_Stressed_CM #0072B5
LYVE_FOLR_Macrophages LYVE_FOLR_Macrophages 2_Stressed_CM #0072B5
Fib3_C7 Fib3_C7 1_Stromal #4daf4a

Then we can get the node positions to input them to the NicheSphere colocNW function through the pos parameter

[28]:
pos=cl.full_positions

And plot the niches on the community layout

[29]:
plt.rcParams['axes.facecolor'] = "None"

gCol=nichesphere.coloc.colocNW(x_diff=myo_isc_HM,
                               adj=HMsimm,
                               cell_group=niches_dict,
                               clist=clist,
                               nodeSize='betweeness',
                               layout=None,                         #layout needs to be set to None if we provide node positions
                               lab_spacing=0.05,
                               thr=1,
                               alpha=0.4,
                               fsize=(10,10),
                               pos=pos,                             #node positions (from the CommunityLayout function)
                               edge_scale=1,                        #edge width
                               legend_ax=[0.7, 0.05, 0.15, 0.2])    #legend position
#Legend
legend_elements1=[plt.Line2D([0], [0], marker="o" ,color='w', markerfacecolor=clist[i], lw=4,
                             label=list(niches_dict.keys())[i], ms=10) for i in range(len(list(niches_dict.keys())))]
plt.gca().add_artist(plt.legend(handles=legend_elements1,loc='lower left', fontsize=13, title='Niches',
                                alignment='left'))
#plt.savefig('diffColocNW_CD.pdf')
[29]:
<matplotlib.legend.Legend at 0x1553693d1e90>
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_60_1.png

We can also calculate some network statistics with the networkx package functions (this will be done on the signed network):

[30]:
t1=pd.DataFrame({'betweenness':[nx.betweenness_centrality(gCol)[x] for x in list(gCol.nodes)],
                 'degree':[nx.degree_centrality(gCol)[x] for x in list(gCol.nodes)],
                 'pagerank':[nx.pagerank(gCol, weight=None)[x] for x in list(gCol.nodes)]})
t1.index=list(gCol.nodes)
[31]:
fig, axes = plt.subplots(1, 3, figsize=(21, 7))
for i in range(len(t1.columns)):
    _ = sns.barplot(ax=axes[i], y=t1.sort_values(t1.columns[i], ascending=False).index[0:5], x=t1.columns[i],
        data=t1.sort_values(t1.columns[i], ascending=False)[0:5], color='purple')
    axes[i].set_title(t1.columns[i])
fig.tight_layout()
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_63_0.png

So we can look separately at positive and negative degree:

[32]:
## Positive edges stats
G_pos=gCol.copy()
to_remove=[(a,b) for a, b, attrs in G_pos.edges(data=True) if attrs["weight"] <= 0]
G_pos.remove_edges_from(to_remove)

t1=pd.DataFrame({'degree':[nx.degree_centrality(G_pos)[x] for x in list(G_pos.nodes)]})
t1.index=list(G_pos.nodes)
[33]:
## Negative edges stats
G_neg=gCol.copy()
to_remove=[(a,b) for a, b, attrs in G_neg.edges(data=True) if attrs["weight"] >= 0]
G_neg.remove_edges_from(to_remove)

t2=pd.DataFrame({'degree':[nx.degree_centrality(G_neg)[x] for x in list(G_neg.nodes)]})
t2.index=list(G_neg.nodes)
[34]:
fig, axes = plt.subplots(1, 2, figsize=(14, 7))

_=sns.barplot(ax=axes[0], y=t1.sort_values('degree', ascending=False).index[0:5], x='degree',
              data=t1.sort_values('degree', ascending=False)[0:5], color='red')
axes[0].set_title('degree_positive')

_=sns.barplot(ax=axes[1], y=t2.sort_values('degree', ascending=False).index[0:5], x='degree',
              data=t2.sort_values('degree', ascending=False)[0:5], color='blue')
axes[1].set_title('degree_negative')

fig.tight_layout()
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_67_0.png

And also visualize niches in slices (spots are colored by the niche to which the cell type with highest proportion belongs)

Niche plots

Now we will visualize the niches in the slices coloring the Visium spots according to the niche of the cell type with the highest proportion.

These are a couple myogenic slices, which will be at the top panels of the next figure:

[35]:
fig, axes = plt.subplots(2, 3, figsize=(21, 7))
plt.close(fig)
for idu,smpl in enumerate(list(sampleTypesDF['sample'][sampleTypesDF['sampleType']=='myogenic'][0:3])):
    _ = nichesphere.coloc.spatialNichePlot(adata=mudata['visium'][mudata['visium'].obs.patient_region_id==smpl].copy(),
                                       CTprobs=CTprops.loc[spotSamples.index[spotSamples==smpl]],    #dataframe of cell type probabilities per spot
                                       niche_dict=niches_dict,
                                       spot_size=0.015,
                                       niche_colors=niche_cols,                                      #series of colors with niche names as indexes
                                       legend_fontsize=7, save_name='_'+smpl+'.pdf',ax=axes[0][idu])
WARNING: saving figure to file figures/show_control_P17.pdf
<Figure size 640x480 with 0 Axes>
WARNING: saving figure to file figures/show_control_P7.pdf
<Figure size 640x480 with 0 Axes>
WARNING: saving figure to file figures/show_control_P1.pdf
<Figure size 640x480 with 0 Axes>

And a couple ischemic slices, which will be at the bottom panels of the next figure:

[36]:
for idu,smpl in enumerate(list(sampleTypesDF['sample'][sampleTypesDF['sampleType']=='ischemic'][0:3])):
    _ = nichesphere.coloc.spatialNichePlot(adata=mudata['visium'][mudata['visium'].obs.patient_region_id==smpl].copy(),
                                       CTprobs=CTprops.loc[spotSamples.index[spotSamples==smpl]],
                                       niche_dict=niches_dict,
                                       spot_size=0.015,
                                       niche_colors=niche_cols,
                                       legend_fontsize=7,
                                       save_name='_'+smpl+'.pdf',ax=axes[1][idu])
WARNING: saving figure to file figures/show_IZ_P15.pdf
<Figure size 640x480 with 0 Axes>
WARNING: saving figure to file figures/show_GT_IZ_P13.pdf
<Figure size 640x480 with 0 Axes>
WARNING: saving figure to file figures/show_GT_IZ_P15.pdf
<Figure size 640x480 with 0 Axes>
[37]:
fig.tight_layout()
fig
[37]:
../_images/notebooks_Nichesphere_tutorial_MIvisium_coloc_75_0.png

For further analysis, like differential communication: https://nichesphere.readthedocs.io/en/latest/tutorials.html

, we will need the correspondence data between cell pairs and niche pairs

[38]:
pairCatDFdir=nichesphere.tl.get_pairCatDFdir(niches_df)
pairCatDFdir.to_csv('pairCatDFdir_MIvisium_louvain.csv')
pairCatDFdir.head()
[38]:
cell_pairs niche_pairs
0 Fib1_SCARA5->Fib1_SCARA5 1_Stromal->1_Stromal
1 Fib1_SCARA5->damaged_CM 1_Stromal->2_Stressed_CM
2 Fib1_SCARA5->Capillary_Endo 1_Stromal->2_Stressed_CM
3 Fib1_SCARA5->LYVE_FOLR_Macrophages 1_Stromal->2_Stressed_CM
4 Fib1_SCARA5->Fib3_C7 1_Stromal->1_Stromal

We will also need a filtering object (colocFilt) indicating which cell pairs are differentially co-localized to filter the communication data

[39]:
## Get data of cells present in the adjacency matrix
pairCatDF_filter=[(pairCatDFdir.cell_pairs.str.split('->')[i][0] in HMsimm.index)&
                  (pairCatDFdir.cell_pairs.str.split('->')[i][1] in HMsimm.index) for i in pairCatDFdir.index]
pairCatDFdir_filt=pairCatDFdir[pairCatDF_filter]
oneCTints_filt=oneCTints[[i.split('-')[0] in HMsimm.index for i in oneCTints]]
[40]:
## Get data to flag differentially co-localized cell pairs in the adjacency matrix
colocFilt=nichesphere.tl.getColocFilter(pairCatDF=pairCatDFdir_filt,
                                        adj=HMsimm,
                                        oneCTints=oneCTints_filt.str.replace('-', '->'))
colocFilt.to_csv('colocFilt_MIvisium_louvain.csv')
colocFilt.head()
[40]:
filter
cell_pairs
Fib1_SCARA5->Fib1_SCARA5 1.0
Fib1_SCARA5->damaged_CM 1.0
Fib1_SCARA5->Capillary_Endo 1.0
Fib1_SCARA5->LYVE_FOLR_Macrophages 0.0
Fib1_SCARA5->Fib3_C7 1.0

We will need the niche - cell type - color correspondence data, the co-localization network and nodes positions for further analysis as well

[41]:
niches_df.to_csv('niches_df_MIvisium_louvain.csv')
nx.write_graphml_lxml(gCol, "colocNW_MIvisium_louvain.graphml")
np.save('colocNW_pos.npy', pos)
[ ]: