NicheSphere differential communication tutorial

NicheSphere also offers the possibility to look at localized process based differential cell - cell communication based on Ligand-Receptor pairs expression data, such as results from LIANA and CrossTalkeR (Nagai et. al., 2021). This improves the interpretability of cell communication analysis as we will find changes in biological processes related to the changes in physical cell interactions. For this tutorial, we will use our self compiled database which classifies ligands according to extracellular matrix (ECM) or immune cell recruitment related processes: https://github.com/CostaLab/NicheSphere/tree/main/db

1. Libraries and functions

[1]:
import pandas as pd
import numpy as np
import scipy
import seaborn as sns
import glob
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import networkx as nx
import warnings
import scanpy as sc
#import leidenalg
#import sklearn
#import igraph as ig
import mudata as md
from community_layout.layout_class import CommunityLayout
warnings.filterwarnings("ignore")

import nichesphere
COMMUNITY LAYOUT: Datashader not found, edge bundling not available

2. Differential co-localization niches data

One of the advantages of NicheSphere is that it allows process based localized differential communication testing by looking at changes in LR interactions involved in specific processes between niches. To do this, we will need the correspondence data between cell type pairs and niche pairs, which we got from the differential co-localization analysis. You can check the differential co-localization tutorial for more details: https://nichesphere.readthedocs.io/en/latest/tutorials.html

[2]:
pairCatDFdir=pd.read_csv('pairCatDFdir_MIvisium_louvain.csv', index_col=0)
pairCatDFdir.head()
[2]:
cell_pairs niche_pairs
0 Fib1_SCARA5->Fib1_SCARA5 1_Stromal->1_Stromal
1 Fib1_SCARA5->damaged_CM 1_Stromal->2_Stressed_CM
2 Fib1_SCARA5->Capillary_Endo 1_Stromal->2_Stressed_CM
3 Fib1_SCARA5->LYVE_FOLR_Macrophages 1_Stromal->2_Stressed_CM
4 Fib1_SCARA5->Fib3_C7 1_Stromal->1_Stromal

As well as cell types and niches

[3]:
niches_df=pd.read_csv('niches_df_MIvisium_louvain.csv', index_col=0)
niches_df.head()
[3]:
cell.1 niche color
cell
Fib1_SCARA5 Fib1_SCARA5 1_Stromal #4daf4a
damaged_CM damaged_CM 2_Stressed_CM #0072B5
Capillary_Endo Capillary_Endo 2_Stressed_CM #0072B5
LYVE_FOLR_Macrophages LYVE_FOLR_Macrophages 2_Stressed_CM #0072B5
Fib3_C7 Fib3_C7 1_Stromal #4daf4a
[4]:
niches_dict={'1_Stromal': ['Fib3_C7',
  'vCM_4',
  'Fib1_SCARA5',
  'Fib4_COL15A1',
  'Pericyte_2',
  'Pericyte_1'],
 '2_stressedCM': ['intermediate_CM',
  'damaged_CM',
  'vCM_3',
  'Mast',
  'CD_8',
  'NK_T',
  'Capillary_Endo',
  'LYVE_FOLR_Macrophages',
  'Endocardial_Endo',
  'Neuronal',
  'CD_4'],
 '3_CMimmuno': ['healthy_CM',
  'Arterial_Endo',
  'vSMCs_1',
  'Monocytes',
  'NK',
  'vSMCs_2'],
 '4_Fibrotic': ['LYVE_PLTP_Macrophages',
  'SPP1_Macrophages',
  'Venous_Endo',
  'CCL18_Macrophages',
  'Fib2_Myofib',
  'Lymphatic_Endo',
  'perivascular_fibroblasts']}

And the differential co-localization network with nodes positions, as we will look at LR interactions between differentially co-localized cell types

[5]:
gCol=nx.read_graphml("colocNW_MIvisium_louvain.graphml")
pos=np.load('colocNW_pos.npy',allow_pickle='TRUE').item()

Same cell type interactions will be excluded later on, so we’ll have a list of same cell type interaction pairs in order to subset the co-localization table we’ll generate in the next step.

[6]:
oneCTints=niches_df.index+'-'+niches_df.index

We will filter cell type pairs with non significant differential co-localization scores. Cell pairs with filter=1 will be tested

[7]:
colocFilt=pd.read_csv('colocFilt_MIvisium_louvain.csv', index_col=0)
colocFilt
[7]:
filter
cell_pairs
Fib1_SCARA5->Fib1_SCARA5 1.0
Fib1_SCARA5->damaged_CM 1.0
Fib1_SCARA5->Capillary_Endo 1.0
Fib1_SCARA5->LYVE_FOLR_Macrophages 0.0
Fib1_SCARA5->Fib3_C7 1.0
... ...
NK_T->vSMCs_2 0.0
NK_T->Lymphatic_Endo 0.0
NK_T->NK 0.0
NK_T->CD_8 0.0
NK_T->NK_T 1.0

900 rows × 1 columns

3. Database

We are interested in cell communication related to fibrosis, so we will use a database which classifies ligands according to extracellular matrix (ECM) or immune cell recruitment related processes to look for disease related processes :

[8]:
allDBs=pd.read_csv('nichesphereDB_pmid.csv', index_col=0)
allDBs.head()
[8]:
Ligand category db PMID
2 col1a1 Collagens matrisome 36399478
3 col1a2 Collagens matrisome 36399478
4 col1a1 Collagens matrisome 36399478
5 col6a3 Collagens matrisome 36399478
6 col1a2 Collagens matrisome 36399478
[9]:
sns.barplot(x=allDBs.category.value_counts(), y=allDBs.category.value_counts().index,
            data=pd.DataFrame(allDBs.category.value_counts()))
[9]:
<Axes: xlabel='count', ylabel='category'>
../_images/notebooks_Nichesphere_tutorial_MIvisium_comm_18_1.png
[ ]:

NicheSphere is suited to work with CrossTalkeR output tables which contain columns called cellpair, indicating the cell types involved in an interaction separated by ‘@’; and allpair, indicating cell types, ligand and receptor involved. Additionally, input tables for NicheSphere should contain columns indicating the ligand (gene_A), receptor (gene_B) and communication score (MeanLR). The names for these last 3 columns can be indicated in NicheSphere functions.

CrossTalkeR tables contain strings indicating wether a gene is a ligand (|L), receptor (|R) or transcription factor (|TF). These strings will be removed as they might cause conflicts with code. Then we’ll separate communication scores by process to know which cell type pairs and niche pairs are interacting through which processes in each condition.

Some interactions might be present only in one condition, so we will assign 0 values to the interactions that do not appear in one condition to be able to compare conditions. After this, we get the following tables

[10]:
myoCTpairScores_byCat2_dir_colocClusts=pd.read_csv('processed_myo_ctkr_louvainColoc.csv', index_col=0)
iscCTpairScores_byCat2_dir_colocClusts=pd.read_csv('processed_isc_ctkr_louvainColoc.csv', index_col=0)

myoCTpairScores_byCat2_dir_colocClusts.head()
[10]:
cell_pairs niche_pairs LRscores LRcat condition
allpair
Adipo/COL14A1@Arterial_Endo/CD44 Adipo->Arterial_Endo 1_Stromal->3_Healthy_CM 0.717175 Collagens myogenic
Adipo/COL14A1@CCL18_Macrophages/CD44 Adipo->CCL18_Macrophages 1_Stromal->4_Fibrotic 0.967985 Collagens myogenic
Adipo/COL14A1@CD_4/CD44 Adipo->CD_4 1_Stromal->2_Stressed_CM 0.967985 Collagens myogenic
Adipo/COL14A1@CD_8/CD44 Adipo->CD_8 1_Stromal->2_Stressed_CM 0.967985 Collagens myogenic
Adipo/COL14A1@Fib1_SCARA5/CD44 Adipo->Fib1_SCARA5 1_Stromal->1_Stromal 0.967985 Collagens myogenic

Names of cell types need to match between communication and co-localization data

4. Localized process based differential cell - cell communication

Next, we will filter by differential co-localization using the colocFilt filtering object we created before:

[11]:
myoCTpairScores_byCat2_dir_colocClusts=myoCTpairScores_byCat2_dir_colocClusts[[i in colocFilt.index[colocFilt['filter']==1] for i in myoCTpairScores_byCat2_dir_colocClusts.cell_pairs]]
iscCTpairScores_byCat2_dir_colocClusts=iscCTpairScores_byCat2_dir_colocClusts[[i in colocFilt.index[colocFilt['filter']==1] for i in iscCTpairScores_byCat2_dir_colocClusts.cell_pairs]]
[ ]:

Then we can test for differential cell communication between niche pairs or cell pairs across processes with the function diffCcommStats. We will first test per niche pair setting the parameter cellCatCol to ‘niche_pairs’, which is the column in the tables of communication scores by process containing niche pairs:

[12]:
myoIsc_diffCcommTable_colocGroups_dir=nichesphere.comm.diffCcommStats(c1CTpairScores_byCat=iscCTpairScores_byCat2_dir_colocClusts,
                                                                      c2CTpairScores_byCat=myoCTpairScores_byCat2_dir_colocClusts,
                                                                      cellCatCol='niche_pairs')
myoIsc_diffCcommTable_colocGroups_dir.head()
[12]:
wilcoxStat wilcoxPval cellCat LRcat
0 11.596600 4.287770e-31 3_Healthy_CM->2_Stressed_CM Collagens
1 7.014138 2.313703e-12 3_Healthy_CM->1_Stromal Collagens
2 14.893355 3.640346e-50 3_Healthy_CM->3_Healthy_CM Collagens
3 23.624021 2.183538e-123 4_Fibrotic->4_Fibrotic Collagens
4 4.024180 5.717429e-05 4_Fibrotic->1_Stromal Collagens

And reshape the data to plot the differential communication scores in a heatmap. Just as for the co-localization analysis, Wilcoxon rank sums tests are employed here. The diffCcommStats function returns then the table of scores per niche pair per process and the plot itself.

[13]:
hm_df, cm=nichesphere.comm.plotDiffCcommStatsHM(diffCommTable=myoIsc_diffCcommTable_colocGroups_dir,
                                                min_pval=0.05,
                                                vmin=-20, vmax=20)
../_images/notebooks_Nichesphere_tutorial_MIvisium_comm_31_0.png

Localized differential communication networks

To build process specific differential cell communication networks, we will look at differential cell communication per cell type pair per process, so we will do the differential communication test again with the diffCcommStats function, this time setting cellCatCol to ‘cell_pairs’, which is the column in the tables of communication scores by process containing cell pairs.

[14]:
myoIsc_diffCcommTable2_CTpair_dir=nichesphere.comm.diffCcommStats(c1CTpairScores_byCat=iscCTpairScores_byCat2_dir_colocClusts,
                                                                  c2CTpairScores_byCat=myoCTpairScores_byCat2_dir_colocClusts,
                                                                  cellCatCol='cell_pairs')

We must reshape the data as we will need this for the communication networks, and get the heatmap of differential communication scores per cell pair.

[15]:
x_myoIsc_dir, plothm=nichesphere.comm.plotDiffCcommStatsHM(diffCommTable=myoIsc_diffCcommTable2_CTpair_dir,
                                                           min_pval=0.05,
                                                           vmin=-5.5, vmax=5.5)
../_images/notebooks_Nichesphere_tutorial_MIvisium_comm_35_0.png

Then we will get the differential communication scores for a specific process

[16]:
x_chem=nichesphere.comm.getDiffComm(diffCommTbl=x_myoIsc_dir,
                                    pairCatDF=pairCatDFdir,
                                    ncells=33,
                                    cat='ECMglycoprots')
x_chem.head()
[16]:
Fib1_SCARA5 damaged_CM Capillary_Endo LYVE_FOLR_Macrophages Fib3_C7 healthy_CM Fib2_Myofib Endocardial_Endo Arterial_Endo Neuronal ... CCL18_Macrophages perivascular_fibroblasts CD_4 vSMCs_2 Lymphatic_Endo NK CD_8 Purkinje_fibers Adipo NK_T
Fib1_SCARA5 0.000000 2.845702 2.685997 0.000000 2.291307 2.280662 2.723508 0.000000 0.000000 -7.984900 ... 0.00000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.0 0.000000
damaged_CM 3.604778 0.000000 4.424010 4.141256 4.066038 3.376389 4.209408 4.020863 4.408857 -6.539320 ... 0.00000 0.0 4.756736 3.377464 0.0 0.0 4.701915 0.0 0.0 5.139999
Capillary_Endo 3.142450 3.364136 0.000000 0.000000 3.357803 3.326268 0.000000 2.268819 3.101101 -7.089490 ... 0.00000 0.0 3.930870 0.000000 0.0 0.0 3.887976 0.0 0.0 0.000000
LYVE_FOLR_Macrophages 0.000000 3.584589 0.000000 0.000000 0.000000 3.055622 0.000000 0.000000 0.000000 -6.539320 ... 0.00000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.0 0.000000
Fib3_C7 2.419413 3.026513 2.898491 0.000000 0.000000 2.454029 0.000000 0.000000 2.567224 -7.890426 ... 3.59563 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.0 0.0 0.000000

5 rows × 33 columns

[17]:
#Niches colors
clist=['#4daf4a', '#0072B5', '#BC3C29', '#ffff33']

And now we can plot differential cell communication scores per process on the edges of the co-localization network using the catNW function from NicheSphere, whose parameters are quite similar to the ones of the colocMW function

[ ]:

[18]:
plt.rcParams['axes.facecolor'] = "None"
plt.figure(frameon=False)

G=nichesphere.comm.catNW(x_chem=x_chem,
                         colocNW=gCol,
                         cell_group=niches_dict,
                         plot_title='ECMglycoprots',
                         clist=clist,
                         nodeSize='betweeness',
                         layout=None,
                         thr=5.5,                       #only edges with weights higher than this value are shown
                         fsize=(10,10),
                         alpha=0.4,
                         edge_scale=0.5,
                         pos=pos,
                         lab_spacing=0.05)
#Legend
legend_elements1=[plt.Line2D([0], [0], marker="o" ,color='w', markerfacecolor=clist[i], lw=4,
                             label=list(niches_dict.keys())[i], ms=10) for i in range(len(list(niches_dict.keys())))]
plt.gca().add_artist(plt.legend(handles=legend_elements1,loc='upper left', fontsize=13, title='Niches',
                                alignment='left'))
#plt.savefig('ECMglycoprots_diffCommNW.pdf')
[18]:
<matplotlib.legend.Legend at 0x15550a9dbd00>
<Figure size 640x480 with 0 Axes>
../_images/notebooks_Nichesphere_tutorial_MIvisium_comm_41_2.png

As we did before, we will calculate some network statistics with the networkx package functions on the signed, directed differential communication network

[19]:
t1=pd.DataFrame({'betweenness':[nx.betweenness_centrality(G)[x] for x in list(G.nodes)],
                 #'degree':[nx.degree_centrality(G)[x] for x in list(G.nodes)],
                 'in_degree':[nx.in_degree_centrality(G)[x] for x in list(G.nodes)],
                 'out_degree':[nx.out_degree_centrality(G)[x] for x in list(G.nodes)],
                 'pagerank':[nx.pagerank(G, weight=None)[x] for x in list(G.nodes)]})
t1.index=list(G.nodes)
[20]:
fig, axes = plt.subplots(1, 4, figsize=(28, 7))
for i in range(len(t1.columns)):
    _ = sns.barplot(ax=axes[i], y=t1.sort_values(t1.columns[i], ascending=False).index[0:5], x=t1.columns[i],
        data=t1.sort_values(t1.columns[i], ascending=False)[0:5], color='purple')
    axes[i].set_title(t1.columns[i])
fig.tight_layout()
../_images/notebooks_Nichesphere_tutorial_MIvisium_comm_44_0.png

And look separately at positive and negative in and out - degree of the nodes:

[21]:
G_pos=G.copy()
to_remove=[(a,b) for a, b, attrs in G_pos.edges(data=True) if attrs["weight"] <= 0]
G_pos.remove_edges_from(to_remove)
t1=pd.DataFrame({'in_degree':[nx.in_degree_centrality(G_pos)[x] for x in list(G_pos.nodes)],
                 'out_degree':[nx.out_degree_centrality(G_pos)[x] for x in list(G_pos.nodes)]})
t1.index=list(G_pos.nodes)
[22]:
G_neg=G.copy()
to_remove=[(a,b) for a, b, attrs in G_neg.edges(data=True) if attrs["weight"] >= 0]
G_neg.remove_edges_from(to_remove)
t2=pd.DataFrame({'in_degree':[nx.in_degree_centrality(G_neg)[x] for x in list(G_neg.nodes)],
                 'out_degree':[nx.out_degree_centrality(G_neg)[x] for x in list(G_neg.nodes)]})
t2.index=list(G_neg.nodes)
[23]:
fig, axes = plt.subplots(2, 2, figsize=(14, 7))
sns.barplot(ax=axes[0][0], y=t1.sort_values('in_degree', ascending=False).index[0:5], x='in_degree',
            data=t1.sort_values('in_degree', ascending=False)[0:5], color='red')
axes[0][0].set_title('in_degree_positive')
sns.barplot(ax=axes[0][1], y=t1.sort_values('out_degree', ascending=False).index[0:5], x='out_degree',
            data=t1.sort_values('out_degree', ascending=False)[0:5], color='red')
axes[0][1].set_title('out_degree_positive')
sns.barplot(ax=axes[1][0], y=t2.sort_values('in_degree', ascending=False).index[0:5], x='in_degree',
            data=t2.sort_values('in_degree', ascending=False)[0:5], color='blue')
axes[1][0].set_title('in_degree_negative')
sns.barplot(ax=axes[1][1], y=t2.sort_values('out_degree', ascending=False).index[0:5], x='out_degree',
            data=t2.sort_values('out_degree', ascending=False)[0:5], color='blue')
axes[1][1].set_title('out_degree_negative')
fig.tight_layout()
../_images/notebooks_Nichesphere_tutorial_MIvisium_comm_48_0.png
[ ]: