This vignette guides you to prepare your Single-Cell data to use interactMIIC. This is a more personalized pipeline for more experienced bioinformaticians, another vignette more beginner-friendly is available here.

The idea of interactMIIC is to reconstruct causal graphs similar to Gene Regulatory Networks (GRNs) in your “senders” and “receivers” populations, and to link the found “intracellular mechanisms” with “intercellular interactions” in integration with Cell-Cell Communications (CCC) methods (refs).

Here a schematic network of what you can obtain with interactMIIC:

example of an interactMIIC output. We reconstruct the intracellular network in integration with the Ligands-Receptors edges found by the CCC methods

Make sure you can load the following packages :

knitr::opts_chunk$set(echo = T)

### Mandatory Libraries 
library(Seurat)
library(ggplot2)
library(tidyverse)
library(data.table)
library(dplyr)
library(igraph)

### For CCC selection
library(liana)

### For network aesthetic display
library(rjson)

#Download interactMIIC
# Sys.setenv(GITHUB_PAT = 'YOUR_GITHUB_API_TOKEN_HERE')
# devtools::install_github("interactMIIC/interactMIIC/miic_R_package", force = T) #latest version of MIIC

library(miic)

set.seed(1511)
#Set your working directory here
wd_path <- "~/multi-oMIICs/tutorial/pbmc"

# Create directories
dir.create(file.path(wd_path,"plots/"))
## Warning in dir.create(file.path(wd_path, "plots/")): 'C:\Users\Louise
## Dupuis\Documents\multi-oMIICs\tutorial\pbmc\plots' existe déjà
dir.create(file.path(wd_path,"MI_tables/"))
## Warning in dir.create(file.path(wd_path, "MI_tables/")): 'C:\Users\Louise
## Dupuis\Documents\multi-oMIICs\tutorial\pbmc\MI_tables' existe déjà
dir.create(file.path(wd_path,"MIIC_networks/"))
## Warning in dir.create(file.path(wd_path, "MIIC_networks/")): 'C:\Users\Louise
## Dupuis\Documents\multi-oMIICs\tutorial\pbmc\MIIC_networks' existe déjà

1 Initialization

First, load your Seurat object, define the senders and receivers populations, and the metadata column “interact_ident” in which you can find them.

If you have multiple subtypes of senders (or receivers), give the general population name of the senders (or receivers), and create a “subtype” metadata variable to put the more defined subpopulations names in.

Add any metadata of interest to the metadata named list. Levels should be given in an “increasing” order (this can be arbitrary and do not influence computation). Right now metadata must be categorical but this will be fixed soon to cater for continuous metadata.

### YOUR SEURAT OBJECT
load(file.path(wd_path,"preprocessed_pbmc3k.RData"))
SeuratObject <- pbmc3k

#Precise assay of interest
assay_name <- 'RNA'

#Define your senders, your receivers and the metadata where they can be found
senders <-  c('B') 
receivers <-  c('CD8 T') 
interact_ident <- "seurat_annotations" 
Idents(SeuratObject) <- interact_ident

# Specify the species genes format (mouse or human)
species <- "human"

# Specify the CCC method you want to use to select L-R edges (optional)
CCC_method <- "cellphonedb"

# If you want to remove ribomosomal genes
ribomosal_genes <- row.names(SeuratObject)[stringr::str_detect(row.names(SeuratObject), ifelse(species == "human", "RPL|RPS","Rpl|Rps") )]
counts <- GetAssayData(SeuratObject, assay = "RNA")
counts <- counts[-(which(rownames(counts) %in% ribomosal_genes)),]
SeuratObject <- subset(SeuratObject, features = rownames(counts))
rm(counts)

### YOUR BIOLOGICAL QUESTION

#genes and metadata of interest for the senders population
goi_senders <- c('HLA-DQA1') 
metadata_senders <- list()

#genes and metadata of interest for the receivers population
goi_receivers <- c('LAG3') 
metadata_receivers <- list()

# exemple:
# metadata <- list(
#   treatment = c("Control", "Treated"),
#   another_meta = c("Level1", "Level2", "Level3")
# )

2 Opt: Ligand-Receptors LIANA selection

Run this if you don’t already have a Cell-Cell communication analysis of your populations. This function will give you the top L-R links according to a method of choice, the default being cellphoneDB.

First we find relevant L-R edges between your interacting cells
#get the LIANA significant found interactions from senders to receivers
result <- interactmiic.CCClinks(SeuratObject, 
                                species = species, 
                                assay_name = assay_name,
                                interact_ident = interact_ident,
                                senders = senders, 
                                receivers = receivers,
                                CCC_method = CCC_method)

write.table(result, file = file.path(wd_path,paste0("pbmc_liana_", CCC_method, ".csv")), quote = F, sep = ",")

Extract from the CCC method output top interact edges to add the to interactMIIC network:

interact_edges <- data.frame(ligands = character(), receptors = character())

for (i in 1:nrow(result)) {
  oneligand <- result$ligand.complex[i]
  onereceptor <- result$receptor.complex[i]
  onereceptor <- unique(unlist(str_split(onereceptor, "_")))
  for (onerecp in onereceptor) {
    interact_edges[nrow(interact_edges) +1,] <- c(oneligand, onerecp)
  }
}

interact_edges <- interact_edges[!(duplicated(interact_edges)),]

#final genes lists
ligands <- unique(interact_edges$ligand)
receptors <- unique(interact_edges$receptors)

3 Opt: Genes feature selection

Run this if you don’t know which genes to put into your interactMIIC network.

Then we select the most informative genes based on these ligands/receptors and your features of itnerest (if you have a biological question in mind)

Explanation: an unsupervized way of generating feature importance in your dataset is to calculate the Mutual Information between all pairs of variables. Variables that share a lot of information with a lot of other variables are necessary to understand your dataset dynamics (these signals could be artefacts or true interactions, both needs to be discovered to analyze your dataset). Of course these mutual information could become 0 after conditioning on a separating set but this will be computed later when we construct the network with MIIC. Here it serves the purpose of a preliminary analysis of our dataset.

How to: As computing the pairwise mutual information of all the variables is too long (25000*25000), we select ~5 genes and ~5 metadata as features of interest and compute their shared information with all the others variables (10*25000)

#Find genes that share the most information with your biological question :
MI_sender_genes <- interactmiic.MIselection(seurat_object = SeuratObject,
                                 assay_name = "RNA",
                                 interact_ident = interact_ident,
                                 oneinteract = senders,
                                 goi = unique(c(goi_receivers,ligands[1:min(length(ligands),15)])),
                                 metadata_list = names(metadata_senders),
                                 save = T,
                                 output_dir = file.path(wd_path, "MI_tables"),
                                 plot = T)

MI_receivers_genes <- interactmiic.MIselection(seurat_object = SeuratObject,
                                 assay_name = "RNA",
                                 interact_ident = interact_ident,
                                 oneinteract = receivers,
                                 goi = unique(c(goi_receivers,receptors[1:min(length(receptors),15)])),
                                 metadata_list = names(metadata_receivers),
                                 save = T,
                                 output_dir = file.path(wd_path, "MI_tables"),
                                 plot = T)

4 interactMIIC

Create the necessary interactMIIC files to reconstruct a network.

FInally we automatically format all necessary data tables for interactMIIC

4.1 Create final variables list

This chunk merges the variables of interest and the ligands/receptors of each celltype (senders or receivers) of the CCC analysis to look into during interactMIIC network reconstruction

#SENDER
genes_senders <- unique(c(ligands, MI_sender_genes, goi_senders))
message(paste(c("Here are the genes used to reconstruct the senders MIIC network:",paste0(genes_senders, collapse = ", ")), collapse = "\n"))
## Here are the genes used to reconstruct the senders MIIC network:
## B2M, HLA-C, HLA-A, HLA-B, HLA-DRA, HLA-DPB1, HLA-DRB1, HLA-DPA1, HLA-E, HMGB1, HLA-DQA1, HLA-DQB1, HLA-DRB5, MIF, CD48, HLA-F, CD22, HLA-DQA2, LGALS9, ICAM2, GNAS, CD74, EEF1A1, GAPDH, COTL1, PTMA, MALAT1, ACTG1, TPT1, FAU, ARPC2, MT-CO1, OAZ1, NACA, SEC61B, LTB, TMSB10, UBC, ACTB, CYBA, UBA52, LDHB, GNB2L1, NPM1, PPIB, EEF2, FTL, MYL12B, NDUFA13, FTH1, TMSB4X, PFDN5, GLTSCR2, KLF6, MYL12A, BTF3, COX5A, RAC2, VIM, ARPC1B, EEF1B2, LSP1, MT-CO3, H3F3B, CFL1, PFN1, PSMB9, GPR183, LY6E, CD79A, ATP5G2, ATP5L, TCL1A, DYNLL1, SLC25A6, SRGN, EIF1, FIS1, GABARAPL2, SAT1, ANXA2, SERF2, CNBP, ARPC5L, C1orf43, CD52, TMEM258, PSMB1
if (length(metadata_senders) >0) {
  message(paste(c("Here are the metadata used to reconstruct the senders MIIC network:",paste0(c(names(metadata_senders)), collapse = ", ")), collapse = "\n"))
}

#RECEIVER
genes_receivers <- unique(c(receptors, MI_receivers_genes, goi_receivers ))
message(paste(c("Here are the genes used to reconstruct the receivers MIIC network:",paste0(genes_receivers, collapse = ", ")), collapse = "\n"))
## Here are the genes used to reconstruct the receivers MIIC network:
## CD3D, LAG3, CD8A, CD8B, CXCR4, CD74, CD2, CD44, PTPRC, ITGAL, ITGB2, PTGDR, FOS, DUSP1, JUNB, JUN, ZFP36, GZMH, TTC16, FCGR3A, KBTBD3, SP140L, CROCCP2, FGFBP2, HLA-DRB5, IER2, GRSF1, ZFP36L2, LTB, NKG7, LZTR1, FKBP1B, SPIN4, HIPK1, VEZF1, PCNA, HLA-DPB1, RP5-882C2.2, ZNF598, S1PR5, ZCCHC2, BTG1, AQP3, IL7R, MZT2A, PDP1, POM121, C12orf75, HCST, ZNF747, CD82, ASCL2, HLA-DRB1, HLA-DRA, ADAM9, SAMM50, CST7, OAZ1, CD83, VPS26A, FAM32A, DDRGK1, HLA-DQA1, ZNF175, C7orf73, SNX10, VEZT, TBC1D19, CUL9, ZFAND2B
if (length(metadata_receivers) >0) {
  message(paste(c("Here are the metadata used to reconstruct the receivers MIIC network:",paste0(c(names(metadata_receivers)), collapse = ", ")), collapse = "\n"))
}

4.2 Create the input mosaic matrix

Generate the matrix of senders and receivers cells with selected variables in the interactMIIC format:

interactMIIC_df <- interactmiic.mosaic(seurat_object = SeuratObject,
                                assay_name = assay_name,
                                interact_ident = interact_ident,
                                senders_name = senders,
                                receivers_name = receivers,
                                genes_senders= genes_senders,
                                genes_receivers = genes_receivers,
                                metadata_senders = names(metadata_senders),
                                metadata_receivers = names(metadata_receivers))

4.3 Create the state order

Then, generate the MIIC state_order. The state order is an optional file that allows you to input optional information for the computation (such as contextual variables, types of variables (otherwise detected automatically)… ) and information for the display (groups of nodes, levels ordering of categorical data …). Here is a brief description:

  • “var_type” column
    • 0 is categorical/discrete and 1 is continuous
    • we advise to put a continuous variable with less than 5 different values as discrete
  • “levels_increasing_order” column
    • if the variable is categorical and the categories can be ordered (e.g. “Non-treated”, “Treated”), you can give the order of levels
  • “is_contextual” column
    • 0 is non-contextual and 1 if contextual (meaning that no variable can cause this one)
    • prior knowledge e.g. experimental variable
  • “group” and “group_color” columns
    • These are to assign variables in different groups to color-code them on the MIIC network (webserver display only)
    • Note that this is aesthetic only and do not modify the MIIC algortihm

Here, to help the readability of the network, we are going to create 5 groups : metadata,sender_genes,ligands, receptors, receivers_genes.

interactmiic.state_order2 <- function(mosaic_data_table,
                                     metadata_senders,
                                     metadata_receivers,
                                     genes_senders,
                                     ligands,
                                     receptors,
                                     genes_receivers) {

  # Input checks
  if (!is.data.frame(mosaic_data_table)) {
    stop("mosaic_data_table must be a data frame.")
  }
  if (!is.list(metadata_senders)) {
    stop("metadata_senders is not a named list")
  }
  if (!is.list(metadata_receivers)) {
    stop("metadata_receivers is not a named list")
  }
  if (!is.vector(genes_senders) || !is.character(genes_senders)) {
    stop("genes_senders must be a character vector.")
  }
  if (!is.vector(genes_receivers) || !is.character(genes_receivers)) {
    stop("genes_receivers must be a character vector.")
  }
  if (!is.vector(ligands) || !is.character(ligands)) {
    stop("ligands must be a character vector.")
  }
  if (!is.vector(receptors) || !is.character(receptors)) {
    stop("receptors must be a character vector.")
  }

  # Initialize state_order data frame
  state_order <- data.frame(var_names = character(),
                            var_type = numeric(),
                            levels_increasing_order = character(),
                            is_contextual = numeric(),
                            group = character(),
                            group_color = character())

  # Add metadata information
  duplicated_meta <- intersect(names(metadata_senders), names(metadata_receivers))
  if (length(metadata_senders) >0) {
    for (onemeta in names(metadata_senders)) {
        levels_increasing_order <- paste(metadata_senders[[onemeta]], collapse = ",")
        if (ifelse(onemeta %in% duplicated_meta, paste0(onemeta, "_senders"), onemeta) %in% colnames(mosaic_data_table)) {
          state_order[nrow(state_order) + 1,] <- list(ifelse(onemeta %in% duplicated_meta, paste0(onemeta, "_senders"), onemeta), 0, levels_increasing_order, 0, "metadata", "FFD050")
        }
      }
  }

  if (length(metadata_receivers) >0) {
    for (onemeta in names(metadata_receivers)) {
        levels_increasing_order <- paste(metadata_receivers[[onemeta]], collapse = ",")
        if (ifelse(onemeta %in% duplicated_meta, paste0(onemeta, "_receivers"), onemeta) %in% colnames(mosaic_data_table)) {
          state_order[nrow(state_order) + 1,] <- list(ifelse(onemeta %in% duplicated_meta, paste0(onemeta, "_receivers"), onemeta), 0, levels_increasing_order, 0, "metadata", "FFD050")
        }
      }
  }

  # Identify duplicated genes (special case)
  dups <- c(paste0(intersect(genes_senders, genes_receivers), "_senders"),paste0(intersect(genes_senders, genes_receivers), "_receivers"))

  interact_meta <- as.vector(state_order$var_names)
  
  # Assign the genes
  for(col in setdiff(colnames(mosaic_data_table), interact_meta)) {

    # Initialize variables
    group <- ""
    color <- ""
    
    # Test if gene is duplicated and assign group
    if (col %in% dups) {
      gene <- stringr::str_split(col, "_")[[1]][1]
      family <- stringr::str_split(col, "_")[[1]][2]
    } else {
      gene <- col
      family <- ifelse(gene %in% genes_senders, "senders", "receivers")
    }

    if (family == "senders") {
        group <- "senders genes"
        color <- "ABC4AB"
        if (gene %in% ligands) {
          group <- "ligands"
          color <- "40A86A"
        }
      } else {
        group <- "receivers genes"
        color <- "F4A289"
        if (gene %in% receptors) {
          group <- "receptors"
        color <- "F26841"
        }
      }

    # Add to state_order
    if (unique(dplyr::n_distinct(mosaic_data_table[,col]) <= 5)) {
      state_order[nrow(state_order) + 1,] <- list(col, 0, "", 0, group, color)
    } else {
      state_order[nrow(state_order) + 1,] <- list(col, 1, "", 0, group, color)
    }
  }
  return(state_order)
}
interactMIIC_st <- interactmiic.state_order2(mosaic_data_table = interactMIIC_df,
                                genes_senders= genes_senders,
                                genes_receivers = genes_receivers,
                                ligands = ligands,
                                receptors = receptors,
                                metadata_senders = metadata_senders,
                                metadata_receivers = metadata_receivers)

4.4 Create the network layout (do not use)

If you use the webserver, this creates a nice display of your network :

### WORK IN PROGRESS - DO NOT CALL

# Generate empty layout json file
network_layout <- interactmiic.layout(interactMIIC_st, square_height = 8)
file <- file(file.path(wd_path,"MIIC_networks/pbmc_v1_layout.json"))
writeLines(network_layout, file)
close(file) # to absolutely not forget

4.5 Save files

write.table(interactMIIC_df, file = file.path(wd_path,"MIIC_networks/pbmc_v1.csv"), quote = F, sep = ",", row.names = F)

write.table(interactMIIC_st, file = file.path(wd_path, "MIIC_networks/pbmc_v1_st.tsv"), quote = F, sep = "\t", row.names = F)

duplicated_ligands <- intersect(ligands,genes_receivers)
duplicated_receptors <- intersect(receptors, genes_senders)

interact_edges$ligands <- ifelse(interact_edges$ligands %in% duplicated_ligands, paste0(interact_edges$ligands, "_senders"), interact_edges$ligands)
interact_edges$receptors <- ifelse(interact_edges$receptors %in% duplicated_receptors, paste0(interact_edges$receptors, "_receivers"), interact_edges$receptors)

write.table(interact_edges, file.path(wd_path,"MIIC_networks/pbmc_v1_interactEdges.tsv"), row.names=F, quote=F, sep="\t")

5 Run interactMIIC

Now we have 2 options: we can use interactMIIC in R studio but the computation time can be long depending on your machine.

We recommend putting the consistent option to “skeleton” meaning interactMIIC will take an extra step to make sure it is not removing necessary links to explain indirect pathways and give a consistent skeleton of the graph.

miic.res <- miic(input_data = interactMIIC_df, #input dataset
                 state_order = interactMIIC_st, #additional information
                 interact_edges = interact_edges, #CCC links to add
                 consistent = "skeleton",
                 conf_threshold = 0.1, #threshold to filter the less probable edges in skeleton step
                 n_shuffles=100,
                 ori_proba_ratio = 0.1) #remove orientations that do not meet this threshold
## [1] "the interactEdges file is detected in miic.R"
## [1] "the optional weight column in interactEdges file is not a problem in miic.reconstruct.R"
## Search all pairs for unconditional independence relations...
## Did not remove 0---70edge in initializeEdges() because status is 1
## Did not remove 0---71edge in initializeEdges() because status is 1
## Did not remove 0---72edge in initializeEdges() because status is 1
## Did not remove 0---73edge in initializeEdges() because status is 1
## Did not remove 1---74edge in initializeEdges() because status is 1
## Did not remove 1---75edge in initializeEdges() because status is 1
## Did not remove 1---76edge in initializeEdges() because status is 1
## Did not remove 1---77edge in initializeEdges() because status is 1
## Did not remove 1---80edge in initializeEdges() because status is 1
## Did not remove 1---81edge in initializeEdges() because status is 1
## Did not remove 1---82edge in initializeEdges() because status is 1
## Did not remove 1---87edge in initializeEdges() because status is 1
## Did not remove 2---71edge in initializeEdges() because status is 1
## Did not remove 2---72edge in initializeEdges() because status is 1
## Did not remove 2---73edge in initializeEdges() because status is 1
## Did not remove 2---78edge in initializeEdges() because status is 1
## Did not remove 2---85edge in initializeEdges() because status is 1
## Did not remove 3---71edge in initializeEdges() because status is 1
## Did not remove 3---72edge in initializeEdges() because status is 1
## Did not remove 3---73edge in initializeEdges() because status is 1
## Did not remove 3---78edge in initializeEdges() because status is 1
## Did not remove 3---85edge in initializeEdges() because status is 1
## Did not remove 4---79edge in initializeEdges() because status is 1
## Did not remove 4---83edge in initializeEdges() because status is 1
## Did not remove 5---83edge in initializeEdges() because status is 1
## Did not remove 6---84edge in initializeEdges() because status is 1
## Did not remove 7---83edge in initializeEdges() because status is 1
## Did not remove 8---86edge in initializeEdges() because status is 1
## Did not remove 8---88edge in initializeEdges() because status is 1
## Did not remove 9---89edge in initializeEdges() because status is 1
## Did not remove 10---89edge in initializeEdges() because status is 1
## Did not remove 11---90edge in initializeEdges() because status is 1
## Search for candidate separating nodes...
## Search for conditional independence relations...
## Compute confidence cut with permutations...95 edges cut.
## Search for edge directions...
## Iteration 0 Number of edges: 281
## Search for candidate separating nodes...
## Search for conditional independence relations...
## Compute confidence cut with permutations...101 edges cut.
## Search for edge directions...
## Iteration 1 Number of edges: 281
## cycle found of size 1
plot(miic.res)

This other possibility is to run the job on the MIIC webserver.

On the MIIC webserver you can’t specify interacting edges (yet), so the result do not show ligands-receptors links. But here is a way to add them to the vizualisation:

First you need to download the MIIC results from the webserver : edgesList.miic.summary.txt and add it to your working directory folder.

miic.summary <- read.csv(file.path(wd_path, "MIIC_results/edgesList.miic.summary.txt"), sep = "\t")
state_order <- read.csv(file.path(wd_path,"MIIC_networks/pbmc_v1_st.tsv"), sep = "\t")
interactEdges <- read.csv(file.path(wd_path,"MIIC_networks/pbmc_v1_interactEdges.tsv"), sep = "\t")
max_info <- max(miic.summary$info_shifted)

interact.summary <- setNames(data.frame(matrix(ncol = ncol(miic.summary), nrow = 0)), colnames(miic.summary))
for (i in 1:nrow(interactEdges)) {
  oneligand = interactEdges$ligand[i]
  onereceptor = interactEdges$receptor[i]
  onetype = "P"
  oneinfo_shifted = max_info
  onepartial = 0
  interact.summary[nrow(interact.summary) +1,] <- list(oneligand, onereceptor, onetype, NA, oneinfo_shifted, NA, NA, NA, oneinfo_shifted, 2, NA, NA, NA, onepartial,"Y","1;1",NA)
}

total.summary <- rbind(miic.summary, interact.summary)
write.table(total.summary, file.path(wd_path,"MIIC_results/edgesList.interactmiic.totalsummary.txt"), quote = F, row.names = F, sep = '\t')

Now you can upload this updated summary to the visualisation tool ( interactMIIC network with CCC links ) :

Attention renommer “labelSize” en “nodeLabelSize” si on veut mettre un layout sur vis_NL.php car il y aune différence de version …

interactMIIC