ReCIDE: Robust estimation of cell type proportions by integrating single-reference-based deconvolutions

Author: Minghan Li

Environment: R

Correspondence: If you have any questions, please ask on GitHub or contact mhli23@m.fudan.edu.cn.

Requirement:

Before using ReCIDE, please make sure the following packages are installed in your R environment. Sometimes, different versions of R packages may not cause significant issues, but it is beneficial to keep R package versions consistent.

R package	versions	R package	versions
DWLS	0.1.0	ggbiplot	0.55
Seurat	4.4.0	pbmcapply	1.5.1
fastSave	0.1.0	mclust	6.0.0
COSG	0.9.0	gmodels	2.18.1.1
stringr	1.5.0	parallel	4.3.1
dplyr	1.1.3
ggplot2	3.4.3
tidyverse	2.0.0

Usage

1. Load package

Load the required dependencies:

library(DWLS)
library(Seurat)
library(fastSave)
library(COSG)
library(stringr)
library(dplyr)
library(ggplot2)
library(ggbiplot)
library(pbmcapply)
library(mclust)
library(gmodels)

Download "ReCIDE_all_function.R" and load it:

source('/your_path/ReCIDE_all_function.R')
#if you want to use CIBERSORT as your kernel, source('your CIBERSORT path')

2. Load DEMO data:

The DEMO data is from the cross datasets test of PFC, and all data can be obtained from https://drive.google.com/drive/folders/1TRz7Xj4Ddvt3NMTOHbpiTJBkTbDuaXzf?usp=drive_link. In addition to the reference single-cell RNA-Seq matrix, other data can also be obtained from ReCIDE/DEMO_data at main · limingham/ReCIDE (github.com) .

##Load single-cell reference and pseudo-bulk datasets
SC_ref = readRDS.lbzip2('/your_path/DEMO_data/REF_PFC.rdsFS',n.cores=50)
EXP_df = readRDS('/your_path/DEMO_data/EXP_PFC_df.rds')

##Obtaining reference cell type labels and subject labels
##Please note that the cell type labels should not contain any symbols other than "_".

celltype = SC_ref@meta.data[['cell type label']] ##In the demo data, it is major.celltype.
subject = SC_ref@meta.data[['subject labels']] ##In the demo data, it is subject.

(Optional) If computational resources are insufficient (<256GB), subjects in SC_ref can be sampled.

set.seed(123)
sample_subject=sample(unique(SC_ref@meta.data[["subject"]]),10)
SC_ref=subset(SC_ref,subject %in% sample_subject)

3. Construction of reference MGMs

ref_MGM.list = MGM_build(SC_ref = SC_ref,
                        EXP_df = EXP_df,
                        label_celltype = celltype,
                        label_subject = subject,
                        n_cores = 50)

SC_ref: single-cell expression matrix (row: gene, col: cell) ，Seurat or data.frame object

EXP_df: bulk expression matrix (row: gene, col: subject) ，data.frame object

label_celltype: Celltype labels corresponding to each cell in SC_ref

label_subject: Subject labels corresponding to each cell in SC_ref

n_cores: Number of parallel cores, default 5

4. Deconvolution using each MGM

ReCIDE_results = ReCIDE_deconvolution(Sig_list = ref_MGM.list,
                                      EXP_df = EXP_df,
                                      Method = 'DWLS',
                                      n_cores = 50,
                                      n_celltype = 0)

Sig_list: list of reference MGMs output in step 3

EXP_df: bulk expression matrix (row: gene, col: subject) ，data.frame，Same as in step 3

Method: Deconvolution kernel, can be DWLS, CIBERSORT or FARDEEP, default is DWLS

n_cores: Number of parallel cores, default 5

n_celltype: Subjects in the reference set with a cell type number greater than n_celltype are used for deconvolution to prevent a situation where a cell type number is too low for a particular reference subject. The default is 0.

5. Results and Visualization

The final deconvolution results are stored in ReCIDE_results[["results_final_df"], where Rowname is the cell type and Colname is the subject name.

ReCIDE_results[["results_final_df"]]
heatmap(as.matrix(ReCIDE_results[["results_final_df"]]))

6. Comparative analysis

If you want to compare the changes in cell proportion differences between disease and normal tissues (or between disease groups), please execute the following code:

func_run_ReCIDE_postprocessing(SC_ref1,
                               SC_ref2,
                               EXP_df1,
                               EXP_df2,
                               celltype1,
                               celltype2,
                               subject1,
                               subject2,
                               dir_ref1,
                               dir_ref2,
                               dir_results1,
                               dir_results2,
                               dir_diff_celltype_output)

dir_diff_celltype_output: Cell types with proportion differences between groups, in a format similar to: /home/example.rds

TianLab-Bioinfo/ReCIDE

ErrorLooks like something went wrong!

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Error
Looks like something went wrong!

Packages