CELLECT MAGMA Tutorial

JonThom edited this page Aug 24, 2020 · 5 revisions

This tutorial walks you through running CELLECT-MAGMA on two sets of GWAS summary statistics and two example expression specificity inputs.

TL;DR

cd ~/CELLECT
# ---------------------- STEP 1: prep GWAS data ----------------- #
wget https://portals.broadinstitute.org/collaboration/giant/images/c/c8/Meta-analysis_Locke_et_al%2BUKBiobank_2018_UPDATED.txt.gz -P example/
wget https://www.dropbox.com/s/ho58e9jmytmpaf8/GWAS_EA_excl23andMe.txt -P example/

conda env create -f ldsc/environment_munge_ldsc.yml
conda activate munge_ldsc

python ldsc/mtag_munge.py \
--sumstats example/GWAS_EA_excl23andMe.txt \
--merge-alleles data/ldsc/w_hm3.snplist \
--n-value 766345 \
--keep-pval \
--p PVAL \
--out example/EA3_Lee2018

python ldsc/mtag_munge.py \
--sumstats example/Meta-analysis_Locke_et_al+UKBiobank_2018_UPDATED.txt.gz \
--a1 Tested_Allele \
--a2 Other_Allele \
--keep-pval \
--p PVAL \
--merge-alleles data/ldsc/w_hm3.snplist \
--out example/BMI_Yengo2018

# --- STEP 2: Generate cell-type specificity input using CELLEX --- #
# (CELLEX specificity files have been pre-generated for this tutorial)

# ---------------------- STEP 3: run CELLECT-MAGMA ----------------- #
conda activate <env_with_snakemake>

snakemake --use-conda -j -s cellect-magma.snakefile --configfile config.yml

Step 0: Create conda environment for munging

This step is the same as Step 0 of the CELLECT LDSC Tutorial.

Step 1: Download and munge GWAS

See Step 1 of the CELLECT LDSC Tutorial.

Step 2: Generate cell-type specificity input using CELLEX

See Step 2 of the CELLECT LDSC Tutorial.

Step 3: Run CELLECT-MAGMA

Now we will run the workflow. CELLECT-MAGMA requires snakemake to be available, so first activate an environment that has it installed:

conda activate <env_with_snakemake>

Then run the workflow:

snakemake --use-conda -j -s cellect-magma.snakefile --configfile config.yml

The first time you run the workflow, snakemake will download and install local conda environments in ./.snakemake. These environments ensure that all dependencies are correctly installed. CELLECT-MAGMA is unlikely to work without the --use-conda flag.

The above command is configured to write its results to ./CELLECT-EXAMPLE. To change this, open config.yml and set BASE_OUTPUT_DIR to the output directory you want.
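
As a minimal sketch, the setting can also be inspected and changed from the shell. This assumes that BASE_OUTPUT_DIR appears in config.yml as a top-level line of the form `BASE_OUTPUT_DIR: <path>`; `set_output_dir` is a hypothetical helper that is not part of CELLECT, and editing the file by hand works just as well.

```shell
# Hypothetical helper (not part of CELLECT): rewrite the BASE_OUTPUT_DIR line.
# Assumes config.yml has a top-level line "BASE_OUTPUT_DIR: <path>".
# The new path must not contain "|" or "&", which are special to sed here.
set_output_dir() {
    local config="$1" new_dir="$2" tmp status
    tmp="$(mktemp)" || return 1
    # "|" is the sed delimiter so that paths containing "/" need no escaping.
    sed "s|^BASE_OUTPUT_DIR:.*|BASE_OUTPUT_DIR: ${new_dir}|" "$config" > "$tmp" \
        && cat "$tmp" > "$config"   # overwrite in place, keeping file permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage, from the CELLECT directory (the target path is only an example):
#   grep -n '^BASE_OUTPUT_DIR' config.yml      # show the current value
#   set_output_dir config.yml /scratch/my-cellect-run
```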

The config file is preconfigured to prioritize the two CELLEX specificity inputs for each of the two GWAS datasets we just downloaded.

Running the workflow should take 5-15 minutes, depending on the number of cores available on your system. The command above uses all available cores (-j with no value). To use only 4 cores, pass -j 4 instead.
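
The core count can be made explicit with a small wrapper around the same command. This is only a sketch: `run_cellect_magma` is a hypothetical name that is not part of CELLECT, and the default of 4 cores is an arbitrary choice. Any further arguments are passed straight to snakemake, for example `-n`, snakemake's dry-run flag, to list the planned jobs without running them.

```shell
# Hypothetical wrapper (not part of CELLECT) around the command from this tutorial.
# First argument: number of cores for -j (defaults to 4, an arbitrary choice).
# Remaining arguments are forwarded unchanged to snakemake.
run_cellect_magma() {
    local cores="${1:-4}"
    [ "$#" -gt 0 ] && shift
    snakemake --use-conda -j "$cores" -s cellect-magma.snakefile \
        --configfile config.yml "$@"
}

# Usage, from the CELLECT directory with snakemake available:
#   run_cellect_magma 4        # run with at most 4 cores
#   run_cellect_magma 4 -n     # dry run: show what would be executed
```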

Output files

In ./CELLECT-EXAMPLE/CELLECT-MAGMA/results/prioritization.csv you will see the following prioritization output:

gwas,specificity_id,annotation,beta,beta_se,pvalue
BMI_Yengo2018,tabula_muris-test,Brain_Myeloid.macrophage,-0.4256467588277887,0.15332252362801435,0.005508326724381693
BMI_Yengo2018,tabula_muris-test,Bladder.bladder_urothelial_cell,-0.188053801356396,0.09730863662451754,0.05331287569706968
BMI_Yengo2018,tabula_muris-test,Brain_Non-Myeloid.Bergmann_glial_cell,0.17716734807808396,0.11863745666516574,0.1353690970000032
BMI_Yengo2018,tabula_muris-test,Brain_Myeloid.microglial_cell,-0.10624625281347268,0.11516979697400745,0.3562749948370449
BMI_Yengo2018,tabula_muris-test,Bladder.bladder_cell,0.008859755806741261,0.09622232003854314,0.9266391307919049
EA3_Lee2018,tabula_muris-test,Brain_Non-Myeloid.Bergmann_glial_cell,0.4354027792632111,0.10161198002978243,1.84060625936418e-05
EA3_Lee2018,tabula_muris-test,Brain_Myeloid.microglial_cell,-0.2521850171098417,0.09858996548153,0.010541078721614588
EA3_Lee2018,tabula_muris-test,Brain_Myeloid.macrophage,-0.2830042956883243,0.13141560219605242,0.03129683428396119
EA3_Lee2018,tabula_muris-test,Bladder.bladder_cell,-0.15575144440223648,0.08243094562674985,0.058849547596786116
EA3_Lee2018,tabula_muris-test,Bladder.bladder_urothelial_cell,-0.10833992179721944,0.08333654257289334,0.19361354671232014
BMI_Yengo2018,mousebrain-test,ABC,-0.37917532198245135,0.09899269867273812,0.00012852439743265472
BMI_Yengo2018,mousebrain-test,ACBG,-0.23269831434512225,0.1103692297554486,0.035017320727815764
EA3_Lee2018,mousebrain-test,ABC,-0.4101409247707607,0.08499030497989073,1.4094968870657675e-06
EA3_Lee2018,mousebrain-test,ACBG,-0.1494705796172095,0.09488802931956902,0.1152255823631138
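
A quick way to scan this file is to keep only the rows whose p-value falls below a Bonferroni threshold. The sketch below makes several assumptions that do not come from this tutorial: the threshold is 0.05 divided by the number of annotations within each (gwas, specificity_id) pair, the columns are in the order shown above, and `below_bonferroni` is a hypothetical helper name. It filters on the pvalue column only and does not interpret the sign of beta, so check that column in the printed rows.

```shell
# Hypothetical helper (not part of CELLECT): print rows of prioritization.csv
# whose pvalue is below alpha / (number of annotations in the same
# gwas + specificity_id group). The threshold choice is an assumption.
# Expected columns: gwas,specificity_id,annotation,beta,beta_se,pvalue
below_bonferroni() {
    local csv="$1" alpha="${2:-0.05}"
    awk -F, -v alpha="$alpha" '
        NR == 1 || NF < 6 { next }          # skip the header and malformed lines
        {
            key = $1 FS $2                  # group by gwas and specificity_id
            n[key]++                        # count annotations per group
            row[NR] = $0; group[NR] = key; p[NR] = $6
        }
        END {
            for (i = 2; i <= NR; i++)       # keep the original row order
                if ((i in row) && p[i] + 0 < alpha / n[group[i]])
                    print row[i]
        }
    ' "$csv"
}

# Usage:
#   below_bonferroni CELLECT-EXAMPLE/CELLECT-MAGMA/results/prioritization.csv
#   below_bonferroni CELLECT-EXAMPLE/CELLECT-MAGMA/results/prioritization.csv 0.01
```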

See Input & Output for a full description of the output files.