CELLECT MAGMA Tutorial
This tutorial takes you through running CELLECT-MAGMA on two GWAS summary statistics files and two example expression specificity inputs.
cd ~/CELLECT
# ---------------------- STEP 1: prep GWAS data ----------------- #
wget https://portals.broadinstitute.org/collaboration/giant/images/c/c8/Meta-analysis_Locke_et_al%2BUKBiobank_2018_UPDATED.txt.gz -P example/
wget https://www.dropbox.com/s/ho58e9jmytmpaf8/GWAS_EA_excl23andMe.txt -P example/
conda env create -f ldsc/environment_munge_ldsc.yml
conda activate munge_ldsc
python ldsc/mtag_munge.py \
--sumstats example/GWAS_EA_excl23andMe.txt \
--merge-alleles data/ldsc/w_hm3.snplist \
--n-value 766345 \
--keep-pval \
--p PVAL \
--out example/EA3_Lee2018
python ldsc/mtag_munge.py \
--sumstats example/Meta-analysis_Locke_et_al+UKBiobank_2018_UPDATED.txt.gz \
--a1 Tested_Allele \
--a2 Other_Allele \
--keep-pval \
--p PVAL \
--merge-alleles data/ldsc/w_hm3.snplist \
--out example/BMI_Yengo2018
# --- STEP 2: Generate cell-type specificity input using CELLEX --- #
# (CELLEX specificity files have been pre-generated for this tutorial)
# ---------------------- STEP 3: run CELLECT-MAGMA ----------------- #
conda activate <env_with_snakemake>
snakemake --use-conda -j -s cellect-magma.snakefile --configfile config.yml
Step 0 is analogous to Step 0 of the CELLECT LDSC Tutorial.
Step 1 (preparing the GWAS data): see Step 1 of the CELLECT LDSC Tutorial.
Step 2 (generating the cell-type specificity input): see Step 2 of the CELLECT LDSC Tutorial.
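Before starting the workflow, it can help to confirm that the GWAS files from Step 1 were actually written. The helper below is a minimal sketch and not part of CELLECT: `check_inputs` is a hypothetical name, and it only tests that each path exists and is non-empty.

```shell
# Report which of the given files are missing or empty.
# Returns non-zero if at least one file fails the check.
# Usage: check_inputs <file> [<file> ...]
check_inputs() {
    status=0
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "ok       $f"
        else
            echo "MISSING  $f"
            status=1
        fi
    done
    return "$status"
}
```

Assuming the munge script appends .sumstats.gz to the --out prefix (the usual LDSC-style convention, not confirmed by this tutorial), the call for this tutorial would be: check_inputs example/EA3_Lee2018.sumstats.gz example/BMI_Yengo2018.sumstats.gz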
Now we will run the workflow. Running CELLECT-MAGMA requires the snakemake library to be available, for example by activating an environment that has snakemake installed:
conda activate <env_with_snakemake>
Then run the workflow:
snakemake --use-conda -j -s cellect-magma.snakefile --configfile config.yml
The first time you run the workflow, snakemake will download and install local conda environments in ./.snakemake. These environments ensure that all dependencies are correctly installed. CELLECT-MAGMA is unlikely to work without the --use-conda flag.
The above command is configured to output results in ./CELLECT-EXAMPLE. To change this, open the config.yml file and edit BASE_OUTPUT_DIR to specify the output directory.
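If you would rather not edit config.yml by hand, the change can be scripted. The sketch below writes a modified copy and leaves the original untouched. It assumes the key sits on its own line in the form "BASE_OUTPUT_DIR: <path>"; `set_output_dir` is a hypothetical helper, not something CELLECT provides.

```shell
# Write a copy of a config file with a different BASE_OUTPUT_DIR.
# Assumes the key sits on its own line as "BASE_OUTPUT_DIR: <path>";
# anything after the key on that line (including a comment) is replaced.
# Usage: set_output_dir <config_in> <config_out> <new_dir>
set_output_dir() {
    awk -v dir="$3" '
        /^[ \t]*BASE_OUTPUT_DIR:/ { sub(/BASE_OUTPUT_DIR:.*/, "BASE_OUTPUT_DIR: " dir) }
        { print }
    ' "$1" > "$2"
}
```

For example, set_output_dir config.yml my_config.yml /scratch/cellect-out writes my_config.yml (both the file name and the path are placeholders), which you would then pass to snakemake with --configfile my_config.yml.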
The config file is preconfigured to prioritize the two CELLEX specificity inputs for each of the two GWAS datasets we just downloaded.
Running the workflow should take 5-15 minutes, depending on the number of cores available on your system. Here we run the workflow using all available cores on the computer (-j). If you wish to use only 4 cores, pass the -j 4 flag instead.
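To pick a value for -j on a shared machine, you can first ask the system how many cores it has. The snippet below is only an illustration: using half of the cores is an arbitrary choice, and it prints the resulting command instead of running it.

```shell
# Number of cores currently online (getconf works on Linux and macOS).
cores=$(getconf _NPROCESSORS_ONLN)

# Use half of them, but never fewer than one (arbitrary choice for illustration).
jobs=$(( cores / 2 ))
[ "$jobs" -ge 1 ] || jobs=1

# Print the command so it can be reviewed before running it.
echo "snakemake --use-conda -j $jobs -s cellect-magma.snakefile --configfile config.yml"
```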
In ./CELLECT-EXAMPLE/CELLECT-MAGMA/results/prioritization.csv you will see the following prioritization output:
gwas,specificity_id,annotation,beta,beta_se,pvalue
BMI_Yengo2018,tabula_muris-test,Brain_Myeloid.macrophage,-0.4256467588277887,0.15332252362801435,0.005508326724381693
BMI_Yengo2018,tabula_muris-test,Bladder.bladder_urothelial_cell,-0.188053801356396,0.09730863662451754,0.05331287569706968
BMI_Yengo2018,tabula_muris-test,Brain_Non-Myeloid.Bergmann_glial_cell,0.17716734807808396,0.11863745666516574,0.1353690970000032
BMI_Yengo2018,tabula_muris-test,Brain_Myeloid.microglial_cell,-0.10624625281347268,0.11516979697400745,0.3562749948370449
BMI_Yengo2018,tabula_muris-test,Bladder.bladder_cell,0.008859755806741261,0.09622232003854314,0.9266391307919049
EA3_Lee2018,tabula_muris-test,Brain_Non-Myeloid.Bergmann_glial_cell,0.4354027792632111,0.10161198002978243,1.84060625936418e-05
EA3_Lee2018,tabula_muris-test,Brain_Myeloid.microglial_cell,-0.2521850171098417,0.09858996548153,0.010541078721614588
EA3_Lee2018,tabula_muris-test,Brain_Myeloid.macrophage,-0.2830042956883243,0.13141560219605242,0.03129683428396119
EA3_Lee2018,tabula_muris-test,Bladder.bladder_cell,-0.15575144440223648,0.08243094562674985,0.058849547596786116
EA3_Lee2018,tabula_muris-test,Bladder.bladder_urothelial_cell,-0.10833992179721944,0.08333654257289334,0.19361354671232014
BMI_Yengo2018,mousebrain-test,ABC,-0.37917532198245135,0.09899269867273812,0.00012852439743265472
BMI_Yengo2018,mousebrain-test,ACBG,-0.23269831434512225,0.1103692297554486,0.035017320727815764
EA3_Lee2018,mousebrain-test,ABC,-0.4101409247707607,0.08499030497989073,1.4094968870657675e-06
EA3_Lee2018,mousebrain-test,ACBG,-0.1494705796172095,0.09488802931956902,0.1152255823631138
See Input & Output for a full description of the output files.
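To narrow the prioritization output down to the strongest associations, one option is a Bonferroni filter. The sketch below assumes the six-column layout shown above and applies a threshold of alpha divided by the number of annotations, computed separately for each gwas and specificity_id pair. That grouping is an assumption made here, not something this tutorial prescribes, and `significant_annotations` is a hypothetical helper. It filters on pvalue only and leaves the sign of beta for you to inspect.

```shell
# Print the header plus every row of a prioritization CSV whose p-value
# passes a Bonferroni threshold of alpha / (annotations in its group),
# where a group is one gwas + specificity_id pair (an assumed convention).
# Usage: significant_annotations <prioritization.csv> [alpha]
significant_annotations() {
    awk -F, -v alpha="${2:-0.05}" '
        NR == 1 { print; next }          # keep the header row
        NF < 6  { next }                 # skip blank or malformed lines
        {
            key = $1 FS $2               # gwas,specificity_id
            count[key]++                 # annotations tested in this group
            row[NR] = $0; group[NR] = key; pval[NR] = $6
        }
        END {
            for (i = 2; i <= NR; i++)
                if ((i in row) && pval[i] + 0 < alpha / count[group[i]])
                    print row[i]
        }
    ' "$1"
}
```

For this tutorial the call would be: significant_annotations ./CELLECT-EXAMPLE/CELLECT-MAGMA/results/prioritization.csv. A second argument, for example 0.01, changes alpha.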