PASA_Docker
If you have Docker installed, you can pull our image from DockerHub, which contains PASA and required software components.
Pull the latest Docker image for PASA like so:
% docker pull pasapipeline/pasapipeline
Given a target genome, transcripts, and PASA run-configuration file, you can run PASA from within Docker like so:
# here, $base_dir corresponds to your working directory that contains your input data.
# Replace $base_dir with your actual directory name (don't use it as a variable)
% docker run --rm -it -v /tmp:/tmp -v $base_dir:$base_dir \
pasapipeline/pasapipeline:latest \
bash -c 'cd /$base_dir && /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl \
-c alignAssembly.conf -C -R --ALIGNER gmap -g genome.fa -t transcripts.cdna.fasta '
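The single quotes around the bash -c argument stop the host shell from expanding $base_dir, which is why the comment above asks you to type the literal path. If you would rather keep the path in a variable, one option is a small wrapper that uses double quotes instead. This is only a sketch: the function name run_pasa_in_docker is hypothetical and not part of PASA, and it assumes the directory you pass contains alignAssembly.conf, genome.fa and transcripts.cdna.fasta.

```shell
# Hypothetical wrapper (not part of PASA): run the alignment-assembly step
# on the inputs found in the directory given as the first argument.
run_pasa_in_docker() {
    base_dir=$1    # absolute path to the directory holding your inputs

    # Double quotes let the host shell expand $base_dir before the command
    # string is handed to bash inside the container.
    docker run --rm -it -v /tmp:/tmp -v "$base_dir:$base_dir" \
        pasapipeline/pasapipeline:latest \
        bash -c "cd '$base_dir' && /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl \
            -c alignAssembly.conf -C -R --ALIGNER gmap -g genome.fa -t transcripts.cdna.fasta"
}

# Usage (assumes /data/my_project holds the three input files):
#   run_pasa_in_docker /data/my_project
```

Mounting the directory at the same path inside the container keeps host and container paths identical, so the cd works unchanged.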
As a concrete example from my own environment (the paths follow my project layout), this is the Docker command I use to run PASA on the provided sample data:
% docker run --rm -it \
-v /tmp:/tmp \
-v /home/bhaas/GITHUB/pasapipeline/sample_data:/home/bhaas/GITHUB/pasapipeline/sample_data \
pasapipeline/pasapipeline:latest \
bash -c 'cd /home/bhaas/GITHUB/pasapipeline/sample_data \
&& /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl \
-c sqlite.confs/alignAssembly.config -C -R \
--ALIGNER gmap -g genome_sample.fasta -t all_transcripts.fasta.clean'
The provided sqlite.confs/alignAssembly.config is set up to use a SQLite database at /tmp/sample_mydb_pasa.sqlite.
If you want to try the docker run command that @brianjohnhaas suggests above, set things up as follows:
- Say you are in a directory called /home/github_pasa
- git clone https://github.com/PASApipeline/PASApipeline.git
- You will now see PASApipeline/ under /home/github_pasa/
- Before the docker run ... command, do the following:
mkdir -p /home/work/temp
cd /home/work/
cp -r /home/github_pasa/PASApipeline/sample_data .
gunzip /home/work/sample_data/genome_sample.fasta.gz
- Note that running the docker run ... command will add new files and folders to the sample_data directory at /home/work/
- Run the docker run ... command from /home/work, where <docker_image:tag> can be either the image obtained with docker pull pasapipeline/pasapipeline:latest or a custom image built from the Dockerfile at https://github.com/PASApipeline/PASApipeline/tree/master/Docker (the examples below use pasapipeline/pasapipeline:latest)
- If you want to run PASA (align step) with SQLite, do this:
docker run --rm -it \
-v $PWD/temp:/tmp \
-v $PWD/sample_data:/home/bhaas/GITHUB/pasapipeline/sample_data \
pasapipeline/pasapipeline:latest \
bash -c ' \
cd /home/bhaas/GITHUB/pasapipeline/sample_data \
&& /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl \
-c sqlite.confs/alignAssembly.config -C -R \
--ALIGNER gmap -g genome_sample.fasta -t all_transcripts.fasta.clean'
- If you want to run PASA (align step) with MySQL running inside the Docker container, do this:
docker run --rm -it \
-v $PWD/temp:/tmp \
-v $PWD/sample_data:/home/bhaas/GITHUB/pasapipeline/sample_data \
pasapipeline/pasapipeline:latest \
bash -c 'service mysql start && \
cd /home/bhaas/GITHUB/pasapipeline/sample_data \
&& /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl \
-c mysql.confs/alignAssembly.config -C -R \
--ALIGNER gmap -g genome_sample.fasta -t all_transcripts.fasta.clean'
Typically, you need the same database for the downstream annotation step. If you are using MySQL within Docker, you are therefore better off using a workflow manager such as Nextflow (https://www.nextflow.io/). You would need to run mysqldump sample_mydb_pasa > sample_mydb_pasa.sql after the align step, and then mysql sample_mydb_pasa < sample_mydb_pasa.sql to restore the database before the annotation step. Alternatively, use a local installation of MySQL and connect to it from within the Docker container (see below). The name sample_mydb_pasa comes from the DATABASE field in alignAssembly.config:
https://github.com/PASApipeline/PASApipeline/blob/master/sample_data/mysql.confs/alignAssembly.config#L6
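The dump-and-restore hand-off described above can be sketched as two small shell functions. The function names, the choice of /tmp as the dump location (the examples here mount $PWD/temp there, so the file would outlive the container), and the CREATE DATABASE IF NOT EXISTS step are assumptions for illustration. Credentials and other MySQL client options are left out because they depend on how your MySQL instance is set up.

```shell
# Hypothetical helpers (not part of PASA) for carrying the PASA MySQL
# database from the align step over to the annotation step.

# Dump database $1 to file $2; run inside the container after the align step.
dump_pasa_db() {
    mysqldump "$1" > "$2"
}

# Reload database $1 from dump file $2; run before the annotation step.
restore_pasa_db() {
    # Assumption: the database may not exist yet in a fresh container,
    # and a plain mysqldump of one database does not recreate it.
    mysql -e "CREATE DATABASE IF NOT EXISTS \`$1\`" &&
    mysql "$1" < "$2"
}

# Usage:
#   dump_pasa_db    sample_mydb_pasa /tmp/sample_mydb_pasa.sql
#   restore_pasa_db sample_mydb_pasa /tmp/sample_mydb_pasa.sql
```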
- If you have mysql running on your server and want to connect to it from the docker image, one way to do it is like so:
docker run --rm -it \
-v $PWD/temp:/tmp \
-v /var/run/mysqld/mysqld.sock:/var/run/mysqld/mysqld.sock \
-v $PWD/sample_data:/home/bhaas/GITHUB/pasapipeline/sample_data \
pasapipeline/pasapipeline:latest \
bash -c 'cd /home/bhaas/GITHUB/pasapipeline/sample_data \
&& /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl \
-c mysql.confs/alignAssembly.config -C -R \
--ALIGNER gmap -g genome_sample.fasta -t all_transcripts.fasta.clean'
If you need a custom pasa_conf/conf.txt to connect to an external MySQL server, set the environment variable PASACONF to the path of that file, and PASA will use it instead of the default (which assumes localhost).
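As a minimal sketch of the PASACONF approach, you can mount the custom file into the container and pass the variable with Docker's -e option. The wrapper name, the host file name in the usage line, and the in-container path /pasa_conf/conf.txt are all arbitrary choices made for this illustration, not PASA conventions.

```shell
# Hypothetical wrapper (not part of PASA): run the align step with a
# custom conf.txt supplied from the host.
run_pasa_with_custom_conf() {
    conf_on_host=$1    # absolute path to your custom conf.txt on the host

    docker run --rm -it \
        -v "$PWD/temp:/tmp" \
        -v "$PWD/sample_data:/home/bhaas/GITHUB/pasapipeline/sample_data" \
        -v "$conf_on_host:/pasa_conf/conf.txt:ro" \
        -e PASACONF=/pasa_conf/conf.txt \
        pasapipeline/pasapipeline:latest \
        bash -c 'cd /home/bhaas/GITHUB/pasapipeline/sample_data \
            && /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl \
            -c mysql.confs/alignAssembly.config -C -R \
            --ALIGNER gmap -g genome_sample.fasta -t all_transcripts.fasta.clean'
}

# Usage, from /home/work:
#   run_pasa_with_custom_conf "$PWD/my_conf.txt"
```

The file is mounted read-only (:ro) so that nothing in the container can modify your connection settings.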

A Singularity image for PASA is available at https://data.broadinstitute.org/Trinity/CTAT_SINGULARITY/MISC/PASApipeline/.
Running the singularity image is much like running the docker image above. Software locations within the image are identical, as the singularity image is built directly from the docker one.
The syntax for executing PASA via the singularity image is like so:
singularity exec -B $PWD pasapipeline.simg /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl ...remaining options as above ...
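Spelled out for the sample data with the SQLite configuration used earlier, the call might look like the sketch below. The wrapper name is hypothetical, the image path should be absolute because the function changes directory first, and whether /tmp (where the SQLite config writes its database) is writable inside the container depends on your Singularity configuration.

```shell
# Hypothetical wrapper (not part of PASA): run the align step on the
# sample data through the Singularity image.
run_pasa_singularity() {
    image=$1       # absolute path to pasapipeline.simg
    data_dir=$2    # directory holding sqlite.confs/ and the FASTA inputs

    # Run in a subshell so the caller's working directory is unchanged;
    # after the cd, $PWD is the data directory that gets bound.
    (
        cd "$data_dir" &&
        singularity exec -B "$PWD" "$image" \
            /usr/local/src/PASApipeline/Launch_PASA_pipeline.pl \
            -c sqlite.confs/alignAssembly.config -C -R \
            --ALIGNER gmap -g genome_sample.fasta -t all_transcripts.fasta.clean
    )
}

# Usage:
#   run_pasa_singularity /home/work/pasapipeline.simg /home/work/sample_data
```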