-
Notifications
You must be signed in to change notification settings - Fork 1
03.LASTAL Alignment Pipeline
古郡 まさき edited this page May 30, 2024
·
2 revisions
The pipeline for genome ananlysis
Lastal was used to align. Install:
conda install -c bioconda last
Asuming you have 3 genomes, you should choose a ref FASTA Format genome and 2 query genome:
lastdb -P8 -uMAM8 myDB $ref
##first query
last-train -P40 --revsym -E0.05 -C2 myDB $query1 >$sp1.train
lastal -k32 -p $sp1.train myDB $query1 | last-split -fMAF+ >$sp1.maf
maf-swap $sp1.maf |last-split |maf-swap |maf-sort > $sp1.filter.maf
##second query
last-train -P40 --revsym -E0.05 -C2 myDB $query2 >$sp2.train
lastal -k32 -p $sp2.train myDB $query2 | last-split -fMAF+ >$sp2.maf
maf-swap $sp2.maf |last-split |maf-swap |maf-sort > $sp2.filter.maf
So you can get 2 MAF FORMAT files.
perl maf.rename.species.S.pl $sp1.filter.maf $sp1 $sp1.last.maf >$sp1.rename.maf
perl maf.rename.species.S.pl $sp2.filter.maf $sp2 $sp2.last.maf >$sp2.rename.maf
## multiz合并maf(>3个query according to phylogenetic tree for merging)
multiz M=1 $sp1.rename.maf $sp2.rename.maf 0 U1 U2 > $sp1-$sp2.ma
perl 01.convertMaf2List.pl $sp1-$sp2.maf $sp1-$sp2 $sp_ref $sp1,$sp2
perl 02.lst2gene.pl $sp1-$sp2 $gff
The uncode region filter script filter.pl
Usage
perl filter.pl [fasta] [filter_file]
for i in genes/*.filter; do echo -e "iqtree -s $i" >> iqtree.sh; done
for i in genes/*.treefile; do echo -e "nw_reroot $i imhausi > $i.nwk" >> reroot.sh; done
for i in genes/*.nwk; do echo -e "hyphy label-tree.bf --tree $i --list chr.list --output $i.label" >> label.sh ;done
hyphy absrel --alignment genes/evm.model.Contig11.1.fa.filter \
--tree genes/evm.model.Contig11.1.fa.filter.treefile.tree.nwk \
--branches Foreground \
-output genes/evvm.model.Contig11.1.fa.filter.result
hyphy busted --alignment genes/evm.model.Contig11.1.fa.filter \
--tree genes/evm.model.Contig11.1.fa.filter.treefile.tree.nwk \
--branches Foreground \
-output genes/evvm.model.Contig11.1.fa.filter.result
hyphy relax --alignment genes/evm.model.Contig11.1.fa.filter \
--test Foreground \
--tree genes/evm.model.Contig11.1.fa.filter.treefile.tree.nwk \
--output genes/evm.model.Contig11.1.fa.filter.relax
Psg_count.pl realx_count.pl are used for extract messages from hyphy output file.
hyphy: https://github.com/veg/hyphy