Getting Started

# Index the reference genome (choose one preset: -CCS, -CLR, -ONT, or -CONTIG)
lra index -CCS/CLR/ONT/CONTIG ref.fa
# Map sequences to the reference
lra align -CCS/CLR/ONT/CONTIG ref.fa read.fa -t 16 -p s > output.sam

Introduction

lra is a sequence alignment program that aligns long reads from single-molecule sequencing (SMS) instruments, or megabase-scale contigs from SMS assemblies. lra applies seed chaining by sparse dynamic programming with a concave gap function to read and assembly alignment, and extends it to handle inversions. The lra alignment approach increases sensitivity and specificity for structural variant (SV) discovery, particularly for variants above 1 kb and when discovering variation from ONT reads, while having runtimes that are comparable (1.05-3.76×) to current methods. When applied to calling variation from *de novo* assembly contigs, there is a 3.2% increase in Truvari F1 score compared to minimap2+htsbox.

Users' Guide

Installation

Install lra with bioconda: conda install -c bioconda lra

Install lra from GitHub or from a release: the dependencies are zlib and htslib. You can install both through conda and build lra inside the conda environment.

Activate the environment: conda activate env
Install the dependencies: conda install -c bioconda htslib and conda install -c anaconda zlib
Get the latest released source code from GitHub: wget https://github.com/ChaissonLab/lra/archive/VX.XX.tar.gz && tar -xvf VX.XX.tar.gz && cd lra-X.XX/ && make. Or get the source code directly from the master branch: git clone --recursive https://github.com/ChaissonLab/lra.git -b master && cd lra && make. You are all set!
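A quick way to confirm the installation is to check that the binary is reachable. The sketch below is illustrative only: `require_cmd` is a hypothetical helper, not part of lra, and it assumes the binary is named `lra` as in the commands above.

```shell
# Hypothetical helper: succeed if a command is on PATH, otherwise report it.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing command: $1" >&2
    return 1
}

# Report whether the lra binary is reachable after installation.
if require_cmd lra; then
    echo "lra found at $(command -v lra)"
else
    echo "lra is not on PATH; activate the conda environment or add the build directory to PATH"
fi
```

If the check fails after a source build, the binary is most likely still in the build directory and only needs to be added to PATH.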

Index reference

lra first needs to build two-tiered minimizer indexes (global and local) for the reference before mapping. Both can be built at once with a single command:

lra index -CCS/CLR/ONT/CONTIG ref.fa

lra uses different index parameter settings when aligning reads from different sequencing instruments (CCS/CLR/ONT/CONTIG). You can also customize the parameters; for details, see lra index --help. lra takes a few minutes to index the human reference genome.

Alternatively, the global and local indexes may be built separately:

lra global -CCS/CLR/ONT/CONTIG ref.fa
lra local -CCS/CLR/ONT/CONTIG ref.fa

Align reads/contigs to reference

lra accepts reads in FASTA, FASTQ, or BAM format in the mapping step. The output format can be SAM, PAF, BED, or pairwise alignment; for details, see lra align --help. The number of threads is set with -t. lra uses the same base algorithm for mapping all data types, with different parameter settings for each. It is recommended to choose among CCS/CLR/ONT/CONTIG based on the accuracy and average length of the input reads.

lra align -CCS/CLR/ONT/CONTIG ref.fa read.fa -t 16 -p s > output.sam  
lra align -CCS/CLR/ONT/CONTIG ref.fa read.fa -t 16 -p p > output.paf  
lra align -CCS/CLR/ONT/CONTIG ref.fa read.fa -t 16 -p b > output.bed
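The three commands above differ only in the value passed to -p and in the output file extension. As a minimal sketch, the mapping can be wrapped in a dry-run helper that prints the command instead of running it. The function names `lra_format_flag` and `lra_align_cmd` are hypothetical and not part of lra; only the -p values s, p, and b come from the commands above.

```shell
# Hypothetical helper (not part of lra): map an output format name to the
# -p value used in the commands above (s = SAM, p = PAF, b = BED).
lra_format_flag() {
    case "$1" in
        sam) echo s ;;
        paf) echo p ;;
        bed) echo b ;;
        *)   echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the lra align command for a preset (CCS, CLR, ONT, or
# CONTIG), a reference, a read file, a thread count, and an output format.
lra_align_cmd() {
    preset=$1; ref=$2; reads=$3; threads=$4; fmt=$5
    flag=$(lra_format_flag "$fmt") || return 1
    echo "lra align -$preset $ref $reads -t $threads -p $flag > output.$fmt"
}

# Prints: lra align -ONT ref.fa read.fa -t 16 -p p > output.paf
lra_align_cmd ONT ref.fa read.fa 16 paf
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without lra; the printed line can be reviewed and then pasted into a shell.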

If you have read.fa.gz, you may pipe the decompressed reads to lra:

zcat read.fa.gz | lra align -CCS ref.fa /dev/stdin -t 16 -p s > output.sam
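When a pipeline has to handle both plain and gzip-compressed read files, the decompression step can be made conditional. The sketch below assumes compressed files are named with a .gz suffix; `stream_reads` is a hypothetical helper, not part of lra, and it uses `gzip -dc` in place of zcat for portability.

```shell
# Hypothetical helper: write reads to stdout, decompressing only when the
# file name ends in .gz, so one pipeline handles both cases.
stream_reads() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac
}

# Usage, following the /dev/stdin pattern of the zcat command:
#   stream_reads read.fa.gz | lra align -CCS ref.fa /dev/stdin -t 16 -p s > output.sam
```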

Output format

lra uses a set of customized tags in SAM and PAF output.

Tag	Type	Description
NM	i	Number of mismatches + insertions + deletions in the alignment.
NX	i	Number of mismatches in the alignment.
ND	i	Number of bases of deletions in the alignment.
TD	i	Number of deletions in the alignment.
NI	i	Number of bases of insertions in the alignment.
TI	i	Number of insertions in the alignment.
NV	f	The alignment score.
TP	A	Type of alignment: P (primary), S (secondary), I (inversion).
RT	i	Runtime.
CG	z	CIGAR string.
AO	i	Order of the aligned segment when a read is split.
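The tags in the table can be pulled out of SAM output with standard tools. The following is a minimal sketch, assuming tab-separated SAM records whose optional fields start at column 12 in TAG:TYPE:VALUE form, as the SAM format defines them. `sam_tag` is a hypothetical helper, and the record in the demo is synthetic, not real lra output.

```shell
# Hypothetical helper: for SAM text on stdin, print the value of one optional
# tag (for example NM) for each alignment record that carries it.
sam_tag() {
    awk -v tag="$1" -F '\t' '
        /^@/ { next }    # skip header lines
        {
            for (i = 12; i <= NF; i++) {
                if (substr($i, 1, 3) == (tag ":")) {
                    # drop the "TAG:TYPE:" prefix and keep the value
                    print substr($i, length(tag) + 4)
                    break
                }
            }
        }'
}

# Synthetic record for illustration only; prints 1.
printf 'read1\t0\tchr1\t100\t60\t3=1X\t*\t0\t0\tACGT\t*\tNM:i:1\tNX:i:1\tTP:A:P\n' | sam_tag NM
```

With real output this could be used as, for example, `sam_tag TP < output.sam | sort | uniq -c` to count primary, secondary, and inversion alignments.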
