NanoPrapi

Nanopore direct RNA sequencing (DRS) from Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) single-molecule real-time (SMRT) long-read isoform sequencing (Iso-Seq) have shown great potential for detecting RNA modifications and post-transcriptional regulation. This workflow is a comprehensive computational procedure for quantifying RNA modification at single-base resolution from DRS data. It also provides a procedure for identifying alternative splicing (AS) and alternative polyadenylation (APA) from both DRS and PacBio Iso-Seq data. The whole procedure relies on two packages, Nanom6A and PRAPI, both written in Python and run on Linux.

This pipeline includes nanom6A (https://github.com/gaoyubang/nanom6A) to identify m6A at single-base resolution and PRAPI (http://forestry.fafu.edu.cn/tool/PRAPI/help.php) to detect AS and APA events.

Overview of the workflow:

The workflow is a step-by-step pipeline for identifying m6A, AS and APA from long-read data.

Installation

Running environment:
- The workflow was built on CentOS and Ubuntu.

Required software and versions:
- Anaconda or Miniconda
- Guppy v3.6.1
- ont_fast5_api v0.3.2
- Tombo v1.5.1
- Nanom6A 2021_3_18 version
- PRAPI
- LoRDEC v0.9
- Picard v2.26.8
- GMAP version 2019-12-01
- Python
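
Before running the workflow, it can help to confirm that the command-line tools listed above are reachable. The following is a minimal sketch, not part of the original workflow: the executable names (`guppy_basecaller`, `multi_to_single_fast5`, `tombo`, `lordec-correct`, `gmap`) are assumptions based on how these tools are commonly installed and may differ on your system. Picard and PRAPI are left out because they are usually launched through a jar file or a script path rather than a PATH executable.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = {
    "Guppy": "guppy_basecaller",
    "ont_fast5_api": "multi_to_single_fast5",
    "Tombo": "tombo",
    "LoRDEC": "lordec-correct",
    "GMAP": "gmap",
}


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools whose executable is not found on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed executables were found on PATH.")
```

This only checks that an executable exists; it does not verify the versions listed above.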

Input Data

1. For m6A identification with nanom6A, the input data are a genome file, transcriptome reference sequences and the fast5 files generated on the Oxford Nanopore platform.

2. For identification of AS and APA with PRAPI, the input files are a genome file, an annotation file (genePred format) and long-read data generated by third-generation sequencing technologies and corrected with LoRDEC.

nanom6A

Example FAST5 file in nanom6A: input/nano/Test.fast5
Example genome file in nanom6A: input/nano/genome.fa
Example bed6 file in nanom6A: input/nano/anno.bed

The fast5 file from Nanopore DRS, which includes the raw signal, is stored in HDF5 format and can be viewed with HDFView. This is a typical fast5 file:

Each entry in a FASTA file consists of two parts:

- A header line that starts with ">" followed by the sequence identifier (here, the chromosome name).
- The sequence itself (the bases A, C, G, T and N), which may span one or more lines.

The first entry of the input data:

>chr
cctagcacaTTGAGTTTCATCTCATAACCCCCAGGCCTCTTTCCCCCTCCAACTTCATAGGCTTGATCCACTTATTAG...
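
To illustrate the FASTA layout shown above, here is a minimal reader sketch in Python. It is not part of nanom6A; the function name `read_fasta` is hypothetical, and the sketch assumes a plain-text (uncompressed) FASTA file.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # The identifier is the first whitespace-delimited token after ">".
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)  # sequence may be wrapped over several lines
    if name is not None:
        yield name, "".join(chunks)


# Example with an in-memory record; for a real file, pass an open file handle:
#   with open("input/nano/genome.fa") as handle:
#       for name, seq in read_fasta(handle): ...
records = list(read_fasta([">chr", "cctagcacaTTGAG", "TTTCATC"]))
```

Here `records` holds one pair, the identifier `chr` and the two sequence lines joined into a single string.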

The bed file anno.bed describes each reference transcript of a gene, one per line:

chrom   st  ed  name    .   strand
chr7    10000 13453 ACTB    .   -
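
The six columns above can be checked with a short parser before running the pipeline. This is a sketch under stated assumptions: the `Bed6` type and `parse_bed6` function are hypothetical names, fields are assumed to be whitespace-separated, and a line with non-numeric coordinates (such as the column-name line shown above) is treated as a header and skipped. Whether nanom6A itself accepts a header line is not stated here.

```python
from typing import NamedTuple, Optional


class Bed6(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str
    score: str
    strand: str


def parse_bed6(line: str) -> Optional[Bed6]:
    """Parse one BED6 line; return None for a header line."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
    chrom, start, end, name, score, strand = fields
    if not (start.isdigit() and end.isdigit()):
        return None  # assumption: non-numeric coordinates mark a header line
    if strand not in {"+", "-", "."}:
        raise ValueError(f"invalid strand {strand!r}")
    if int(start) >= int(end):
        raise ValueError("start must be smaller than end")
    return Bed6(chrom, int(start), int(end), name, score, strand)


entry = parse_bed6("chr7    10000 13453 ACTB    .   -")
```

For the example line, `entry.name` is `ACTB` and `entry.strand` is `-`.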

PRAPI

Example long read file in PRAPI: input/prapi/data/pacbio.fa
Example genome file in PRAPI: input/prapi/data/new.fa
Example conf file in PRAPI: input/prapi/conf.txt
Example RNA-seq bam file in PRAPI: input/prapi/data/new_bam/*bam
Example annotation file in PRAPI: input/prapi/data/phe.txt

The configuration file conf.txt can be edited with the Vim text editor on Linux and contains several important parameters:

- Long read: PacBio Iso-Seq or DRS long reads in FASTA format

- Genome_Annotion file phe.txt: reference annotation in genePred format
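
For readers unfamiliar with genePred, the sketch below parses one record of the standard 10-column UCSC layout (name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds). It assumes phe.txt follows that plain layout with tab-separated fields; if the file uses an extended variant or a leading bin column, the column offsets would need adjusting. The function name and the example record are hypothetical.

```python
def parse_genepred(line):
    """Parse one tab-separated genePred record into a dictionary."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 10:
        raise ValueError(f"expected at least 10 columns, got {len(f)}")
    name, chrom, strand = f[0], f[1], f[2]
    tx_start, tx_end, cds_start, cds_end, exon_count = (int(x) for x in f[3:8])
    # exonStarts and exonEnds are comma-separated lists with a trailing comma.
    starts = [int(x) for x in f[8].split(",") if x]
    ends = [int(x) for x in f[9].split(",") if x]
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError("exonCount does not match the exon coordinate lists")
    return {
        "name": name,
        "chrom": chrom,
        "strand": strand,
        "tx": (tx_start, tx_end),
        "cds": (cds_start, cds_end),
        "exons": list(zip(starts, ends)),
    }


# Hypothetical two-exon transcript, used only to show the layout.
record = parse_genepred("tx1\tChr01\t+\t100\t900\t150\t850\t2\t100,500,\t300,900,")
```

In this example, `record["exons"]` contains the two exon intervals (100, 300) and (500, 900).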

Major steps

Install the dependencies

sh workflow/1_install_environment.sh
conda activate prapi_env
pip install -i https://pypi.anaconda.org/gaoyubang/simple splicegrapher

Usage for Nanom6A

Step 1: Download nanom6A package

The nanom6A_2021_3_18.tar.gz package can be downloaded from the following link: https://drive.google.com/drive/folders/1Dodt6uJC7lBihSNgT3Mexzpl_uqBagu0?usp=sharing

Make sure the package and the script are in the same directory.

Step 2: Identification of modified nucleotide using nanom6A

sh workflow/2_run_nanom6A.sh

Usage for PRAPI

Identification of AS and APA

sh workflow/3_run_prapi.sh

Expected results

The nanom6A result directory includes ratio.x.tsv, which contains the gene name, chromosome, coordinate of each m6A site, number of m6A-modified reads, total number of reads and the m6A ratio of the site. The file genome_abandance.x.bed contains the chromosome name and coordinate, gene name, the ID and position of each single FAST5 read, and the motif (k-mer). The x in ratio.x.tsv and genome_abandance.x.bed is the probability cutoff for calling a modification; the default is 0.5. The results in ratio.x.tsv and genome_abandance.x.bed can be plotted with this command:

nanoplot --input /prediction step output/  -o plot

For example:

nanoplot --input result_final  -o plot_nano_plot

The output figure shows the transcript structures, with m6A sites highlighted by purple vertical lines.
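
The ratio reported in ratio.x.tsv relates the number of m6A-modified reads to the total number of reads at a site. The sketch below shows that arithmetic for a single site. It is an illustration, not nanom6A's implementation: it assumes each read has a predicted modification probability and that a read counts as modified when its probability is strictly greater than the cutoff x. Whether nanom6A uses a strict or inclusive comparison is an assumption here.

```python
def m6a_ratio(read_probabilities, threshold=0.5):
    """Return (modified_reads, total_reads, ratio) for one site."""
    total = len(read_probabilities)
    if total == 0:
        raise ValueError("no reads cover this site")
    # Assumption: a read is called modified when its probability exceeds the cutoff.
    modified = sum(1 for p in read_probabilities if p > threshold)
    return modified, total, modified / total


# Four hypothetical reads covering one site, with the default cutoff of 0.5.
result = m6a_ratio([0.9, 0.2, 0.7, 0.4])
```

With these four hypothetical probabilities, two reads exceed 0.5, so the site has 2 modified reads out of 4 and a ratio of 0.5. Raising the cutoff to 0.8 leaves one modified read and a ratio of 0.25.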

In PRAPI, the visualization of AS and APA is split into two categories in the output directory: Annotation_Gene and Novel_Gene, which contain long reads located in annotated and unannotated regions, respectively. For example, the graph for Potri.002G178700 shows AS and APA events:
