
Using TOGA projections as evidence class in evidencemodeler #28

Closed
gaurav-agavekar opened this issue Feb 28, 2022 · 7 comments
Labels
documentation Improvements or additions to documentation

Comments

@gaurav-agavekar

Hello,

Thank you for developing and maintaining TOGA.

I created TOGA projections for one of my newly sequenced assemblies, which I wish to use in EVM for genome annotation. My other evidence types include RNA-seq and Iso-seq derived transcripts. I saw in some of your papers that you provided TOGA projections as evidence to EVM, so I was wondering if you would be willing to share a script to convert TOGA query_annotation output into a GFF3 file that EVM accepts? I tried a couple of tools for conversion, but it seems it might take more than just converting from BED to GFF3, so I thought I might ask if you already have a procedure developed for it.

Thanks in advance,
Gaurav

@MichaelHiller
Collaborator

Hi Gaurav,

Good point. Here is what we do:
bedToGenePred $pathToTOGAresults/final.bed stdout | genePredToGtf file stdin stdout | gt gtf_to_gff3 -tidy | gt gff3 -tidy -sort -setsource $ref | evmFormatIsoSeq.pl > Evidences/$db.evm.toga.$ref.gff3

I have uploaded the Perl script to TOGA/supply/evmFormatIsoSeq.pl.
genePredToGtf and bedToGenePred are part of the UCSC kent source code.
Best
Michael
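
For anyone who needs to see where a conversion goes wrong, the one-liner above can be unrolled into separate steps that keep their intermediate files. The following is only a sketch: it uses the same commands and flags as the reply, but the function name `toga_bed_to_evm_gff3`, its argument order, and the intermediate file names are illustrative assumptions, and it assumes all four executables are on `PATH`.

```shell
#!/bin/sh
# Hypothetical wrapper (name and arguments are assumptions, not part of TOGA).
# Usage: toga_bed_to_evm_gff3 <final.bed> <source_label> <workdir>
# Stops at the first failing step; intermediate files stay in <workdir>.
toga_bed_to_evm_gff3() {
    bed=$1; label=$2; workdir=$3
    mkdir -p "$workdir" &&
    # BED12 -> genePred -> GTF (UCSC kent tools)
    bedToGenePred "$bed" "$workdir/toga.gp" &&
    genePredToGtf file "$workdir/toga.gp" "$workdir/toga.gtf" &&
    # GTF -> GFF3, then tidy, sort, and set the source label (GenomeTools)
    gt gtf_to_gff3 -tidy < "$workdir/toga.gtf" > "$workdir/toga.raw.gff3" &&
    gt gff3 -tidy -sort -setsource "$label" \
        < "$workdir/toga.raw.gff3" > "$workdir/toga.sorted.gff3" &&
    # Final reformatting with the script from TOGA/supply
    evmFormatIsoSeq.pl < "$workdir/toga.sorted.gff3" > "$workdir/toga.evm.gff3"
}
```

A call such as `toga_bed_to_evm_gff3 final.bed toga work/` would leave `work/toga.evm.gff3` next to the intermediate files, so each stage can be inspected on its own.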

@gaurav-agavekar
Author

Thanks so much, Michael; super appreciate it.

Best,
Gaurav

@DiegoSafian

DiegoSafian commented Apr 5, 2023

Hi,
I am trying to run:
bedToGenePred $pathToTOGAresults/final.bed stdout | genePredToGtf file stdin stdout | gt gtf_to_gff3 -tidy | gt gff3 -tidy -sort -setsource $ref | evmFormatIsoSeq.pl > Evidences/$db.evm.toga.$ref.gff3

However, the command gt is not recognized, and it is not clear to me what $ref means. Could you please clarify these points?

@MichaelHiller
Collaborator

gt --help
Usage: gt [option ...] [tool | script] [argument ...]
The GenomeTools genome analysis system.

-i enter interactive mode after executing 'tool' or 'script'
-q suppress warnings
-test perform unit tests and exit
-seed set seed for random number generator manually.
0 generates a seed from current time and process id
-help display help and exit
-version display version information and exit

Tools:

bed_to_gff3
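
As the help text shows, `gt` is the executable of GenomeTools, so "command not recognized" most likely means it is not installed or not on `PATH`. A quick way to check every executable the pipeline needs is sketched below; the helper name `require_tools` is made up for illustration, and the tool names are simply those that appear in the command in this thread.

```shell
#!/bin/sh
# Hypothetical helper: report which executables are missing from PATH.
# Returns 1 if at least one is missing, 0 otherwise.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Tool names as they appear in the pipeline from this thread.
require_tools bedToGenePred genePredToGtf gt evmFormatIsoSeq.pl ||
    echo "install the missing tools or add them to PATH" >&2
```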

@osipovarev
Member

@DiegoSafian if I could add to what Michael commented:
$ref is the reference genome you used to run TOGA. In this command, though, it can be any label that helps EvidenceModeler distinguish between sources of evidence. You could simply use the label 'toga' instead.
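
To confirm which label ended up in the converted file, one can tally the source column (column 2 in GFF3). This is a minimal sketch, assuming the value passed to `-setsource` is still present in column 2 after `evmFormatIsoSeq.pl` has run, which this thread does not state. The function name `gff3_sources` and the two sample records are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct GFF3 source label with its count.
gff3_sources() {
    awk -F '\t' '!/^#/ && NF >= 9 { n[$2]++ }
                 END { for (s in n) print s "\t" n[s] }' "$1" | sort
}

# Tiny made-up example with two features labelled "toga".
sample=$(mktemp)
printf 'chr1\ttoga\tgene\t100\t900\t.\t+\t.\tID=g1\n' > "$sample"
printf 'chr1\ttoga\tmRNA\t100\t900\t.\t+\t.\tID=t1;Parent=g1\n' >> "$sample"
gff3_sources "$sample"
rm -f "$sample"
```

As far as I understand the EVM input format, the label printed here is what the weights file has to reference, so it should be spelled identically in both places.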

@kirilenkobm added the documentation label on Apr 30, 2023
@Z-rose7

Z-rose7 commented Mar 24, 2025

> Hi Gaurav,
>
> good point. Here is what we do
> bedToGenePred $pathToTOGAresults/final.bed stdout | genePredToGtf file stdin stdout | gt gtf_to_gff3 -tidy | gt gff3 -tidy -sort -setsource $ref | evmFormatIsoSeq.pl > Evidences/$db.evm.toga.$ref.gff3
>
> I have uploaded the perl script to TOGA/supply/evmFormatIsoSeq.pl
> genePredToGtf and bedToGenePred are part of the UCSC kent src code.
> Best
> Michael

Hi, thank you for providing this command. However, when I ran it, the output file showed:
skipping line 1 in file "stdin": unknown feature: "transcript"
skipping line 14 in file "stdin": unknown feature: "transcript"
skipping line 31 in file "stdin": unknown feature: "transcript"
I want to know what causes this and how to fix it. Thanks for your help!

@MichaelHiller
Collaborator

Not sure, but I guess the gff3 format is not what the script expects.
I don't have much experience with gff3, which is a bad, less standardized format.
My suggestion is to take a single gff3 gene and debug this.
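
One way to follow that suggestion is to pull a single gene out of the intermediate GTF and look at which feature types it contains. This is a rough sketch, assuming "transcript" in the messages refers to the feature type in column 3 of the GTF written by genePredToGtf. The helper names `gtf_one_gene` and `gtf_feature_counts` and the sample records are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the GTF lines whose attributes name the given gene_id.
# Usage: gtf_one_gene <file.gtf> <gene_id>
gtf_one_gene() {
    awk -F '\t' -v id="$2" 'index($9, "gene_id \"" id "\";")' "$1"
}

# Hypothetical helper: tally feature types (column 3) of GTF lines read from stdin.
gtf_feature_counts() {
    awk -F '\t' '!/^#/ { n[$3]++ }
                 END { for (f in n) print f "\t" n[f] }' | sort
}

# Made-up GTF with two genes; inspect only gene "g1".
sample=$(mktemp)
printf 'chr1\tx\ttranscript\t100\t900\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' > "$sample"
printf 'chr1\tx\texon\t100\t300\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$sample"
printf 'chr1\tx\tCDS\t150\t300\t.\t+\t0\tgene_id "g1"; transcript_id "t1";\n' >> "$sample"
printf 'chr2\tx\texon\t10\t50\t.\t-\t.\tgene_id "g2"; transcript_id "t2";\n' >> "$sample"
gtf_one_gene "$sample" g1 | gtf_feature_counts
rm -f "$sample"
```

Saving the output of `gtf_one_gene` to a file and feeding only that file through the remaining pipeline steps gives a small test case. Since the messages say the line is being skipped, it is also worth checking whether the final GFF3 for that one gene is complete despite them; I have not verified this.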


6 participants