IMSA+A is an extension of the Arron-IMSA program for metagenomic taxonomic classification using RNA-seq reads.
IMSA+A should be installed into its own directory, independent of IMSA. See Detailed Directions.
- Through our own work we have found that HG38 (the human genome) shares many identical DNA sequences with many microbial genomes. Contigs that match these sequences cannot be assigned to either human or microorganism, because the sequence occurs in both genomes. If you have run into this, please contact me; your experience may help lead to a resolution.
- Expanded tips and tricks in directions
- Include example results to validate pipeline
- We found a bug in IMSA paired-end processing. We are working on a correction and will release it soon.
- Fix released.
- Starting with the December 2016 update, IMSA code is distributed with IMSA+A.
- Very minor compatibility updates are included; these changes are documented with comments.
- A new script, "postprocesscount4acc.py", allows the use of accession number data. New options in systemSettings.py, all starting with "ACC_", point to a new accession.version-to-taxon database.
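
To illustrate the idea of counting by accession.version instead of GI number, here is a minimal sketch. It is not the code in postprocesscount4acc.py. The two-column, tab-delimited table layout and the function names `load_acc_to_taxon` and `count_hits_by_taxon` are assumptions made for this example.

```python
import csv
import io
from collections import Counter


def load_acc_to_taxon(handle):
    """Read an accession.version -> taxon ID table.

    Assumed layout (hypothetical): tab-delimited, two columns, no header,
    lines starting with '#' are comments.
    """
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue
        mapping[row[0]] = int(row[1])
    return mapping


def count_hits_by_taxon(accessions, acc_to_taxon):
    """Count hits per taxon ID; also return how many hits had no mapping."""
    counts = Counter()
    unmapped = 0
    for acc in accessions:
        taxon = acc_to_taxon.get(acc)
        if taxon is None:
            unmapped += 1
        else:
            counts[taxon] += 1
    return counts, unmapped


# Illustrative table and hit list (not real pipeline output).
table = io.StringIO("NC_000001.11\t9606\nNC_000913.3\t511145\n")
acc_to_taxon = load_acc_to_taxon(table)
hits = ["NC_000913.3", "NC_000001.11", "NC_000913.3", "XX_000000.1"]
counts, unmapped = count_hits_by_taxon(hits, acc_to_taxon)
```

Tracking unmapped hits separately makes it easier to notice when the mapping table is older than the sequence database used for alignment.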
https://www.ncbi.nlm.nih.gov/news/03-02-2016-phase-out-of-GI-numbers/ GI numbers are being phased out (09/2016). IMSA and IMSA+A counting scripts count using GI numbers. Files downloaded together from here will work with each other, but NCBI databases updated after 09/2016 will not. The handoff to accession.version should be relatively painless and require only very minor code changes. However, IMSA+A needs NCBI to release an accession.version database dump in the same way that it releases a GI dump. Until then, IMSA+A is not future compatible.
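
As a rough sketch of why the code change can stay small, the function below pulls both a GI number and an accession.version out of a sequence identifier, so a counting step could key on whichever is present. The identifier layouts handled here (`gi|<number>|<db>|<accession.version>|`, `<db>|<accession.version>|`, and a bare accession.version) and the name `parse_seq_id` are assumptions for illustration, not taken from the IMSA source. The GI value in the usage lines is made up.

```python
def parse_seq_id(seq_id):
    """Return (gi, accession_version); either value may be None.

    Assumed identifier layouts (hypothetical, for illustration):
      gi|<number>|<db>|<accession.version>|
      <db>|<accession.version>|
      <accession.version>
    """
    fields = seq_id.strip().strip("|").split("|")
    gi = None
    acc = None
    if len(fields) >= 2 and fields[0] == "gi":
        gi = int(fields[1])
        if len(fields) >= 4:
            acc = fields[3]
    elif len(fields) >= 2:
        acc = fields[1]
    else:
        acc = fields[0]
    return gi, acc or None


# Usage with a made-up GI number.
old_style = parse_seq_id("gi|12345|ref|NC_000001.11|")
new_style = parse_seq_id("NC_000001.11")
```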