Notice

MOCAT is no longer actively developed or supported.
Please consider switching to newer tools such as NGLess and the NGLess meta profilers, used in conjunction with the mOTUs profiler, the GMGC gene catalog, and the eggNOG database and mapper.

MOCAT 2

MOCAT2 (metagenomic analysis toolkit) is a package for analyzing metagenomics datasets. Currently MOCAT2 supports Illumina single- and paired-end reads in raw FastQ format. Using MOCAT2 you can generate taxonomic and functional profiles, as well as assemble reads and predict genes in assembled sequences. The official MOCAT2 page is http://mocat.embl.de/

Development of MOCAT2 has stopped. MOCAT2 is described in a paper published in Bioinformatics (http://bit.ly/1VbJnzi). This GitHub repository contains the latest MOCAT2 version.

HOW TO INSTALL

1. Clone the repository.
2. Use the files in stable/2.1.3 and run the setup.MOCAT2.pl script.
3. To use the functional annotation step, download and extract the required data file into the MOCAT folder: http://vm-lux.embl.de/~kultima/share/MOCAT/v2.0/MOCAT2_data_files.zip (4 GB)

OUTPUT FILES

Functional profiles

These are gene coverages summarized at a higher level, such as KEGG KOs, modules or pathways, or eggNOG OGs. Genes are assigned to these categories based on a mapping file in the MOCAT/data folder (.functional.map). This means that, although they are called functional profiles, they can also be summarized at other user-defined levels, such as species, genera or phyla, or even at combined functional and taxonomic levels such as KO.species or KO.genus.
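The summarization described above can be illustrated with a minimal Python sketch. It is not MOCAT2 code: the function name, the in-memory dictionaries and the "unannotated" bucket are hypothetical, the real .functional.map format is not reproduced here, and summing coverages per category is an assumption made for illustration.

```python
from collections import defaultdict


def summarize_by_category(gene_coverage, gene_to_categories):
    """Sum gene coverages into categories; a gene may belong to several."""
    totals = defaultdict(float)
    for gene, coverage in gene_coverage.items():
        # Hypothetical choice: genes without an annotation go to "unannotated".
        for category in gene_to_categories.get(gene, ["unannotated"]):
            totals[category] += coverage
    return dict(totals)


# Toy input: three genes, one of them annotated to two KOs.
gene_coverage = {"geneA": 4.0, "geneB": 1.5, "geneC": 2.0}
gene_to_ko = {"geneA": ["K00001"], "geneB": ["K00001", "K00002"]}

ko_profile = summarize_by_category(gene_coverage, gene_to_ko)
print(ko_profile)
```

Swapping the mapping for one keyed on species or on KO.species pairs would give the other user-defined levels mentioned above without changing the function.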

Taxonomic profiles

Taxonomic profiles come in two flavors: mOTU and NCBI. Each requires a specific set of mapping files and a specific database structure. The current version of MOCAT2 ships with a database for each of the two flavors: mOTU.v1 and RefMG.v1, respectively.

mOTU profiles

These are generated by first mapping reads to the mOTU.v1 database and filtering them, and then selecting -mode mOTU in the profiling step. The abundances of 10 marker genes are summarized into (annotated) mOTU linkage groups (mOTU-LGs).

NCBI profiles

After reads have been mapped to and filtered against the RefMG.v1 database, the profiling step with the option -mode NCBI summarizes the gene abundances into coverages at NCBI taxonomic levels, from phylum to species, including specI (Mende et al., 2013) coverages.

Different output formats

Both insert and base coverages are calculated in MOCAT2. An insert is defined as either a single read or a matching read pair. Each of these two coverage types is calculated as raw counts (raw), gene length normalized coverages (norm), and scaled gene length normalized coverages (scaled).

Scaled files are gene length normalized abundances multiplied by a scaling factor. These files should be used when the -1 fraction (i.e. inserts or bases that do not map to the database) is important. All values have been re-scaled (keeping their respective fractions of the total constant) so that the value of the -1 fraction is the same as in the raw files. This makes it possible to use gene length normalized counts together with the -1 fraction.

Finally, as a third layer, bases and inserts from reads that map to more than one gene with the same alignment score (i.e. multiple mappers) are distributed either evenly or according to the abundance of uniquely mapping bases and inserts (mm.dist.among.unique). MOCAT2 also saves the abundances of genes based only on reads mapping to one unique location (only.unique). The combinations of these options result in the files listed below.

Which of these files should you use?

Below we list some recommended uses of the different files, but in general we recommend using the mm.dist.among.unique files.
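The two ways of distributing multiple mappers described above can be sketched as follows. This is an illustration only, not the MOCAT2 implementation: the function name, the mode labels and the fallback to an even split when none of the candidate genes has uniquely mapped abundance are all assumptions.

```python
def distribute_multimapper(candidate_genes, unique_counts, mode="even"):
    """Return the share of one multi-mapping insert assigned to each gene."""
    if not candidate_genes:
        raise ValueError("candidate_genes must not be empty")
    even_share = {gene: 1.0 / len(candidate_genes) for gene in candidate_genes}
    if mode == "even":
        return even_share
    if mode != "unique":
        raise ValueError("mode must be 'even' or 'unique'")
    # Weight each candidate gene by its uniquely mapped abundance.
    weights = {gene: unique_counts.get(gene, 0.0) for gene in candidate_genes}
    total = sum(weights.values())
    if total == 0:
        # Assumption: with no unique evidence, fall back to an even split.
        return even_share
    return {gene: weight / total for gene, weight in weights.items()}


# One insert maps equally well to gene1 and gene2.
unique_counts = {"gene1": 30, "gene2": 10}
even = distribute_multimapper(["gene1", "gene2"], unique_counts, mode="even")
by_unique = distribute_multimapper(["gene1", "gene2"], unique_counts, mode="unique")
print(even, by_unique)
```

In this toy case the even split gives each gene half of the insert, while the split according to unique abundance gives gene1 three quarters and gene2 one quarter.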

In the scaled files, gene length normalized insert/base abundances are multiplied by the abundance-weighted average gene length. This enables the use of the -1 fraction (as it is constant). If the -1 fraction is ignored and relative abundances are used, the results will be the same for the norm and scaled files.
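A small worked example may make the relationship between raw, norm and scaled values concrete. The sketch below is not MOCAT2 code, and it assumes that the abundance-weighted average gene length uses the norm abundances as weights; under that assumption the scaled values sum to the same total as the raw counts, which is why the -1 fraction can be carried over unchanged.

```python
def normalize_and_scale(raw_counts, gene_lengths):
    """Return (norm, scaled) values for the mapped genes."""
    # norm: raw count divided by gene length.
    norm = {g: raw_counts[g] / gene_lengths[g] for g in raw_counts}
    total_norm = sum(norm.values())
    if total_norm == 0:
        return norm, dict(norm)
    # Assumption: average gene length weighted by the norm abundances.
    avg_length = sum(norm[g] * gene_lengths[g] for g in norm) / total_norm
    scaled = {g: norm[g] * avg_length for g in norm}
    return norm, scaled


def relative(values):
    """Convert values to relative abundances that sum to 1."""
    total = sum(values.values())
    return {g: v / total for g, v in values.items()}


raw = {"gene1": 300, "gene2": 100}
lengths = {"gene1": 1000, "gene2": 500}
unmapped = 50  # the -1 fraction, taken from the raw file

norm, scaled = normalize_and_scale(raw, lengths)

# The unmapped fraction is the same whether raw or scaled totals are used.
frac_raw = unmapped / (unmapped + sum(raw.values()))
frac_scaled = unmapped / (unmapped + sum(scaled.values()))
print(norm, scaled, frac_raw, frac_scaled)
```

Here `relative(norm)` and `relative(scaled)` agree (0.6 and 0.4), matching the statement that the two file types are equivalent once the -1 fraction is ignored.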

Multiple mappers distributed evenly

- <base|insert>.raw: insert.raw can be used as input to e.g. DESeq2
- <base|insert>.norm: base.norm represents the most commonly used gene length normalized base counts. This was used in (Zeller et al., 2014)
- <base|insert>.scaled: if relative abundances are used (total row sum is 1), these files yield the same results as the .norm files

Multiple mappers distributed according to unique

- <base|insert>.mm.dist.among.unique.raw: these could also be used in DESeq2
- <base|insert>.mm.dist.among.unique.norm: for normal use, we recommend these files
- <base|insert>.mm.dist.among.unique.scaled: these are the values on which taxonomic profiles are calculated; use these files if you need the -1 fraction

Coverages based only on uniquely mapping reads

These files are used in cases where it is important to discard reads mapping to multiple genes.

- <base|insert>.only.unique.raw
- <base|insert>.only.unique.norm
- <base|insert>.only.unique.scaled
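The three layers listed above (coverage type, handling of multiple mappers, normalization) combine into 18 labels. The following sketch simply enumerates them; it produces the label parts shown in this section, not complete MOCAT2 file names, and the variable and function names are hypothetical.

```python
from itertools import product

COVERAGE_TYPES = ["base", "insert"]
# An empty string stands for multiple mappers distributed evenly.
MULTI_MAPPER_MODES = ["", "mm.dist.among.unique", "only.unique"]
NORMALIZATIONS = ["raw", "norm", "scaled"]


def profile_labels():
    """Enumerate every combination of the three layers as a dotted label."""
    labels = []
    for coverage, mode, normalization in product(
        COVERAGE_TYPES, MULTI_MAPPER_MODES, NORMALIZATIONS
    ):
        # Skip the empty mode so evenly distributed labels have two parts.
        parts = [part for part in (coverage, mode, normalization) if part]
        labels.append(".".join(parts))
    return labels


labels = profile_labels()
for label in labels:
    print(label)
```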

Retaining functional abundance tables

To calculate relative abundances for functional profiles, we recommend dividing each feature value by the total number of mapped bases/inserts. Note that these values do not necessarily sum to 1, as each gene can be annotated to multiple functional features.
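The recommendation above can be illustrated with a short sketch. The function name and the toy numbers are hypothetical; the example only shows why the resulting values can sum to more than 1 when a gene is annotated to several features.

```python
def relative_functional_abundance(feature_values, total_mapped):
    """Divide each feature value by the total number of mapped inserts/bases."""
    if total_mapped <= 0:
        raise ValueError("total_mapped must be positive")
    return {feature: value / total_mapped for feature, value in feature_values.items()}


# Toy data: 100 mapped inserts in total.
# geneA (60 inserts) is annotated to K00001 and K00002,
# geneB (40 inserts) is annotated to K00002 only.
feature_values = {"K00001": 60, "K00002": 100}
total_mapped = 100

rel = relative_functional_abundance(feature_values, total_mapped)
print(rel, sum(rel.values()))
```

Because the 60 inserts of geneA are counted under both KOs, the relative abundances here add up to 1.6 rather than 1.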

