Output of GMGC-mapper

Explanation of the files in the output directory

Prodigal output

These three files are the output of Prodigal. They are only produced if GMGC-mapper was called in genome mode.

Hit Table (hit_table.tsv)

The results of the queries to the GMGC.

There are five columns in the file.

Alignment category

Genome bins (genome_bin.tsv)

Genome bins (MAGs) found in the results (and a count of how many genes are contained in them).

There are two columns in the file.

Note that while not all GMGC unigenes are contained in a genome bin, some are contained in many. Thus, the total counts will not (except by coincidence) correspond to the number of genes queried.
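The following toy sketch illustrates why the per-bin counts need not add up to the number of queried genes. The unigene names, bin names, and the mapping itself are made up for illustration; they are not taken from GMGC-mapper output, and this is not how GMGC-mapper computes the table internally.

```python
from collections import Counter

# Hypothetical mapping from queried unigenes to the genome bins containing them.
unigene_to_bins = {
    "unigene_A": ["bin_1", "bin_2", "bin_3"],  # contained in many bins
    "unigene_B": ["bin_1"],                    # contained in exactly one bin
    "unigene_C": [],                           # not contained in any bin
}


def count_genes_per_bin(mapping):
    """Count, for each bin, how many of the queried unigenes it contains."""
    counts = Counter()
    for bins in mapping.values():
        counts.update(set(bins))  # each unigene counts once per bin
    return counts


bin_counts = count_genes_per_bin(unigene_to_bins)
total_in_bins = sum(bin_counts.values())  # sum over all bins
n_queried = len(unigene_to_bins)          # number of genes queried

print(dict(bin_counts), total_in_bins, n_queried)
```

Here three genes were queried, but the counts over all bins sum to four: one gene is counted three times and another is not counted at all.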

Summary (summary.txt and runlog.yaml)

The file summary.txt provides a human-readable summary of the results, while runlog.yaml records run metadata (as a YAML file, it is both machine- and human-readable).

The file summary.txt should be reproducible: running GMGC-mapper twice on the same input should produce the same results. By design, though, runlog.yaml includes information, such as the time when the analysis was run, that is not reproducible.
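One way to check this reproducibility is to compare summary.txt from two output directories byte for byte. The sketch below assumes each run wrote its results to its own directory; the helper name summaries_match and the directory names in the usage note are hypothetical, not part of GMGC-mapper.

```python
import filecmp
from pathlib import Path


def summaries_match(run_dir_a, run_dir_b):
    """Return True if summary.txt is byte-identical in both output directories."""
    summary_a = Path(run_dir_a) / "summary.txt"
    summary_b = Path(run_dir_b) / "summary.txt"
    # shallow=False forces a comparison of file contents, not just os.stat() data.
    return filecmp.cmp(summary_a, summary_b, shallow=False)
```

For example, summaries_match("run1_output", "run2_output") would be expected to return True for two runs on the same input. The same check applied to runlog.yaml would normally fail, since that file records run-specific details such as the time of the analysis.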