Outlier Check and Removal

Project ID: 271

Forked repository used to filter the results generated with a tool that finds outlier sequences in multiple sequence alignments at the amino acid level.

To be modified: several of the scripts are tailored to sequence headers labeled with "AD" followed by two or three digits (0 to 9). Likewise, they search for orthologous groups from OrthoDB v10.1 whose identifiers include "at6447". Additionally, to remove the outliers, some scripts parse FASTA files with the suffix .aa.mafft.fas.
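When adapting the scripts to another dataset, it may help to see the three hard-coded conventions written as explicit patterns. The following is a minimal sketch, not the code used in this repository: the function names are hypothetical, the header label is assumed to appear as a standalone token in the FASTA header, and the orthologous group identifier is assumed to have the form "<digits>at6447".

```python
import re
from typing import Optional

# Assumption: the label is "AD" plus two or three digits, not embedded in a longer token.
HEADER_LABEL_RE = re.compile(r"(?<![A-Za-z0-9])AD[0-9]{2,3}(?![A-Za-z0-9])")

# Assumption: OrthoDB v10.1 group identifiers look like "<digits>at6447".
ORTHOGROUP_RE = re.compile(r"[0-9]+at6447(?![0-9])")

# Suffix of the aligned amino acid FASTA files parsed by some scripts.
ALIGNMENT_SUFFIX = ".aa.mafft.fas"


def find_label(header: str) -> Optional[str]:
    """Return the AD label found in a FASTA header, or None if absent."""
    match = HEADER_LABEL_RE.search(header.lstrip(">"))
    return match.group(0) if match else None


def find_orthogroup(name: str) -> Optional[str]:
    """Return the orthologous group identifier found in a name, or None."""
    match = ORTHOGROUP_RE.search(name)
    return match.group(0) if match else None


def is_alignment_file(filename: str) -> bool:
    """Check whether a file name carries the expected alignment suffix."""
    return filename.endswith(ALIGNMENT_SUFFIX)
```

To target other data, only the two patterns and the suffix constant would need to change, for example by replacing "AD" with another prefix or "at6447" with the identifier of a different OrthoDB taxonomic level.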