Explore projects
This repository includes a compressed archive with supplementary files generated for the Thesis written in partial fulfillment of the requirements for the Master of Science (OEP-Biology).
BioCASe / VCAT-Transfer
GNU General Public License v2.0 or later
This is a fork with the tools needed to generate reference FASTA files at the protein and gene levels, plus a modified orthology table, all of which are compatible with and required for annotating orthologs with Orthograph v0.6.3. It was created to fix incompatibilities between the information stored in the OrthoDB v10.1 catalog and the protein IDs in the sequence headers of the NCBI RefSeq files.
ZFMK / ZFMKDigitizationStrategy
Creative Commons Attribution Share Alike 4.0 International
Filters a multiple sequence alignment, excluding positions where a single taxon group is the only one represented.
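The filtering idea can be sketched roughly as follows. This is not the repository's code: the function name `filter_alignment`, the input dictionaries, and the choice of `-` and `?` as "missing" characters are all assumptions made for illustration. A column is kept only when at least two taxon groups have a residue in it.

```python
def filter_alignment(seqs, groups, missing="-?"):
    """Drop alignment columns covered by fewer than two taxon groups.

    seqs    : dict mapping sequence name -> aligned sequence (equal lengths)
    groups  : dict mapping sequence name -> taxon group label (hypothetical input)
    missing : characters treated as "no data" (an assumption, adjust as needed)
    """
    names = list(seqs)
    length = len(seqs[names[0]]) if names else 0
    if any(len(seqs[n]) != length for n in names):
        raise ValueError("all sequences must have the same aligned length")

    kept_columns = []
    for i in range(length):
        # Taxon groups that have an actual residue at this position.
        present = {groups[n] for n in names if seqs[n][i] not in missing}
        # Columns with zero or one represented group are excluded.
        if len(present) >= 2:
            kept_columns.append(i)

    return {n: "".join(seqs[n][i] for i in kept_columns) for n in names}


if __name__ == "__main__":
    alignment = {"a1": "MK-L", "a2": "MKAL", "b1": "M--L"}
    taxon_group = {"a1": "A", "a2": "A", "b1": "B"}
    print(filter_alignment(alignment, taxon_group))
```

In this toy alignment, the second and third columns contain residues only from group A, so they are dropped.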
A template to structure data, code and documentation for a project on the HPC-Cluster
This is a Snakemake workflow that calculates maximum-likelihood gene trees with IQ-TREE and 100 bootstrap replicates for each tree. Then, consensus trees are produced and combined to infer a species tree following the multispecies coalescent model with ASTRAL.
Endpoint for getting ASV taxonomy tables as BIOM JSON files. http://gensoft.pasteur.fr/docs/biom-format/2.1.5/index.html
BioCASe / biocase_media
GNU General Public License v2.0 or later
ZFMK's virtual collection catalogue transfer and dissemination package
Forked repo used to filter the results generated with checker_complete.2.pl, which finds outlier sequences in multiple sequence alignments on the amino acid level.
To be modified: several of the scripts are tailored to sequence headers labeled with AD followed by two or three digits (0 to 9). Likewise, they search for orthologous groups from OrthoDB v10.1 whose identifiers always include "at6447". Additionally, to remove the outliers, some scripts parse FASTA files with the suffix .aa.mafft.fas.
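As a rough illustration of what would need to be generalized, the three hard-coded conventions could be expressed as parameters. The names below (`SAMPLE_LABEL`, `find_sample_label`, `is_target_group`, `is_alignment_file`) are hypothetical and do not come from the repository, and the pattern assumes the label means "AD" plus two or three digits.

```python
import re

# Assumed reading of the header convention: "AD" followed by 2 or 3 digits,
# not embedded in a longer alphanumeric token.
SAMPLE_LABEL = re.compile(r"(?<![A-Za-z0-9])AD[0-9]{2,3}(?![0-9])")


def find_sample_label(header, pattern=SAMPLE_LABEL):
    """Return the sample label found in a FASTA header, or None."""
    match = pattern.search(header)
    return match.group(0) if match else None


def is_target_group(group_id, tag="at6447"):
    """True if an orthologous group identifier contains the expected tag."""
    return tag in group_id


def is_alignment_file(filename, suffix=".aa.mafft.fas"):
    """True if the file name carries the expected alignment suffix."""
    return filename.endswith(suffix)


if __name__ == "__main__":
    print(find_sample_label(">AD045_contig1"))
    print(is_target_group("12345at6447"))
    print(is_alignment_file("12345at6447.aa.mafft.fas"))
```

Passing a different pattern, tag, or suffix would adapt the same checks to other datasets.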
This forked repository includes a compressed archive with the supplementary files, as well as the thesis written in partial fulfillment of the requirements for the Master of Science (OEP-Biology) at the University of Bonn, Germany. Originally published under my maiden name, Júlia M. Q. Calvet.
Thesis title: Evaluation of de novo transcriptome assemblers and their performance when reconstructing single-copy orthologous genes: the effects of complete sets of data when establishing relationships between dorid nudibranchs
This is a fork that only includes the script to choose the optimal annotated transcriptome after gene orthology inference with Orthograph.
ortho-overlap.py is being updated to work with the results from BUSCO as well.