FELIPE VAZ PERES

I am a data scientist with an M.Sc. in bioinformatics and over seven years of experience in computational biology and large-scale data analysis.

I am particularly interested in computational approaches to cancer research, omics, and non-coding RNA, and in developing efficient, reproducible workflows.

RESEARCH

Sugarcane pan-omics research

Transcriptomic diversity across 50 genotypes

This section highlights two complementary studies conducted at the Computational, Evolutionary, and Systems Biology Laboratory. First, we built the sugarcane pan-transcriptome, revealing variability in protein-coding transcripts across 50 genotypes.

Then, during my master's, I constructed a multi-genotype ncRNA catalog and analyzed expression networks integrating coding and non-coding RNAs to expand our understanding of transcriptomic diversity.

My research on sugarcane pan-omics

"Look again at that dot. That's here. That's home. That's us." - Carl Sagan

The Pale Blue Dot, captured by Voyager 1 in 1990, showing Earth as a tiny speck in the vastness of space, a humbling reminder of our place in the cosmos.

sugarcane pan-RNAome

Characterization of sugarcane ncRNAs and lncRNAs, revealing variability, conservation, co-expression, and functional roles.

sugarcane pan-transcriptome

Framework for pan-transcriptome assembly in complex polyploid crops, supporting sugarcane breeding programs.

Computational Reproducibility

Building trust through reproducible workflows

Ensuring reproducibility remains one of the major challenges in computational biology. Many published results are difficult to replicate due to incomplete documentation, non-standardized workflows, or environment dependencies.

In my work, I focus on developing robust workflows that adhere to open science and the FAIR principles (Findable, Accessible, Interoperable, and Reusable).

Pipelines I’ve built for reproducible research

KAPT

Automated inference and annotation of the Kappaphycus alvarezii supertranscriptome.

T-M integration

Transcriptome–microbiome cross-correlation and host–microbial interaction inference.

R2C

Automated gene co-expression network construction and regulatory analysis.

seabed symphony

Pipeline for discovering novel biosynthetic gene clusters in marine sediment microbiomes.

YAATAP

Snakemake pipeline for de novo transcriptome assembly and functional annotation.

Software Development

Open-source tools for the scientific community

My path into software development emerged from the biological sciences. Working with massive sequencing datasets, I realized that modern biology demands the ability to design, automate, and scale computational analyses across terabytes of information.

I am deeply committed to Free Software principles, believing that progress is best achieved when we have the freedom to run, copy, distribute, study, change, and improve the software that powers our research.

A selection of open-source tools and applications

ContFree-NGS

Software designed to remove contaminant sequences from NGS datasets.

paper-trackr

Tired of missing out on cool papers? Stay up to date with paper-trackr!

HACKATHON

LBB 2025

Awarded 3rd place at the largest bioinformatics competition in Latin America, solving complex computational biology challenges.

Mendelics 2021

Awarded 3rd place for developing an automated variant calling pipeline in under 48 hours using real genomic data.

BIOHACK 2018

Awarded 2nd place for designing a synthetic biology bioremediation project presented at Brazil's largest biotechnology conference.

CONTACT