I am a Brazilian researcher and developer who enjoys learning new technologies and taking on and completing challenges.
I also have some hobbies, such as reading, learning new languages on Duolingo (French, German, Italian and Russian), swimming, drawing animals on paper and drawing buildings in AutoCAD.
Topics I have been working with: bioinformatics, semantic web, machine learning, data analysis, natural language processing, back end, front end.
This project aims at selecting the best datasets, aligning dataset terms and executing data fusion on semantic web datasets in order to comply with the Linked Data principles. It is composed of four main modules:
Tags: Java, semantic web, scoring function, desktop application, natural language processing, HTML/CSS/Bootstrap, JavaScript.
PredPrIn is a scientific workflow to predict Protein-Protein Interactions (PPIs). It uses machine learning to combine multiple PPI detection methods from three categories: structural, primary amino acid sequence and functional annotations. It is composed of three main modules:
Tags: Python, Gene Ontology, semantic similarity, scientific workflow, parallelization, machine learning.
Python pipelines to filter positively predicted protein interactions according to two criteria: (i) association rules of cellular components derived from gold standard Protein-Protein Interaction data from HINT and (ii) text mining of scientific papers published on PubMed, extracting sentences where the proteins in the PPIs appear in an interaction context.
Tags: Python, association rules, natural language processing, machine learning.
This pipeline contains a series of functions to filter small amino acid sequences (peptides) predicted by epitope discovery tools. It parses files from the BepiPred and DiscoTope tools and executes filtering and descriptive steps to refine the final list of epitopes. One of the modules (Epiminer) executes text mining on scientific papers, searching for these peptides in the context of epitope prediction and immunology, in order to check the originality of the user's epitopes.
Tags: Python, data analysis, results export, natural language processing.
1 Graduated in Systems Analysis and Development at the Fluminense Federal Institute, where I developed a web system that applies cognitive tests of memory and attention and generates reports on the administration page. All data about these tests and their terminology are recorded following the Linked Data principles, in order to encourage other systems to adopt the same terminology and to facilitate analysis and querying using the same file format.
2 Master's degree in Systems and Information at the Military Institute of Engineering. My dissertation addressed data interlinking on the Web of Data, in the semantic web context. It covered the whole process: ranking the best publicly available datasets to link to a source dataset that has no external links, mapping data items between the selected datasets and the source dataset, validating the item mappings by crowdsourcing on an online platform, and finally fusing the source dataset with the validated items.
3 Doctoral degree in Computational Modelling at the National Laboratory of Scientific Computing, where I developed a thesis on protein interaction prediction. It used multiple sources of biological evidence, which offer different perspectives on physical and functional associations between proteins, and made extensive use of machine learning to combine detection methods and perform the subsequent classification.
4 I worked as a postdoctoral researcher in the bioinformatics laboratory of the National Laboratory of Scientific Computing from 12/2020 to 01/2022, analyzing large amounts of genomic data for the SARS-CoV-2 projects and using bioinformatics tools to map, align and annotate proteins. I also had the opportunity to learn and carry out structural analysis of proteins and perform docking assays. Most of the analyses relied on data exploration, visualization, prediction and forecasting. The bioinformatics lab team and I produced research articles to share the findings. I also have experience giving short courses that teach Python with biological case studies. In 2016 and 2019, I participated in two international conferences to present two of my research articles.