Project information

Deep Learning for Genomic and Transcriptomic Pattern Identification

Project code
4431
Project period
1/2020 - 12/2022
Investor / Programme / Project type
EMBO (European Molecular Biology Organization)
MU faculty or unit
Central European Institute of Technology

Research in my newly formed laboratory revolves around the use of novel Deep Neural Network approaches to identify patterns in genomic and transcriptomic regions harboring functional elements. Specifically, we focus on the characterisation of three categories of functional elements: short genomic functional elements (e.g. small RNA gene loci), transcriptomic functional elements (e.g. RNA Binding Protein binding sites), and small RNA driven transcriptomic functional elements (e.g. microRNA target sites). The identification of such functional elements using in silico methods has been a field of intensive research, but the low precision of current methods when scanning large regions of the genome or transcriptome has confined practical application to a small minority of well-studied, easy-to-identify elements (e.g. microRNA target sites), and even these methods are heavily biased by prior theoretical knowledge. My research instead focuses on less biased methods that model these complex biological processes directly from raw data (genomic or high-throughput sequencing), using Deep Learning architectures to achieve unprecedented levels of pattern identification precision. For genomic functional elements, we have developed a novel training approach involving iterative background selection that has boosted the accuracy of small RNA identification by orders of magnitude beyond the state of the art. For transcriptomic functional elements, we will use binding characteristics to train a Deep Learning model on CLIP-Seq data from hundreds of sequenced RBPs. We are exploring ways to interpret the trained model in order to predict the functionality of novel, enigmatic RBPs from their binding characteristics. Finally, for small RNA driven functional elements, we use Deep Learning models to identify unbiased binding rules from chimeric CLIP-Seq reads, beyond the theoretical biases of current models.
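
The abstract does not spell out how iterative background selection works, so the following is only a minimal sketch of one plausible reading: train on positives against a random background, score a large pool of background sequences, add the highest-scoring (most positive-looking) ones to the negative set, and retrain. Everything here is an assumption made for illustration: the function names, the one-hot encoding, the accumulation of hard negatives, and the logistic regression that stands in for a deep network. It is not the laboratory's actual method.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat one-hot vector of length 4 * len(seq).
    Assumes only A, C, G, T occur (no N or other ambiguity codes)."""
    idx = np.array([BASES.index(b) for b in seq])
    out = np.zeros((len(seq), 4))
    out[np.arange(len(seq)), idx] = 1.0
    return out.ravel()

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Gradient-descent logistic regression; a stand-in for a deep model."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def predict(model, X):
    """Return the predicted probability of the positive class for each row."""
    w, b = model
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def iterative_background_selection(pos, pool, rounds=3, n_new=None, seed=0):
    """Hypothetical loop: grow the negative set with the background
    sequences that the current model scores most like positives."""
    rng = np.random.default_rng(seed)
    n_new = n_new or len(pos)
    # Round 0: start from a random sample of the background pool.
    neg_idx = rng.choice(len(pool), size=n_new, replace=False)
    model = None
    for _ in range(rounds):
        X = np.vstack([pos, pool[neg_idx]])
        y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg_idx))])
        model = fit_logistic(X, y)
        # Score the whole pool and pick the highest-scoring background rows.
        hardest = np.argsort(predict(model, pool))[-n_new:]
        # Accumulate rather than replace, so earlier negatives are kept.
        neg_idx = np.union1d(neg_idx, hardest)
    return model, neg_idx
```

The returned model is the one trained in the last round, and `neg_idx` indexes the background rows selected so far, including the final batch of hard examples. Accumulating negatives instead of replacing them is a design choice in this sketch that keeps the model from forgetting easy background examples between rounds.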