Friday, March 18, 2022 - 10:00 am

DISSERTATION DEFENSE

  

Author : Gaofeng Pan

Advisor : Dr. Jijun Tang

Date : March 18, 2022

Time : 10:00 am

Place : Virtual (Teams link below)

 

Meeting Link

 

Abstract

Deep learning has been widely used in computational biology research in the past few years. A large number of deep learning methods have been proposed to solve bioinformatics problems such as gene function prediction, protein interaction classification, and drug effect analysis; most of these methods yield better solutions than traditional computing methods. However, few methods have been proposed to solve problems encountered in evolutionary biology research. In this dissertation, two neural network learning methods are proposed to solve the problems of genome location prediction and median genome generation encountered in phylogenetic tree construction, and the ability of neural network learning models to solve evolutionary biology problems is explored.

A phylogenetic tree represents the evolutionary relationships among genomes in an intuitive way; it can be constructed based on genomic phenotype or genomic genotype. Research on phylogenetic tree construction methods is meaningful to biology research. In the past decades, many methods have been proposed to analyze genome functions and predict genome subcellular locations in the cell. However, these methods have low accuracy on multi-subcellular localization. To improve prediction accuracy, this research defines a neural network learning model that extracts genome features from genome sequences and the evolution position matrix. Experimental results on two widely used benchmark datasets show that this model significantly outperforms other currently available methods on multi-subcellular localization, indicating that deep neural networks are effective for genome location prediction.

Phylogenetic tree construction based on genomic genotype gives more accurate results than construction based on genomic phenotype. The most famous phylogenetic tree construction framework uses median genome algorithms to filter tree topologies and update phylogenetic ancestral genomes. Currently, several median genome algorithms can be applied to short genomes with simple evolution patterns; however, when genomes become longer and evolution patterns become more complex, these algorithms show unstable performance and exceptionally long running times. To lift these limitations, this research proposes a novel median genome generator based on a graph neural network learning model. With a graph neural network, genome rearrangement patterns and genome relations can be extracted from internal gene connections. Experimental results show that this generator obtains stable median genome results in constant time regardless of how long or how complex the genomes are; its outstanding performance makes it the best choice in the GRAPPA framework for phylogenetic tree construction.