Using Machine Learning and Deep Learning Algorithms for Low Birthweight Prediction

Monday, August 26, 2024 - 09:00 am

Author : Yang Ren
Advisor : Dr. Dezhi Wu, IIT Dept. & Dr. Yan Tong, CSE Dept.
Date : August 26, 2024
Time: 9:00 am – 11:00 am
Place : Teams

Link: https://teams.microsoft.com/dl/launcher/launcher.html?url=%2F_%23%2Fl%2…

 

Dial in by phone

+1 803-400-6044,,897438708# United States, Columbia
Find a local number
Phone conference ID: 897 438 708#

Abstract

          Low Birthweight (LBW) is a major public health issue resulting in increased neonatal mortality and long-term health complications. Traditional LBW analysis methods, which focus on incidence rates and risk factors through statistical models, often struggle with complex unseen data, limiting their effectiveness in the early prevention of LBW. More advanced LBW prediction models are therefore needed, and this dissertation addresses this need by proposing and examining novel machine learning (ML) and deep learning (DL) algorithms to more accurately predict LBW early in birthing individuals’ pregnancies. This dissertation consists of three studies, which cover the following three major research topics.

     The first topic examines the effectiveness and impact of various data rebalancing techniques for LBW prediction, addressing the extreme class imbalance in birth outcome data. Through this investigation, we established a foundational pipeline for LBW prediction, paving the way for further development and refinement in subsequent studies. This first study also included an extensive feature importance analysis to identify key factors in LBW classification, which is crucial for guiding targeted interventions to improve birth outcomes.
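     A minimal sketch of such a rebalancing-plus-feature-importance pipeline is shown below; the file, the low_birthweight column, and the choice of SMOTE are illustrative assumptions, not the dissertation's actual configuration.

    # Illustrative only: SMOTE oversampling + random-forest feature importance
    # for a severely imbalanced binary LBW label. Column names are hypothetical.
    import pandas as pd
    from imblearn.over_sampling import SMOTE
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("births.csv")                      # hypothetical dataset
    X, y = df.drop(columns=["low_birthweight"]), df["low_birthweight"]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, stratify=y, test_size=0.2, random_state=0)

    # Rebalance only the training split so the test set keeps the true prevalence.
    X_res, y_res = SMOTE(random_state=0).fit_resample(X_train, y_train)

    clf = RandomForestClassifier(n_estimators=300, random_state=0)
    clf.fit(X_res, y_res)

    # Rank features by importance to guide targeted interventions.
    importances = pd.Series(clf.feature_importances_, index=X.columns)
    print(importances.sort_values(ascending=False).head(10))
    print("held-out accuracy:", clf.score(X_test, y_test))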
     The second topic aims to develop a novel longitudinal transformer-based LBW prediction framework that integrates prenatal mothers’ historical health records with current pre-delivery data, providing more comprehensive and relevant input features for LBW prediction. The framework’s ability to effectively process and analyze these diverse inputs marks a significant advance over previous approaches, which focus primarily on immediate pre-delivery factors. As a result, this enhanced model is shown to improve the accuracy of LBW predictions, offering a more robust tool for effective early intervention strategies.
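     A minimal sketch of this kind of longitudinal model, assuming each pregnancy is a padded sequence of per-visit feature vectors, follows; the dimensions, pooling, and layer sizes are illustrative, not the dissertation's architecture.

    # Sketch of a transformer encoder over a sequence of prenatal visits.
    import torch
    import torch.nn as nn

    class LongitudinalLBWModel(nn.Module):
        def __init__(self, n_features=32, d_model=64, n_heads=4, n_layers=2):
            super().__init__()
            self.embed = nn.Linear(n_features, d_model)       # per-visit embedding
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
            self.head = nn.Linear(d_model, 1)                 # LBW logit

        def forward(self, visits, pad_mask=None):
            # visits: (batch, n_visits, n_features); pad_mask flags missing visits
            h = self.encoder(self.embed(visits), src_key_padding_mask=pad_mask)
            return self.head(h.mean(dim=1)).squeeze(-1)       # pool over visits

    model = LongitudinalLBWModel()
    logits = model(torch.randn(8, 10, 32))                    # 8 patients, 10 visits each
    probs = torch.sigmoid(logits)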
     The third topic proposes and examines a pioneering fusion framework that combines structured medical records with rich text-based data. This large language model (LLM)-based approach aims to leverage the strengths of both quantitative and qualitative data sources to enhance the predictive accuracy and explainability of LBW prediction models. By integrating diverse data types, the proposed method is expected to offer in-depth insights into the myriad factors contributing to LBW, potentially unveiling previously unrecognized and more granular risk factors that further refine the prediction models.
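     A hedged sketch of such tabular-text fusion appears below, using a lightweight pretrained text encoder as a stand-in for the LLM; the feature names, notes, and labels are made up for illustration.

    # Sketch: clinical-note embeddings concatenated with structured features.
    import numpy as np
    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler

    notes = ["patient reports fatigue ...", "no complications noted ..."]   # free-text notes
    tabular = np.array([[28, 24.5, 1], [35, 31.2, 0]], dtype=float)         # e.g., age, BMI, parity
    labels = np.array([1, 0])                                               # LBW outcome

    text_vecs = SentenceTransformer("all-MiniLM-L6-v2").encode(notes)
    tab_vecs = StandardScaler().fit_transform(tabular)

    fused = np.hstack([tab_vecs, text_vecs])          # simple late fusion by concatenation
    clf = LogisticRegression(max_iter=1000).fit(fused, labels)
    print(clf.predict_proba(fused)[:, 1])             # predicted LBW risk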
     In summary, this dissertation presents a comprehensive exploration of advanced ML and DL algorithms for LBW prediction through a series of three studies. From establishing an LBW prediction pipeline with rebalancing strategies (Study 1) and developing a transformer-based approach (Study 2) to introducing a tabular-text fusion framework (Study 3), this research contributes a substantial advancement in prenatal care. By enabling earlier and more accurate identification of LBW risks, this work has the potential to transform prenatal intervention strategies, leading to improved health outcomes for both mothers and their infants.

Efficient Machine Learning on Scientific Data Using Bayesian Optimization

Monday, July 15, 2024 - 09:00 am
online

DISSERTATION DEFENSE

Author : Rui Xin

Advisor : Dr. Jijun Tang

Date : July 15, 2024

Time: 9:00 am – 11:00 am

Place : Zoom

Link: https://zoom.us/j/94479902244?pwd=8XbYQPbZaxXXeBt4e1r5gqrBy6upb4.1

Meeting ID: 944 7990 2244

Passcode: 126908


Abstract

    Deep Learning is pivotal in advancing data analysis across various scientific fields, from genomics to materials discovery. Despite its widespread use, efficiently learning from limited data and operating under resource constraints remain significant challenges, often limiting its full potential in environments where data is scarce or resources are restricted. This dissertation explores Active Learning and Automated Machine Learning (AutoML) powered by Bayesian Optimization to enhance the efficiency of machine learning across multiple disciplines. It focuses on algorithm optimization and data management through three interconnected studies.
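For readers unfamiliar with Bayesian optimization in an AutoML setting, the minimal sketch below tunes two hyperparameters of an off-the-shelf classifier with scikit-optimize's Gaussian-process loop; the model, search space, and dataset are placeholders, not the experiments in this dissertation.

    # Bayesian-optimization-driven hyperparameter tuning (illustrative only).
    from skopt import gp_minimize
    from skopt.space import Real, Integer
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.datasets import load_breast_cancer

    X, y = load_breast_cancer(return_X_y=True)

    def objective(params):
        lr, n_estimators = params
        clf = GradientBoostingClassifier(learning_rate=lr, n_estimators=n_estimators)
        # Negative accuracy because gp_minimize minimizes its objective.
        return -cross_val_score(clf, X, y, cv=3).mean()

    result = gp_minimize(
        objective,
        [Real(1e-3, 0.3, prior="log-uniform"), Integer(50, 400)],
        n_calls=25, random_state=0)

    print("best params:", result.x, "best CV accuracy:", -result.fun)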

In the first study, we investigate how active learning, a data management technique, helps discover new materials with target properties from a limited dataset, given the vast chemical design space. We propose an active generative inverse design method that combines active learning with a deep autoencoder neural network and a generative adversarial network to discover new materials with a target property across the whole chemical design space. Our experiments demonstrate that although active learning may select chemically infeasible candidates, these samples are beneficial for training robust screening models. These models effectively filter and identify materials with desired properties from those hypothetically generated by the generative model. The results confirm the success of our active generative inverse design approach.
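The core acquisition loop behind such an approach can be sketched as pool-based active learning with ensemble-disagreement sampling; the surrogate model and synthetic data below are placeholders, not the actual materials pipeline.

    # Conceptual active-learning loop: label the candidates the model is least sure about.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    pool_X = rng.normal(size=(5000, 16))            # unlabeled candidate materials
    true_property = pool_X @ rng.normal(size=16)    # stand-in for an expensive label

    labeled = list(rng.choice(len(pool_X), 20, replace=False))
    for _ in range(10):                             # 10 acquisition rounds
        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(pool_X[labeled], true_property[labeled])

        # Uncertainty = disagreement across the ensemble's trees.
        per_tree = np.stack([t.predict(pool_X) for t in model.estimators_])
        uncertainty = per_tree.std(axis=0)
        uncertainty[labeled] = -np.inf              # never re-select labeled points

        labeled.extend(np.argsort(uncertainty)[-10:])   # label the 10 most uncertain
    print(f"labeled {len(labeled)} of {len(pool_X)} candidates")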

In the second study, we explore cancer heterogeneity and specificity through the analysis of mutational signatures, using collinearity analysis and machine learning techniques. These techniques include a decision tree-based ensemble model and a flexible neural network-based method with automated hyperparameter optimization, the latter customizing a neural network for each individual sub-task. Through thorough training and independent validation, our results reveal that although the majority of mutational signatures are distinct, similarities between certain mutational signature pairs are observed in both mutation patterns and mutational signature abundance. These observations can potentially assist in determining the etiology of still-elusive mutational signatures. Further analysis using machine learning approaches indicates that specific mutational signatures are relevant to particular cancer types, with skin cancer showing the strongest specificity among all cancer types.
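As a hedged illustration of the kind of collinearity check described above, cosine similarity between signature profiles over the 96 single-base-substitution channels can flag highly similar pairs; the signature matrix below is random, standing in for real COSMIC-style profiles.

    # Illustrative collinearity check between mutational signature profiles.
    import numpy as np
    from sklearn.metrics.pairwise import cosine_similarity

    rng = np.random.default_rng(0)
    signatures = rng.random((10, 96))                 # 10 signatures x 96 mutation channels
    sim = cosine_similarity(signatures)

    # Report signature pairs whose mutation patterns are highly similar.
    pairs = [(i, j, sim[i, j]) for i in range(10) for j in range(i + 1, 10)
             if sim[i, j] > 0.9]
    print(pairs)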

Finally, we analyze cancer heterogeneity by examining immune cell compositions in tumor microenvironments, using neural architecture search to develop tailored models for classification subtasks. By analyzing transcriptome profiles from 11,274 patients across 33 cancer types to identify 22 immune cell types, we employ deep learning to model outcomes for cancer type and tumor-normal distinctions, utilizing the Shannon index for immune cell diversity and Cox regression for prognostic evaluations. Our findings reveal significant immune cell differences between tumors and normal tissues, with some discrepancies in directional differences across cancers. Immune cell composition patterns modestly differentiate cancer types, with sixteen significant prognostic associations identified, such as in kidney renal clear cell carcinoma. Additionally, immune cell diversity shows marked differences in seven cancer types and correlates positively with survival in some cases, underscoring the lack of a universal standard across all cancers.
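As a small worked example of one metric used here, the Shannon diversity index over an immune-cell composition vector (fractions of the 22 cell types summing to 1) can be computed as follows; the compositions are invented for illustration.

    # Shannon diversity index of an immune-cell composition vector.
    import numpy as np

    def shannon_index(fractions):
        p = np.asarray(fractions, dtype=float)
        p = p[p > 0]                      # zero-abundance cell types contribute nothing
        return float(-(p * np.log(p)).sum())

    uniform = np.full(22, 1 / 22)         # maximally diverse composition (~3.09)
    skewed = np.array([0.9] + [0.1 / 21] * 21)
    print(shannon_index(uniform), shannon_index(skewed))   # high vs. much lower diversity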

Multi-scale Deep Representation Learning in Synthetic Biology

Tuesday, May 7, 2024 - 09:00 am
Online

DISSERTATION DEFENSE


Author : Xiaoyi Liu

Advisor : Dr. Jijun Tang and Dr. Yan Tong

Date : May 07, 2024

Time: 9:00 am – 11:00 am

Place : Teams

Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_NTFlYTIwM2EtMjc3…

Meeting ID: 236 573 306 493

Passcode: UTL2Gs


Abstract

 

Synthetic biology combines the expertise of engineers and biologists, bridging the gap between engineering and natural life. By developing new biological components, networks, and systems to reprogram organisms, the field is generally categorized into two broad branches. The first branch uses synthetic molecules to mimic natural biological functions. The second focuses on assembling natural biological components in novel ways, aiming to produce systems with unique, practical functions. The de novo engineering of biological modules and synthetic pathways thus supports practical bioengineering applications such as drug-targeting strategies and microbial product manufacturing. Synthetic biology therefore represents a new paradigm in scientific exploration and innovation, with wide-ranging implications for our understanding and optimization of biological systems.

Over the past decades, the amount of available whole-genome sequencing and experimental data has increased significantly due to the emergence of new automation technologies such as high-content imaging, high-throughput screening, and sequencing. Given the growth of these data sets, researchers can no longer summarize them from experience and memory alone. Stable and efficient computational methods are therefore required to integrate these data and to predict or reveal phenomena and insights that have never been discovered. However, incomplete knowledge of metabolic processes impairs the accuracy of models of biological systems, hindering advancements in systems biology and metabolic engineering. Several fundamental challenges also remain. First, problems in systems biology are often cross-scale and multi-modal, yet existing computational methods for problem definition and model design are often single-scale and single-modal. Second, biological systems are multi-scale, unbalanced, and noisy, making it very difficult to structure and benchmark such complicated data. Third, the complete biosynthetic pathways of most natural or valuable products are unknown, so computer-aided biosynthesis planning holds significant value.

To address these challenges, we introduce multi-scale deep representation learning methodologies to understand and optimize downstream tasks in systems biology, such as metabolic pathway inference, missing reaction prediction in genome-scale metabolic models (GEMs), and retrosynthesis prediction. Specifically, our first study introduces a novel Multi-View Multi-Label learning framework for Metabolic Pathway Inference (MVML-MPI), which outperforms state-of-the-art (SOTA) methods by accurately representing the complex relationships between compounds and pathways. In the second study, to address the incomplete metabolic knowledge in GEMs, we propose CLOSEgaps, a hypergraph convolution network with an integrated attention mechanism for gap prediction in metabolic networks. It is a comprehensive deep learning-driven tool that represents the hyper-topological information of GEMs and effectively fills gaps through hyperlink prediction, thereby enhancing the accuracy of phenotypic predictions. In the third study, we propose RetroCaptioner, a novel end-to-end framework for one-step retrosynthesis that combines a graph encoder integrating learnable structural information with the capability to sequentially translate drug molecules, thereby efficiently capturing chemically plausible information. Taken together, this research advances systems biology by introducing a suite of multi-scale deep learning methodologies that tackle key challenges: MVML-MPI enhances our understanding of complex metabolic pathways, CLOSEgaps fills gaps in metabolic models, and RetroCaptioner facilitates retrosynthesis planning. These methodologies form a comprehensive and integrated approach that significantly advances the capabilities of synthetic biology.

Looking at continual learning through a dynamical systems point of view

Friday, April 19, 2024 - 02:15 pm
Online

Krishnan Raghavan

Abstract:
One of the critical features of an intelligent system is the ability to continually execute tasks in a real-world environment. As a new task is revealed, we seek to adapt to it efficiently (improve generalization) and, in the process of generalizing, to remember previous tasks (minimize catastrophic forgetting). Consequently, two key challenges must be modeled: catastrophic forgetting and generalization. Despite promising methodological advancements, there is a lack of a theoretical approach that enables analysis of these challenges.

In this talk, we discuss modelling and analysis of continual learning using tools from differential equation theory. We discuss the broad applicability of our approach and demonstrate the many applications where such an approach is required. We will derive methods in some of these applications using this point of view and show the effectiveness of such approaches in modelling these applications.

Bio:
I am an assistant computational mathematician in the Mathematics and Computer Science Division at Argonne National Laboratory. I received my Ph.D. in computer engineering from Missouri University of Science and Technology in 2019 and have been at Argonne since then. My primary research agenda is to develop a mathematical characterization of machine learning (ML) models, their learning/training behavior, and the precision they achieve. Toward this end, I study the two broad facets of ML: theory, through the lens of systems theory, statistics, and optimization; and application, by building AI/ML models to solve key problems in nuclear physics, materials science, HPC, and, more recently, climate. I enjoy rock climbing, the outdoors, cycling, ramen, and many other nerdy things, including but not limited to fantasy fiction novels -- go Malazan.


Causal analysis & decision intelligence for manufacturing at Bosch

Friday, April 12, 2024 - 02:15 pm
Online

Bosch is a multinational engineering and technology company that develops products in various business sectors, including mobility, industrial technology, energy and building technology, and consumer goods. Currently, Bosch employs over 427K workers and generates ~100B/yr in sales revenue. At the Bosch Center for Artificial Intelligence, in Pittsburgh, we focus on research in the area of neuro-symbolic AI, combining machine learning with knowledge engineering technologies. In this talk, we will illustrate recent efforts in the areas of causal analysis and decision intelligence to improve industrial manufacturing processes. More specifically, we discuss the application of neuro-symbolic methods for (1) root-cause analysis and (2) cognitive architectures for decision making.

About the authors
Alessandro Oltramari is president of the Carnegie Bosch Institute and a senior research scientist at the Bosch Center for Artificial Intelligence in Pittsburgh, USA. Oltramari joined Bosch Research in 2016, after working as a research associate at Carnegie Mellon University on projects funded by public agencies such as DARPA, NSF, and ARL. At Bosch Research, he focuses on neuro-symbolic AI. His primary interest is investigating how knowledge-based methods and systems can be integrated with learning algorithms to help humans and machines make sense of the physical and digital worlds. Contact him at alessandro.oltramari@us.bosch.com

Cory Henson is a lead research scientist at the Bosch Center for Artificial Intelligence in Pittsburgh, USA. His research focuses on knowledge representation and neuro-symbolic AI methods, integrating machine learning with prior domain knowledge. He has led projects to develop and apply this technology for improving autonomous systems, ranging from automated driving to smart manufacturing. More recently, he has become interested in the use of neuro-symbolic methods for representing, learning, and reasoning with causal knowledge. Contact him at cory.henson@us.bosch.com

Details at: https://www.linkedin.com/events/7183461938431983616/about/

The LLM Journey

Friday, March 29, 2024 - 02:15 pm
Zoom or in person at SWGN 2A27.

Abstract:
Large Language Models (LLMs) have dramatically transformed the landscape of Generative AI, making profound impacts across a broad spectrum of domains. From enhancing Recommender Systems to advancing the frontiers of Natural Language Processing (NLP), LLMs have become indispensable. Their versatility extends into specialized sectors, such as finance with the development of BloombergGPT, and healthcare through MedLlama, showcasing their adaptability and potential for industry-specific innovations.

In this presentation, we will embark on a comprehensive exploration of the evolution of Large Language Models. Our journey will trace the origins of LLMs, highlighting key milestones and breakthroughs, and proceed to examine the latest advancements and research directions in the field. To mirror the structured and layered nature of LLMs themselves, our discussion will be organized into distinct sections. We'll begin with the foundational aspect of prompting, delve into the intricacies of their architecture, and discuss pivotal strategies such as Pretraining, Fine-tuning, and Parameter Efficient Fine-Tuning (PEFT). Furthermore, we'll address the challenges and solutions related to the mitigation of hallucination, a critical aspect of ensuring the reliability and accuracy of LLM-generated content.
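As a concrete, hedged example of one strategy on this list, the sketch below attaches LoRA adapters to a small causal language model using the Hugging Face peft library; the base model and hyperparameters are illustrative choices, not those of any system discussed in the talk.

    # Parameter-Efficient Fine-Tuning (PEFT) with LoRA adapters (illustrative).
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_name = "gpt2"                                # placeholder base model
    tokenizer = AutoTokenizer.from_pretrained(model_name)   # would prepare fine-tuning data
    model = AutoModelForCausalLM.from_pretrained(model_name)

    lora_cfg = LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.05,
        target_modules=["c_attn"],                     # GPT-2's fused attention projection
        task_type="CAUSAL_LM")

    model = get_peft_model(model, lora_cfg)
    model.print_trainable_parameters()                 # only the adapter weights train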

Speaker Bio:

Vinija Jain brings to the table an extensive background in machine learning, with significant expertise in developing recommender systems at Amazon and spearheading NLP initiatives at Oracle. Her passion for artificial intelligence was ignited during her time in the Stanford AI program, which served as a catalyst for her deep dive into the field. Currently, Vinija is actively engaged in fundamental research and collaborates with the Artificial Intelligence Institute of South Carolina (AIISC) at the University of South Carolina. Her latest work with AIISC on AI-Generated Text Detection was recognized with an outstanding paper award at EMNLP '23, underscoring her contributions to advancing AI research and application.

 

https://www.linkedin.com/events/thellmjourney7178098223411036160/about/

Realtime Machine Learning on Edge AI Accelerators

Friday, March 22, 2024 - 02:15 pm
SWGN 2A27

Abstract: Several real-world applications of machine learning (ML) systems such as robotics, autonomous cars, assistive technologies, smart manufacturing, and many other Internet-of-Things (IoT) applications require real-time inference with low energy consumption. The surge in demand for specialized hardware for AI applications has resulted in a rapidly expanding industry for edge AI accelerators. Anticipating this trend, several companies have developed their own specialized accelerators such as the NVIDIA Jetson Nano, Intel NCS2, and Google TPU. While many conventional neural networks can be readily deployed on many of these platforms, the support for deploying more advanced and larger models such as transformers on them has yet to be researched and developed. In this talk, we discuss two of our recent projects in which we utilize optimization mechanisms such as neural architecture search (NAS) and system-level innovations such as modifying the computational graphs, partitioning, and refactoring the unsupported operations to efficiently deploy ML models on edge accelerators for computer vision and natural language processing tasks.
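A common, vendor-neutral first step in such deployments is exporting the trained model to a portable graph format that downstream toolchains (e.g., TensorRT or OpenVINO) can compile, partition, and quantize; the sketch below uses an off-the-shelf vision model purely for illustration and is not the projects' actual flow.

    # Export a trained PyTorch vision model to ONNX, a portable computational graph.
    import torch
    import torchvision

    model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
    dummy = torch.randn(1, 3, 224, 224)                # example input shape

    torch.onnx.export(
        model, dummy, "mobilenet_v2.onnx",
        input_names=["image"], output_names=["logits"],
        opset_version=17)
    print("exported mobilenet_v2.onnx")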

Bio: Dr. Ramtin Zand is an assistant professor in the Department of Computer Science and Engineering and the principal investigator of the Intelligent Circuits, Architectures, and Systems (iCAS) Lab at the University of South Carolina. The iCAS Lab collaborates closely with and is supported by several multinational companies, including Intel, AMD, and Juniper Networks, as well as federal agencies such as the National Science Foundation (NSF). Dr. Zand has authored more than 50 journal and conference articles and two book chapters and has received recognition from ACM/IEEE, including the best paper runner-up at ACM GLSVLSI’18, the best poster at ACM GLSVLSI’19, and the best paper at IEEE ISVLSI’21, as well as a featured paper in IEEE Transactions on Emerging Topics in Computing. He received the NSF CAREER award in 2024. His research focuses on neuromorphic computing, edge computing, processing-in-memory, and AI/ML hardware acceleration.

 

Online at: https://us06web.zoom.us/j/8440139296?pwd=b09lRCtJR0FCTWcyeGtCVVlUMDNKQT09&omn=85050122519

Automated Data-flow optimization for Digital Signal Processors

Monday, March 18, 2024 - 10:00 am
online

DISSERTATION DEFENSE

Department of Computer Science and Engineering

University of South Carolina

 

Author : Madushan Abeysinghe
Advisor : Dr. Jason Bakos
Date : March 18, 2024
Time: 10:00 am – 11:00 am

Place : Teams

Meeting ID: 234 681 711 471

Passcode: wiatfJ

 

Abstract

 

Digital signal processors (DSPs), which are characterized by statically scheduled Very Long Instruction Word architectures and software-defined scratchpad memory, are currently the go-to processor type for low-power embedded vision systems, as exemplified by the DSPs integrated into systems-on-chips from NVIDIA, Samsung, Qualcomm, Apple, and Texas Instruments. DSPs achieve performance by statically scheduling workloads, in terms of both data movement and instructions. We developed a method for scheduling buffer transactions across a data flow graph using data-driven performance models, yielding a 25% average reduction in execution time and a reduction of up to 85% in DRAM utilization for randomly generated data flow graphs. We also developed a heuristic instruction scheduler that serves as a performance model to guide the selection of loops to fuse from a target data flow graph. By strategically selecting loops to fuse, performance gains can be achieved by eliminating unnecessary memory transactions and increasing functional unit utilization. This approach has helped us achieve an average speedup of up to 1.9x for sufficiently large data flow graphs used in image processing.
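To make the loop-fusion idea concrete, the toy sketch below contrasts a two-pass pipeline that materializes a full intermediate buffer with a fused version that processes data row by row; it is a conceptual Python illustration of the transaction-elimination being targeted, not the DSP scheduler itself.

    # Loop fusion in miniature: avoid writing and re-reading an intermediate buffer.
    import numpy as np

    img = np.random.rand(512, 512).astype(np.float32)

    def unfused(img):
        tmp = img * 2.0          # pass 1: allocates and writes a full intermediate buffer
        return tmp + 1.0         # pass 2: reads the intermediate back from memory

    def fused(img):
        out = np.empty_like(img)
        for i in range(img.shape[0]):        # one pass per row; only a row-sized temporary
            out[i] = img[i] * 2.0 + 1.0
        return out

    assert np.allclose(unfused(img), fused(img))   # same result, less memory traffic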