Python Basics

Monday, March 28, 2022 - 06:00 pm
Innovation Center Room 2277

Women in Computing is hosting its first ever programming workshop! We will be learning the basics of Python! If you have an interest in learning coding, come on out! We hope to do more workshops in other languages in the future so come by and show your interest tonight, March 28th at 6pm in the Innovation Center Room 2277!

If you want to join virtually, we will try to share our screen simultaneously via Zoom, so be sure to join our GroupMe to get access to the Zoom link.

GroupMe: https://groupme.com/join_group/34681325/pIJInQ

Everyone, of all genders and majors, is welcome!

Big Data Science: Innovations Using Big Data Science to Re-Engage and Retain People with HIV

Friday, March 25, 2022 - 02:20 pm
Swearingen Engineering Center in Room 2A31

Abstract

This study provides an overview of the data system and linkage process for people living with HIV in South Carolina. The purpose of the study is to develop and identify the best machine-learning-based predictive model for HIV medical treatment status using historical data from a comprehensive, established data repository. We provide findings from the study thus far.

 

Bio

Banky Olatosi is a tenure-track assistant professor in the Department of Health Services Policy and Management at the Arnold School of Public Health, University of South Carolina (UofSC). He is published in peer-reviewed journals, and his research interests are in the fields of Big Data Health Analytics, HIV/AIDS, COVID-19, and rural health. He has expertise in the field of Data Analytics and Data Mining, and currently has NIH grant funding in this area. He co-leads the UofSC national big data health science center (BDHSC). He is a Fellow of the American College of Healthcare Executives (FACHE). He is passionate about and committed to the improvement of graduate healthcare education. He currently serves as the Chair of the CAHME Accreditation Council and is also a CAHME national board member. He is a UofSC 2021 Breakthrough Research award winner. Banky Olatosi earned his doctorate in Health Services Policy and Management from the University of South Carolina and earned his MPH in Public Health Administration and Policy from the University of Minnesota (Twin Cities). He also holds a master’s degree in biochemistry from the University of Lagos.

Location

In person

Swearingen Engineering Center in Room 2A31

Virtual MS Teams

Time

2:20-3:10pm

Knowledge-infused Learning

Friday, March 25, 2022 - 10:30 am
Seminar Room, AI Institute, 5th Floor

DISSERTATION DEFENSE

(will take place in hybrid fashion; both physical and virtual)

Author : Manas Gaur

Advisor : Dr. Amit P. Sheth

Date : March 25, 2022

Time : 10:30 am

Location : Seminar Room, AI Institute, 5th Floor,

    1112 Greene Street (Science and Technology Building)
     Columbia, South Carolina-29208

Virtual Zoom Link

Meeting ID: 844 013 9296
Passcode: 12345

Abstract:

In DARPA’s view of the three waves of AI, the first wave, termed symbolic AI, focused on explicit knowledge. The current second wave is termed statistical AI. Deep learning techniques have been able to exploit large amounts of data and massive computational power to improve on human levels of performance in narrowly defined tasks. Separately, knowledge graphs emerged as a powerful tool to capture and exploit an extensive amount and variety of explicit knowledge, to make algorithms better understand content, and to enable the next generation of data processing, such as semantic search. After initial hesitancy about the scalability of the knowledge creation process, the last decade has seen significant growth in developing and applying knowledge, usually in the form of knowledge graphs (KGs). Examples range from the use of DBPedia in IBM’s Watson to Google Knowledge Graph in Google Semantic Search and the application of ProteinBank in AlphaFold, recognized by many as the most significant AI breakthrough. In addition, numerous domain-specific knowledge sources have been applied to improve AI methods in diverse domains such as medicine and healthcare, finance, manufacturing, and defense.


Now, we are heading toward the third wave of AI, built on what is termed the neuro-symbolic approach, which combines the strengths of statistical and symbolic AI. Combining the respective powers and benefits of knowledge graphs and deep learning is particularly attractive. This has led to the development of an approach we have called knowledge-infused (deep) learning. This dissertation will advance the currently limited forms of combining knowledge graphs and deep learning, called shallow infusion and semi-infusion, with a more advanced form called deep infusion, which will support stronger interleaving of a greater variety of knowledge at different levels of abstraction with the layers of a deep learning architecture.

This dissertation will investigate the knowledge-infusion strategy in two important ways. The first is to infuse knowledge to make any classification task explainable. The second is to achieve explainability in any natural language generation task. I will demonstrate effective strategies of knowledge infusion that bring five characteristic properties to any statistical AI model: (1) Context Sensitivity, (2) Handling Uncertainty and Risk, (3) Interpretability in Learning, (4) User-level Explainability, and (5) Transferability across natural language understanding (NLU) tasks. Along with the proven methodological contributions to AI made by the dissertation, I will show their applications to open-domain and closed-domain NLU.
Furthermore, the dissertation will showcase the utility of incorporating diverse forms of knowledge: linguistic, commonsense, broad-based, and domain-specific. As the dissertation illustrates the success in various domains, achieving state-of-the-art in specific applications, and significant contributions towards improving the state of machine intelligence, I will walk through careful steps to prevent errors arising due to knowledge infusion. Finally, for future directions, I will discuss two exciting areas of research where knowledge infusion would be pivotal to propel machine understanding.

Concurrent identification, characterization, and reconstruction of protein structure and mixed-mode dynamics from RDC data using REDCRAFT

Monday, March 21, 2022 - 09:30 am

DISSERTATION DEFENSE

Author : Hanin Omar

Advisor : Dr. Homayoun Valafar

Date : March 21, 2022

Time: 9:30 am

Place: Virtual Teams Link

Abstract

A complete understanding of the structure-function relationship of proteins requires an analysis of their dynamic behaviors in addition to the static structure. However, all current approaches to the study of dynamics in proteins have their shortcomings. A conceptually attractive alternative approach simultaneously characterizes a protein's structure and its intrinsic dynamics. Ideally, such an approach could rely solely on RDC data, which carry both structural and dynamical information. The major bottleneck in the utilization of RDC data in recent years has been attributed to a lack of RDC analysis tools capable of extracting the pertinent information embedded within this complex source of data.

Here we present a comprehensive strategy for structure calculation and reconstruction of discrete-state dynamics from RDC data based on the SVD method of order tensor estimation. In addition to structure determination, we provide a mechanism for producing an ensemble of conformations for the dynamical regions of a protein from RDC data. The developed methodology has been tested on simulated RDC data with ±1 Hz of error from an 83-residue α protein (PDB ID 1A1Z). In nearly all instances, our method reproduced the protein structure, including the conformational ensemble, to within less than 2 Å. Based on our investigations, arc motions with more than 30° of rotation are recognized as internal dynamics and are reconstructed with sufficient accuracy. Furthermore, states with relative occupancies above 20% are consistently recognized and reconstructed successfully. Arc motions with a magnitude of 15° or a relative occupancy of less than 10% are consistently unrecognizable as dynamical regions within the context of ±1 Hz of error.
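As a rough illustration of what SVD-based order tensor estimation involves, the sketch below fits the five independent elements of a traceless, symmetric order tensor to a set of RDCs by least squares. It is not the REDCRAFT implementation: it assumes the common linear model in which each RDC equals a scaling constant times v^T S v for a unit bond vector v, and the function names and the d_max parameter are hypothetical choices made for this example.

```python
import numpy as np


def design_matrix(unit_vectors):
    """Build one row per bond vector.

    Columns multiply the unknowns (Syy, Szz, Sxy, Sxz, Syz); Sxx is
    eliminated through the traceless condition Sxx = -Syy - Szz.
    """
    x, y, z = np.asarray(unit_vectors, dtype=float).T
    return np.column_stack(
        [y**2 - x**2, z**2 - x**2, 2 * x * y, 2 * x * z, 2 * y * z]
    )


def estimate_order_tensor(unit_vectors, rdcs, d_max=1.0):
    """Least-squares estimate of the 3x3 order tensor via an SVD pseudo-inverse.

    d_max is an assumed scaling constant for the bond type (hypothetical here).
    """
    a = design_matrix(unit_vectors)
    b = np.asarray(rdcs, dtype=float) / d_max
    # A = U diag(s) Vt, so the pseudo-inverse solution is V diag(1/s) Ut b.
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    keep = s > 1e-10 * s.max()  # drop near-zero singular values
    coeffs = vt[keep].T @ ((u[:, keep].T @ b) / s[keep])
    syy, szz, sxy, sxz, syz = coeffs
    return np.array(
        [[-syy - szz, sxy, sxz],
         [sxy, syy, syz],
         [sxz, syz, szz]]
    )


def back_calculate(unit_vectors, tensor, d_max=1.0):
    """Predict RDCs as d_max * v^T S v for each unit vector v."""
    v = np.asarray(unit_vectors, dtype=float)
    return d_max * np.einsum("ni,ij,nj->n", v, tensor, v)
```

At least five bond vectors with sufficiently different orientations are needed for the fit to be determined; comparing `back_calculate` output against the measured RDCs gives a simple check of how well a candidate structure fits the data.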

We also introduce a computational approach named REDCRAFT that allows for uncompromised and concurrent characterization of protein structure and dynamics. We have subjected DHFR (PDB ID 1RX2), a 159-residue protein, to a fictitious but plausible mixed-mode internal dynamics model. In this simulation, DHFR was segmented into 7 regions. The two dynamical and rigid-body segments experienced an average orientational modification of 7° and 12°, respectively. Observable RDC data for backbone C'-N, N-H, and C'-H were generated from 102 frames that described the molecular trajectory. The Dynamic Profile generated by REDCRAFT allowed for the recovery of individual fragments with a bb-rmsd of less than 1 Å and the identification of different dynamical regions of the protein. Following the recovery of fragments, structural assembly correctly assembled the four rigid fragments with respect to each other, categorized the two domains that underwent rigid-body dynamics, and identified one dynamical region for which no conserved structure can be defined. In conclusion, our approach successfully identified dynamical domains, recovered structure where it is meaningful, and assembled the domains relative to each other when possible.

How Smart City Infrastructure & Blockchain can Reduce Harmful Vehicle Emissions

Friday, March 18, 2022 - 02:20 pm
Swearingen Engineering Center in Room 2A31

Abstract

In 2020, Dr. Amari N. Lewis had the opportunity to conduct research at Aalborg University in Denmark. This talk on how smart city infrastructure and Blockchain can reduce harmful vehicle emissions revisits the research discoveries from Dr. Lewis' time in Denmark. As a result, we discovered two methodologies to reduce harmful vehicle emissions through the use of technology and were able to validate our theory through simulation. The first is a Blockchain Emissions Trading System (B-ETS). The second method involves traffic theory, smart infrastructure and dissuasion methods to improve air quality in residential areas.

 

Bio

Amari N. Lewis is currently a Postdoctoral Scholar in the Department of Computer Science & Engineering at the University of California San Diego. Her current research is in the area of Computer Science Education, studying the retention and experiences of students, with a focus on students from marginalized populations. Amari earned her PhD in Computer Science from the University of California, Irvine, where her research focused on technological advancement in transportation systems. In the last year of her PhD, she conducted research in Denmark at Aalborg University, where the research focused on the use of Blockchain in smart infrastructure.

 

Location:

In person

Swearingen Engineering Center in Room 2A31

 

Virtual MS Teams

Time

2:20-3:10pm

Graph Neural Network and Phylogenetic Tree Construction  

Friday, March 18, 2022 - 10:00 am

DISSERTATION DEFENSE

  

Author : Gaofeng Pan

Advisor : Dr. Jijun Tang

Date : March 18, 2022

Time : 10:00 am

Place : Virtual (Teams link below)

 

Meeting Link

 

Abstract

Deep learning has been widely used in computational biology research in the past few years. A great number of deep learning methods have been proposed to solve bioinformatics problems, such as gene function prediction, protein interaction classification, and drug effects analysis; most of these methods yield better solutions than traditional computing methods. However, few methods have been proposed to solve problems encountered in evolutionary biology research. In this dissertation, two neural network learning methods are proposed to solve the problems of genome location prediction and median genome generation encountered in phylogenetic tree construction; the ability of neural network learning models to solve evolutionary biology problems will be explored.

A phylogenetic tree represents the evolutionary relationships among genomes in an intuitive way; it can be constructed based on genomic phenotype or genomic genotype. Research on phylogenetic tree construction methods is meaningful to biology research. In the past decades, many methods have been proposed to analyze genome functions and predict genome subcellular locations in the cell. However, these methods have low accuracy on multi-subcellular localization. To improve prediction accuracy, this research defines a neural network learning model that extracts genome features from genome sequences and an evolution position matrix. Experimental results on two widely used benchmark datasets show that this model significantly improves on other currently available methods for multi-subcellular localization; deep neural networks are effective at solving genome location prediction.

Phylogenetic tree construction based on genomic genotype yields more accurate results than construction based on genomic phenotype. The most famous phylogenetic tree construction framework uses median genome algorithms to filter tree topology structures and update phylogenetic ancestral genomes. Currently, several median genome algorithms can be applied to short genomes and simple evolution patterns; however, when genomes become longer and evolution patterns more complex, these algorithms have unstable performance and exceptionally long running times. To lift these limitations, this research proposes a novel median genome generator based on a graph neural network learning model. With a graph neural network, genome rearrangement patterns and genome relations can be extracted from internal gene connections. Experimental results show that this generator obtains stable median genome results in constant time no matter how long or how complex the genomes are; its outstanding performance makes it the best choice in the GRAPPA framework for phylogenetic tree construction.

CASY 2.2

Friday, March 18, 2022 - 08:00 am
1112 Greene St, Columbia, South Carolina 29208

CASY 2.2 is a hybrid physical/virtual conference on the theme of collaborative assistants for the society. The conference will take place on March 18, 2022, virtually over Zoom and physically, in a limited, socially distanced manner, on the campus of the University of South Carolina, Columbia, SC, USA. The event is free to attend and is intended to promote the ethical usage of digital assistants in society for daily life activities.

Please check out the speakers and register using Eventbrite here. Also, see the schedule.

Automata-theoretic approaches to planning in robotics

Wednesday, March 16, 2022 - 01:00 pm
Storey Innovation Center 2267, 550 Assembly St.

DISSERTATION DEFENSE

Automata-theoretic approaches to planning in robotics: combinatorial filter minimization, planning to chronicle, temporal logic planning with soft specifications, and sensor selection for detecting deviations from a planned itinerary

 

Author : Hazhar Rahmani

Advisor : Jason O'Kane

Date : March 16, 2022

Time: 1:00 pm

Place: Storey Innovation Center 2267, 550 Assembly St.

Virtual (Zoom link): https://us02web.zoom.us/j/83006866662

 

Abstract

In this dissertation, we present a collection of new planning algorithms that use automata-theoretic methods to enable robots to achieve complex goals beyond simple point-to-point path planning. First, we consider the filter minimization (FM) problem and a variant of it, the filter partitioning minimization (FPM) problem, which aim to minimize combinatorial filters used for filtering and automata-theoretic planning in systems with discrete sensor data. We introduce a new variant of bisimulation, compatibility, and using this notion we identify several classes of filters for which FM or FPM is solvable in polynomial time, and propose several integer linear programming (ILP) formulations of FM and FPM. Then, we consider a problem, planning to chronicle, in which a robot is tasked with observing an uncertain time-extended process to produce a ‘chronicle’ of occurrent events that meets a given specification. This problem is useful in applications where we deploy robots to autonomously make structured videos or documentaries from events occurring in an unpredictable environment. Next, we study two variants of temporal logic planning in which the objective is to synthesize a trajectory that satisfies an optimal selection of soft constraints while nevertheless satisfying a hard constraint expressed in linear temporal logic (LTL). We also extend planning to chronicle using the idea behind this problem. Finally, we consider the problem of planning where to observe the behavior of an agent to ensure that the agent’s execution within the environment fits a pre-disclosed itinerary. This problem arises in a range of contexts, including validating safety claims about robot behavior, applications in security and surveillance, and both the conception and the (physical) design and logistics of scientific experiments.

Empowering translational data science through large-scale data harmonization

Friday, March 4, 2022 - 02:20 pm

Abstract

Data silos are a major problem in biomedical research. Our data commons and ecosystems framework helps to break down data silos and empower translational research. To do this, data must be harmonized through standardized workflows, references, annotations, and/or mapped to standardized ontologies. I helped lead the design and implementation of a petabyte-scale “omics” data harmonization system utilized in the National Cancer Institute’s Genomic Data Commons. I present some background in data commons, describe our automation system, and provide some highlights of other work my colleagues and I are doing in the Center for Translational Data Science at the University of Chicago. Our work showcases the importance of interdisciplinary collaboration and exciting opportunities for computer science, data science, and biomedical research.

 

Bio

Kyle Hernandez, Ph.D. is a Research Associate Professor of Medicine in the Section of Biomedical Data Science, a Co-PI of the VA Data Commons, and a Manager of Bioinformatics in the Center for Translational Data Science at the University of Chicago. He was a key contributor to the design and development of the large-scale workflow automation system used in the National Cancer Institute Genomic Data Commons (GDC) as well as the development of many of the workflows it uses. His research interests include workflow engines, the genetic architecture of complex phenotypes, the integration of multiple data types, reproducibility, and more recently the complex issues of EHR data curation and extraction. He earned his Ph.D. in Ecology, Evolution, and Population Biology at Purdue University and was an NSF Postdoctoral Fellow at the University of Texas. He joined the University in 2013 and has been affiliated with the Center since 2016.

 

Location:

In person

Swearingen Engineering Center in Room 2A31 

Virtual MS Teams

ACM Code-A-Thon

Friday, February 25, 2022 - 07:00 pm

The bi-annual Association for Computing Machinery Code-A-Thon is this week, February 25th-26th. It is a competition open to current and former USC students and alumni, regardless of field of study. It is divided into four divisions: 145, 146, 240, and 350. Challenges align with the concepts taught at the corresponding CSCE levels. For example, lower-division problems might ask participants to use data structures or perform specific edge-case checking; upper-division problems might ask participants to process data in O(log n) time.

The ACM Code-A-Thon is a great social event for students to congregate while enjoying pizza, soft drinks, collaboration, and honestly, coding outside of class. It’s also rigorous. The social event starts on Feb 25th at 7pm and lasts for an hour until 8pm when the competition actually starts. The competition lasts for 24 hours. Each division has about four to seven problems with varying point totals that reflect the difficulty of the problem. Points are awarded by the number of test cases a submitted solution passes. Winners are determined by the most points awarded; in the event of a tie at any particular level the tie-break goes to the person who completed the event fastest. 
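The scoring and tie-break rules described above can be sketched in a few lines of Python. This is only an illustration, not the organizers' actual scoring system: the `Entry` fields, the proportional per-problem rule in `problem_points`, and the use of minutes as the time unit are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    points: int          # total points earned from passed test cases
    finish_minutes: int  # assumed: minutes from the start to the final submission


def problem_points(max_points, passed, total):
    """Hypothetical rule: points scale with the share of test cases passed."""
    return max_points * passed // total


def rank(entries):
    """Order one division: most points first, then the fastest finisher."""
    return sorted(entries, key=lambda e: (-e.points, e.finish_minutes))


# Example with made-up participants in a single division.
division = [
    Entry("alice", 30, 600),
    Entry("bob", 45, 900),
    Entry("carol", 45, 700),
]
standings = [e.name for e in rank(division)]
```

Here carol and bob tie on points, so carol places first because she finished sooner. Each division would be ranked separately in the same way.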

To make the event more enticing to the public, gift cards of $100, $75, and $50 are awarded to the first-, second-, and third-place finishers at each level. Also, in the past, instructors have awarded extra credit points to people who participated in the event. ACM will help verify participation by keeping an attendance sheet with the names of participants and the instructor from whom they want extra credit. It is ACM’s suggestion that, should any instructor offer extra credit, they tie it to points earned by the competing student, to avoid a situation where dozens of students sign up to compete but don’t attempt a solution.

Again, this is meant to be collaborative. There will be channels for participants to ask questions and discuss potential solutions, or just throw stuff at the wall and see what sticks. We encourage discussion.

Please reach out with any questions you may have about the Code-A-Thon. Blake Seekings (seekingj@email.sc.edu) is the best person to contact.

Students who wish to attend, please fill out this form or get more information here.