A Reduced Description of Transient Stochastic Thermo-Fluid Systems

Monday, December 10, 2018 - 01:00 pm
Speaker: Hessam Babaee, Ph.D.
Location: Innovation Center, Room 2277
Time: Dec. 10, 13:00-14:00
Abstract: Highly convective thermo-fluid systems exhibit a phenomenon that is difficult to predict: transient instabilities. While these instabilities have finite lifetimes, they can play a crucial role either by altering the system dynamics through the activation of other instabilities or by creating sudden nonlinear energy transfers that lead to extreme responses. However, their essentially transient character makes their description a particularly challenging task. We develop a minimization framework that focuses on the optimal approximation of the system dynamics in the neighbourhood of the system state. This minimization formulation results in differential equations that evolve a time-dependent basis so that it optimally approximates the most unstable directions. Several thermo-fluid demonstration cases will be presented that show the performance of the presented method.
Bio: Dr. Hessam Babaee is an expert in the areas of hydrodynamic instability, uncertainty quantification, reduced-order modeling, and high-performance computing. He is currently a tenure-stream Assistant Professor in the Swanson School of Engineering at the University of Pittsburgh and a Research Scientist in the Mechanical Engineering Department at the Massachusetts Institute of Technology (MIT). Prior to joining the University of Pittsburgh, he was a Postdoctoral Associate at MIT. He received his PhD in Mechanical Engineering and a Master's degree in Applied Mathematics from Louisiana State University, both awarded in 2013.
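To give a feel for what "differential equations that evolve a time-dependent basis" can look like, here is a minimal sketch. It is an illustration only, not the speaker's formulation: it assumes a constant linear operator `A` (a hypothetical stand-in for the linearized dynamics) and evolves an orthonormal basis `U` by dU/dt = (I - U U^T) A U, one simple flow whose basis drifts toward the fastest-growing directions of `A`. The names `evolve_basis`, `A`, and `U0` are made up for this example.

```python
import numpy as np

def evolve_basis(A, U0, dt, n_steps):
    """Integrate dU/dt = (I - U U^T) A U with forward Euler steps.

    A  : (n, n) linear operator (assumed constant here for simplicity)
    U0 : (n, r) initial basis with orthonormal columns
    """
    U = U0.copy()
    for _ in range(n_steps):
        AU = A @ U
        # Remove the component of A U that lies inside span(U).
        U = U + dt * (AU - U @ (U.T @ AU))
        # Euler steps drift away from orthonormality; QR restores it.
        U, _ = np.linalg.qr(U)
    return U

# Hypothetical 4-dimensional system: two growing and two decaying directions.
A = np.diag([1.0, 0.5, -1.0, -2.0])
rng = np.random.default_rng(0)
U0, _ = np.linalg.qr(rng.standard_normal((4, 2)))
U = evolve_basis(A, U0, dt=0.01, n_steps=2000)
```

In this toy setting the two columns of `U` end up spanning the two growing directions of `A`; in a real transient problem the operator would change in time and the basis would have to keep tracking it.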

A Flow Feature Detection Framework for Massive Computational Data Analytics

Friday, December 7, 2018 - 01:00 pm
Storey Innovation Center (Room 2277)
Dr. Yi Wang from the Department of Mechanical Engineering, University of South Carolina, will give a talk on Friday, Dec. 7th in the Storey Innovation Center (Room 2277) from 13:00 to 14:00.
Abstract: In this seminar, a framework based on the incremental proper orthogonal decomposition (iPOD) and data mining for integrated analysis of large-scale computational data will be presented for targeted data visualization, discovery, and learning. Four key components will be introduced: (1) iPOD based on the mean value and the subspace updating method to incrementally reduce data dimensions, decouple the time-averaged and time-varying flow structures, and extract coherent structures and modes in massive Computational Fluid Dynamics (CFD) data; (2) data mining to classify the flow regions of similar dynamic characteristics and identify the candidate and global ROIs (GROIs) for focused analysis; (3) feature detection to capture flow features of interest and determine ultimate ROIs (UROIs); and (4) selective storage and targeted visualization of data in UROIs. Case studies on vortex and shock wave detection, which are of significant interest to aerospace and defense applications, will be presented to demonstrate the framework. Computational performance of the framework in terms of data volume, reduction ratio, resource usage, and storage requirements will also be discussed. Our quantitative results show that iPOD is able to process large datasets that overwhelm the traditional batch POD, leading to 4-16X data reduction in the temporal domain through spectral projection. Through data mining, 50% to 70% of the spatial domain with a high probability of flow feature occurrence is identified as candidate GROIs for efficient, confined feature detection. Key features in the UROI, consisting of only 2% to 30% of the original data, are successfully captured by our feature detection algorithms and can be selectively stored and visualized for targeted discovery and learning. In contrast to batch POD, iPOD reduces physical memory usage by more than 10X and processing time by up to 75%, and is far more appropriate for large data analytics.
Biography: Yi Wang obtained his B.S. and M.S. in Machinery and Energy Engineering from Shanghai Jiao Tong University, P.R. China in 1998 and 2000, respectively, and his Ph.D. in Mechanical Engineering from Carnegie Mellon University in 2005. Currently he is an associate professor of mechanical engineering and the principal investigator (PI) of the Integrated Multiphysics & Systems Engineering Laboratory (iMSEL) at the University of South Carolina. He has served as a PI or a Co-PI on multiple DoD-, MDA-, NASA-, and NIH-funded projects to develop advanced methodologies and techniques in computational and data-enabled science and engineering (CDS&E), including reduced order modeling, data reduction, large-scale and/or real-time data analytics, hierarchical system-level simulation, and system engineering. The applications of these technologies span spacecraft and missile thermal analysis, aeroservoelasticity and aerothermoservoelasticity, massive computational data management, real-time flight load data processing, and integrated multi-scale fluidics systems (design, fabrication, and experimentation) for environmental monitoring, biodefense, and regenerative medicine. He has coauthored 4 book chapters and 80 journal and conference publications. He is also the co-inventor of 5 patents.
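The abstract's idea of updating the mean value and the POD subspace one snapshot at a time can be sketched as follows. This is a generic rank-one incremental SVD update with a running mean, offered as an assumption about what such a scheme might look like rather than the speaker's actual iPOD algorithm; the function names `ipod_update` and `ipod`, and the `rank` and `tol` parameters, are hypothetical.

```python
import numpy as np

def ipod_update(mean, n, U, s, x, rank, tol=1e-12):
    """Fold one new snapshot x into a running mean and POD basis.

    mean : (d,) mean of the n snapshots seen so far
    U, s : (d, k) POD modes and (k,) singular values of the centered data
    """
    # Rank-one term that x adds to the centered data once the mean shifts.
    c = np.sqrt(n / (n + 1.0)) * (x - mean)
    new_mean = mean + (x - mean) / (n + 1.0)
    p = U.T @ c                # part of c inside the current subspace
    r = c - U @ p              # part of c outside it
    rho = np.linalg.norm(r)
    k = s.size
    # Small (k+1) x (k+1) matrix whose SVD gives the updated modes.
    K = np.zeros((k + 1, k + 1))
    K[:k, :k] = np.diag(s)
    K[:k, k] = p
    K[k, k] = rho
    Uk, sk, _ = np.linalg.svd(K)
    q = r / rho if rho > tol else np.zeros_like(r)
    U_new = np.hstack([U, q[:, None]]) @ Uk
    # Grow the basis only if x added a new direction, then truncate to rank.
    keep = min(rank, k + (1 if rho > tol else 0))
    return new_mean, n + 1, U_new[:, :keep], sk[:keep]

def ipod(snapshots, rank):
    """Process the columns of snapshots (d, N) one at a time."""
    d = snapshots.shape[0]
    mean, n = snapshots[:, 0].astype(float), 1
    U, s = np.zeros((d, 0)), np.zeros(0)
    for j in range(1, snapshots.shape[1]):
        mean, n, U, s = ipod_update(mean, n, U, s, snapshots[:, j], rank)
    return mean, U, s
```

Only the current mean, `rank` modes, and one snapshot are held in memory at a time, which is the kind of property that lets an incremental method handle datasets too large for a batch decomposition. When `rank` is smaller than the true rank of the data, the result is an approximation of the batch POD.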

Guest Speakers: Scott McNealy and Bob Cooper

Monday, November 26, 2018 - 06:00 pm
Storey Innovation Center (Room 1400)
Nov. 26th, 6pm EST – 9pm EST
M. Bert Storey Engineering and Innovation Center, 550 Assembly St, Columbia, SC 29201 (Room 1400)
6:30pm: Scott McNealy. He hardly requires an introduction, but here it is. Scott McNealy is an outspoken advocate for personal liberty, small government, and free-market competition. In 1982, he co-founded Sun Microsystems and served as CEO and Chairman of the Board for 22 years. He piloted the company from startup to legendary Silicon Valley giant in computing infrastructure, network computing, and open source software.
7:30pm: Bob Cooper, CEO of the local company Swampfox, who also has a long history in the field. For example, Bob was the CTO of Conita, a company focused on creating software-based “Personal Virtual Assistants” – think Apple Siri, but 15 years ago.

Scott McNealy (@ScottMcNealy)

  • Co-Founder, Former Chairman of the Board, and CEO, Sun Microsystems, Inc.
  • Co-Founder and Board Member, Curriki
  • Co-Founder and Executive Chairman of the Board, Wayin
  • Board Member, San Jose Sharks Sports and Entertainment
Scott McNealy is an outspoken advocate for personal liberty, small government, and free-market competition. In 1982, he co-founded Sun Microsystems and served as CEO and Chairman of the Board for 22 years. He piloted the company from startup to legendary Silicon Valley giant in computing infrastructure, network computing, and open source software. Today McNealy is heavily involved in advisory roles for companies that range from startup stage to large corporations, including Curriki and Wayin. Curriki (curriculum + wiki) is an independent 501(c)(3) organization working toward eliminating the education divide by providing free K-12 curricula and collaboration tools through an open-source platform. Wayin, the Digital Campaign CMS platform, enables marketers and agencies to deliver authentic interactive campaign experiences across all digital properties including web, social, mobile and partner channels. Wayin services more than 300 brands across 80 countries and 10 industries. Scott McNealy is an enthusiastic ice hockey fan and an avid golfer with a single-digit handicap. He resides in California with his wife, Susan, and they have 4 sons.
  • BA, Harvard, 1976
  • MBA, Stanford, 1980

Bob Cooper, CEO

Bob Cooper is the CEO of Swampfox Technology. Swampfox specializes in call center automation, and its software automates many of the transactions of many of the largest cable, energy and health care companies in the US. Prior to starting Swampfox, Bob was Chief Architect at Avaya and was in charge of their call center self-service platform offering, including Voice/Experience Portal, ICR and OneX Speech. During his time at Avaya this self-service platform grew to become the #1 selling IVR platform in the US. Prior to this, Bob was the CTO of Conita, a company focused on creating software-based “Personal Virtual Assistants” – think Apple Siri, but 15 years ago. Bob holds many patents in the areas of computer architecture and voice user interface design. He received his undergraduate and graduate engineering degrees from the University of Florida and teaches electrical engineering as needed at the University of South Carolina. He’s married and has four children.

Algorithms for Robot Coverage Under Movement and Sensing Constraints

Friday, November 16, 2018 - 03:00 pm
Meeting room 2265, Innovation Center
DISSERTATION DEFENSE
Author: Jeremy Lewis
Advisor: Dr. Jason O’Kane
Date: Nov. 16th, 2018
Time: 3:00 pm
Place: Meeting room 2265, Innovation Center
Abstract: This thesis explores the problem of generating coverage paths (that is, paths that pass within some sensor footprint of every point in an environment) for mobile robots. It considers both models in which navigation is a solved problem but motions are constrained, and models in which navigation must be considered along with coverage planning because of the robot's unreliable sensing and movements. The motion constraint we adopt for the former is a common one, that of a Dubins vehicle. We extend previous work that solves this coverage problem as a traveling salesman problem (TSP) by introducing a practical heuristic algorithm to reduce runtime while maintaining near-optimal path length. Furthermore, we show that generating an optimal coverage path is NP-hard by reducing from the Exact Cover problem, which provides justification for our algorithm's conversion of Dubins coverage instances to TSP instances. Extensive experiments demonstrate that the algorithm does indeed produce path lengths comparable to optimal in significantly less time. In the second model, we consider the problem of coverage planning for a particular type of very simple mobile robot. The robot must be able to translate in a commanded direction (specified in a global reference frame), with bounded error on the motion direction, until reaching the environment boundary. The objective, for a given environment map, is to generate a sequence of motions that is guaranteed to cover as large a portion of that environment as possible, in spite of the severe limits on the robot's sensing and actuation abilities. We show how to model the knowledge available to this kind of robot about its own position within the environment, show how to compute the region whose coverage can be guaranteed for a given plan, and characterize regions whose coverage cannot be guaranteed by any plan. We also describe an algorithm that generates coverage plans for this robot, based on a search across a specially constructed graph. Simulation results demonstrate the effectiveness of the approach.

Exploring Machine Learning Techniques to Improve Peptide Identification

Wednesday, November 14, 2018 - 03:00 pm
Meeting room 2265, Innovation Center
THESIS DEFENSE
Department of Computer Science and Engineering, University of South Carolina
Author: Fawad Kirmani
Advisor: Dr. John Rose
Date: Nov. 14th, 2018
Time: 3:00 pm
Place: Meeting room 2265, Innovation Center
Abstract: The goal of this work is to improve proteotypic peptide prediction with lower processing time and better efficiency. Proteotypic peptides are the peptides in a protein sequence that can be confidently observed by mass-spectrometry-based proteomics. One of the widely used methods for identifying peptides is tandem mass spectrometry (MS/MS). The peptides that need to be identified are compared with the accurate mass and elution time (AMT) tag database. The AMT tag database helps reduce the processing time and increases the accuracy of the identified peptides. Prediction of proteotypic peptides for AMT studies has improved rapidly in recent years, using amino acid properties like charge, code, solubility and hydropathy. We describe an improved version of a support vector machine (SVM) classifier that achieves classification sensitivity, specificity and AUC on Yersinia pestis, Saccharomyces cerevisiae and Bacillus subtilis str. 168 datasets similar to those described by Webb-Robertson et al. [13] and Ahmed Alqurri [10]. The improved version of the SVM classifier uses a C++ SVM library instead of the MATLAB built-in library. We describe how we achieved these similar results with much less processing time. Furthermore, we tested four machine learning classifiers on the Yersinia pestis, Saccharomyces cerevisiae and Bacillus subtilis str. 168 data. We performed feature selection from scratch, using four different algorithms, to achieve better results from the different machine learning algorithms. Some of these classifiers gave similar or better results than the SVM classifiers with fewer features. We describe the results of these four classifiers with different feature sets.
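For readers unfamiliar with the three metrics named in the abstract, the sketch below shows how sensitivity, specificity, and AUC are computed for a binary classifier. It is a generic illustration with made-up labels and scores, not the thesis code; the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: two proteotypic (1) and two non-proteotypic (0) peptides.
labels = [1, 1, 0, 0]
predictions = [1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
sens, spec = sensitivity_specificity(labels, predictions)
area = auc(labels, scores)
```

Sensitivity measures how many true proteotypic peptides are recovered, specificity how many non-proteotypic peptides are correctly rejected, and AUC summarizes ranking quality across all decision thresholds.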

Phylogeny, Ancestral Genome, and Disease Diagnoses Models Constructions using Biological Data

Monday, November 12, 2018 - 12:00 pm
Meeting room 2267, Innovation Center
DISSERTATION DEFENSE
Department of Computer Science and Engineering, University of South Carolina
Author: Bing Feng
Advisor: Dr. Jijun Tang
Date: Nov. 12th, 2018
Time: 12:00 pm
Place: Meeting room 2267, Innovation Center
Abstract: Bioinformatics studies develop methods and software tools to analyze biological data and provide insight into the mechanisms of biological processes. Machine learning techniques have been widely used by researchers for disease prediction, disease diagnosis, and bio-marker identification. Using machine learning algorithms to diagnose diseases has several advantages. Instead of relying solely on doctors’ experience and stereotyped formulas, researchers can use learning algorithms to analyze sophisticated, high-dimensional and multimodal biomedical data, and construct prediction/classification models that make decisions even when some information is incomplete, unknown, or contradictory. In this study, we first build an automated computational pipeline to reconstruct phylogenies and ancestral genomes for two high-resolution real yeast whole genome datasets. Comparison of the results with recent studies and publications shows that we reconstruct very accurate and robust phylogenies and ancestors. We also identify and analyze conserved syntenic blocks among reconstructed ancestral genomes and present-day yeast species. Next, we analyze a metabolic-level dataset obtained from the positive mass spectrometry of human blood samples. We apply machine learning algorithms and feature selection algorithms to construct diagnosis models of chronic kidney disease (CKD). We also identify the most critical metabolite features and study the correlations between the metabolite features and the development of CKD stages. The selected metabolite features provide insights into early-stage CKD diagnosis, pathophysiological mechanisms, CKD treatments and medicine development. Finally, we use deep learning techniques to build accurate Down Syndrome (DS) prediction/screening models based on the analysis of a newly introduced Illumina human genome genotyping array. We propose a bi-stream convolutional neural network (CNN) architecture with nine layers and two merged CNN models, which takes two input chromosome SNP maps in combination. We evaluate and compare the performance of our CNN DS prediction models with conventional machine learning algorithms and single-stream CNN models. We visualize the feature maps and trained filter weights from intermediate layers of our trained CNN model. We further discuss the advantages of our method and the underlying reasons for the performance differences.

WiC: Internship preparation, Resume reviews, and LinkedIn headshots

Tuesday, November 6, 2018 - 07:00 pm
Room 2277, IBM Innovation Center/Horizon 2
You are invited to join the Women in Computing November event on Tuesday, Nov. 6. Pizza will be provided, and everyone, of all genders and majors, is welcome!
Topic: Professional Development
When: Tuesday, November 6th, starting at 7 pm
Where: Room 2277, IBM Innovation Center/Horizon 2 (the building next to the Strom Thurmond Fitness Center that has the IBM logo on the side)
Main agenda: Internship preparation, Resume reviews, and LinkedIn headshots

Machine Learning Based Disease Gene Identification and MHC Immune Protein-peptide Binding Prediction

Monday, October 29, 2018 - 09:00 am
Meeting room 2267, Innovation Center
DISSERTATION DEFENSE
Author: Zhonghao Liu
Advisor: Dr. Jianjun Hu
Date: Oct. 29th, 2018
Time: 9:00 am
Place: Meeting room 2267, Innovation Center
Abstract: Machine learning and deep learning methods have been increasingly applied to solve challenging and important bioinformatics problems such as protein structure prediction, disease gene identification, and drug discovery. However, the performance of existing machine-learning-based predictive models is still not satisfactory. The question of how to exploit the specific properties of bioinformatics data and couple them with the unique capabilities of the learning algorithms remains open. In this dissertation, we propose advanced machine learning and deep learning algorithms to address two important problems: mislocation-related cancer gene identification and major histocompatibility complex-peptide binding affinity prediction. Our first contribution is a kernel-based logistic regression algorithm for identifying potential mislocation-related genes among known cancer genes. Our algorithm takes protein-protein interaction networks, gene expression data, and subcellular location gene ontology data as input, which makes it particularly lightweight compared with existing methods. The experimental results demonstrate that our proposed pipeline has a good capability to identify mislocation-related cancer genes. Our second contribution addresses the modeling and prediction of human leukocyte antigen (HLA) peptide binding in the human immune system. We present an allele-specific convolutional neural network model with one-hot encoding. Extensive evaluation on the standard IEDB datasets shows that the performance of our model is better than that of all existing prediction models. To achieve further improvement, we propose a novel pan-specific model for peptide-HLA class I binding affinity prediction, which allows us to exploit all the training samples of different HLA alleles. Our sequence-based pan model is currently the only algorithm not using pseudo sequence encoding, a dominant structure-based encoding method in this area. The benchmark studies show that our method can achieve state-of-the-art performance. Our proposed model could be integrated into existing ensemble methods to improve their overall prediction capabilities on highly diverse MHC alleles. Finally, we present an LSTM-CNN deep learning model with an attention mechanism for prediction of peptide-HLA class II binding affinities and binding cores. Our model achieved very good performance and outperformed existing methods on half of the tested alleles. With the help of the attention mechanism, our model can directly output the peptide binding core based on attention weights without any additional post- or pre-processing.
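The one-hot encoding mentioned in the abstract turns a peptide string into a numeric matrix that a convolutional network can consume. The sketch below is a generic illustration, not the dissertation's code; the alphabet ordering, the function name `one_hot_peptide`, and the example peptide are all assumptions made for this example.

```python
import numpy as np

# The 20 standard amino acids in alphabetical one-letter order
# (the ordering is an arbitrary choice for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_peptide(peptide):
    """Encode a peptide as a (length, 20) matrix with one 1 per row."""
    enc = np.zeros((len(peptide), len(AMINO_ACIDS)))
    for pos, aa in enumerate(peptide):
        enc[pos, AA_INDEX[aa]] = 1.0
    return enc

# Hypothetical 9-residue peptide, a typical length for HLA class I binders.
encoded = one_hot_peptide("ACDEFGHIK")
```

Each row marks which amino acid occupies that position, so the encoding carries sequence information only and no structural information about the HLA molecule.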

ACM Student Code-A-Thon

Friday, October 26, 2018 - 07:00 pm
Swearingen 1D11
ACM is hosting a 24-hour Code-A-Thon on Friday, October 26th at 7 PM in Swearingen 1D11 (and also online). The Code-A-Thon is open to all majors and all skill levels. If you can’t make the opening event, compete online here:
http://www.hackerrank.com/usc-acm-fall-2018-145-division
http://www.hackerrank.com/usc-acm-fall-2018-146-division
http://www.hackerrank.com/usc-acm-fall-2018-240-division
http://www.hackerrank.com/usc-acm-fall-2018-350-division
Pick the division of the highest CS course you are enrolled in or have taken (out of 145, 146, 240, and 350). If you haven’t taken any of these CS classes, take the 145 division. If you are a CS graduate student, or have already taken 350, compete in the 350 division. You can pick a lower division, but you won’t be able to compete for prizes. Prizes are:
  • 32GB flash drive & wireless keyboard
  • Raspberry Pi 3 B+
  • 100,000 mAh mobile battery
Let me know if you have any questions,
James Coman
President, ACM
University of South Carolina | Class of 2019
ACM Friendface Page

Uncertainty Estimation of Deep Neural Networks

Monday, October 15, 2018 - 02:30 pm
Meeting room 2267, Innovation Center
DISSERTATION DEFENSE
Department of Computer Science and Engineering, University of South Carolina
Author: Chao Chen
Advisor: Dr. Gabriel Terejanu
Date: Oct. 15th, 2018
Time: 2:30 pm
Place: Meeting room 2267, Innovation Center
Abstract: Standard neural networks trained with gradient descent and back-propagation have achieved great success in various applications. On the one hand, point estimation of the network weights is prone to over-fitting and lacks important uncertainty information associated with the estimate. On the other hand, exact Bayesian neural network methods are intractable and not applicable to real-world problems. To date, approximate methods have been actively developed for Bayesian neural networks, including but not limited to stochastic variational methods, Monte Carlo dropout, and expectation propagation. Though these methods are applicable to current large networks, they are limited by either underestimation or overestimation of uncertainty. Extended Kalman filters (EKFs) and unscented Kalman filters (UKFs), which are widely used in the data assimilation community, adopt a different perspective on inferring the parameters. Nevertheless, EKFs are incapable of dealing with high nonlinearity, while UKFs are inapplicable to large network architectures. Ensemble Kalman filters (EnKFs) serve as a powerful methodology in the atmospheric and oceanographic disciplines, targeting extremely high-dimensional, non-Gaussian, and nonlinear state-space models. So far, there is little work that applies EnKFs to estimate the parameters of deep neural networks. By considering a neural network as a nonlinear function, we augment the network prediction with the parameters as new states and adapt the state-space model to update the parameters. In the first work, we describe the ensemble Kalman filter and two proposed algorithms for training both fully-connected and Long Short-term Memory (LSTM) networks, and evaluate them on a synthetic dataset, 10 UCI datasets, and a natural language dataset for different regression tasks. To further evaluate the effectiveness of the proposed training scheme, we trained a deep LSTM network with the proposed algorithm and applied it to five real-world sub-event detection tasks. With a formalization of the sub-event detection task, we develop an outlier detection framework and take advantage of the Bayesian LSTM network to capture the important and interesting moments within an event. In the last work, we develop a framework for student knowledge estimation using a Bayesian network. By constructing student models with a Bayesian network, we can infer the new state of knowledge on each concept for a given student. With a novel parameter estimation algorithm, the model can also indicate misconceptions on each question. Furthermore, we develop a predictive validation metric based on the expected data likelihood of the student model to evaluate the design of questions.
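The idea of using an ensemble Kalman filter to update model parameters, as outlined in the abstract, can be sketched with the standard stochastic EnKF analysis step. This is a textbook-style illustration under stated assumptions, not the dissertation's algorithms: the model is a small linear regression standing in for a neural network, and `enkf_update`, `predict`, `obs_var`, and the demo data are all hypothetical.

```python
import numpy as np

def enkf_update(theta_ens, predict, y_obs, obs_var, rng):
    """One stochastic EnKF analysis step on an ensemble of parameters.

    theta_ens : (p, N) ensemble of parameter vectors
    predict   : function mapping a (p,) parameter vector to (m,) predictions
    y_obs     : (m,) observed targets
    obs_var   : scalar observation noise variance
    """
    n_ens = theta_ens.shape[1]
    y_ens = np.column_stack([predict(theta_ens[:, j]) for j in range(n_ens)])
    theta_anom = theta_ens - theta_ens.mean(axis=1, keepdims=True)
    y_anom = y_ens - y_ens.mean(axis=1, keepdims=True)
    # Sample cross-covariance (parameters vs. predictions) and prediction covariance.
    c_ty = theta_anom @ y_anom.T / (n_ens - 1)
    c_yy = y_anom @ y_anom.T / (n_ens - 1)
    r = obs_var * np.eye(y_obs.size)
    # Each member is compared against its own noisy copy of the observations.
    y_pert = y_obs[:, None] + rng.normal(0.0, np.sqrt(obs_var), size=y_ens.shape)
    # Kalman gain c_ty (c_yy + r)^{-1}, computed with a linear solve.
    gain = np.linalg.solve(c_yy + r, c_ty.T).T
    return theta_ens + gain @ (y_pert - y_ens)

# Toy demo: recover the weights of a linear model y = X @ theta.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2))
theta_true = np.array([2.0, -1.0])
obs_var = 0.01
y_obs = X @ theta_true + rng.normal(0.0, np.sqrt(obs_var), size=20)
prior = rng.standard_normal((2, 200))
posterior = enkf_update(prior, lambda th: X @ th, y_obs, obs_var, rng)
```

The update needs only forward evaluations of the model, no gradients, and the spread of the posterior ensemble gives a direct estimate of parameter uncertainty.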