Cybersecurity Club First Meeting

Wednesday, September 6, 2017 - 06:00 pm
Swearingen
Come join us for our first meeting of the semester in Amoco Hall! We'll be discussing events throughout the school year, voting on future meeting times, and watching a demonstration on how to turn your pets into mobile hacking machines.

An Application of Natural Language Processing: Analyzing student essays as a big-data project

Friday, September 1, 2017 - 02:20 pm
Swearingen room 2A14
I would like to invite you to attend this week's CSCE 791 seminar. These seminars highlight research being performed in our department and across the world. All CSCE 791 seminars are open to anybody who wishes to attend - not just students registered for the course.
Friday, September 1, 2:20 - 3:10 PM, Swearingen room 2A14
Speaker: Dr. Duncan Buell, University of South Carolina
Abstract: First-year students at most large universities take required courses whose purpose is to teach them to write prose essays and make arguments. We have acquired more than 7000 pairs of draft-and-final essays from USC and have been analyzing them. We are not trying to do “machine grading” of essays as an AI project. Rather, we are trying to identify features of writing that can be quantified and thus processed with programs as a big-data analysis. We are interested in the extent to which students revise their draft essays to become final versions. We are also interested in comparing our student writing against other genres of writing. For this last we use the Corpus of Contemporary American English (COCA) as source data. The COCA is a corpus of more than 500 million words of text separated into genres such as academic writing, magazine writing, and transcripts of spoken English and interviews. Our eventual goal is to situate student writing relative to other genres and thus to help improve the pedagogy of teaching writing; knowing what the students are actually writing now is key to knowing how to get them to write formal prose effectively. Programming is done in Python. Part-of-speech tagging is done using the CLAWS package from the University of Lancaster in the UK. Sentence parsing is done using the package from Dan Jurafsky’s lab at Stanford.
Bio: Duncan A. Buell is a Professor in the Department of Computer Science and Engineering at the University of South Carolina. His Ph.D. is in mathematics from the University of Illinois at Chicago (1976). He was department chair at USC from 2000 to 2009, and in 2005-2006 was interim dean. He has done research in document retrieval, computational number theory, and parallel computing, and has more recently turned to digital humanities as one of the emerging “marketplace” applications for computing. He is engaged with First Year English at USC on the analysis of freshman English essays, searching for an understanding of actual student writing in an effort to improve pedagogy for first-year English instruction. He has team-taught four times with Dr. Heidi Rae Cooley on the presentation of unacknowledged history on mobile devices, and he and Dr. Cooley are actively engaged in finding ways to go beyond text to fully enable the use of visual media in mobile applications that present humanities content, especially content that might normally remain unacknowledged by institutional authority.
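As a rough illustration of what "the extent to which students revise their draft essays" could look like as a quantifiable feature, the following minimal Python sketch scores a draft/final pair by word-level difference. The function name `revision_ratio` and the choice of `difflib.SequenceMatcher` are assumptions made for illustration only; the abstract does not say how the project actually measures revision.

```python
import difflib


def revision_ratio(draft: str, final: str) -> float:
    """Hypothetical revision score: 0.0 means identical, 1.0 means no shared words.

    Assumption: revision is approximated by word-level sequence similarity,
    which is not necessarily the measure used in the seminar's project.
    """
    draft_words = draft.split()
    final_words = final.split()
    # autojunk=False disables the popularity heuristic so long essays
    # with frequent words are compared token by token.
    matcher = difflib.SequenceMatcher(a=draft_words, b=final_words, autojunk=False)
    return 1.0 - matcher.ratio()
```

A pair whose final version is unchanged from the draft scores 0.0, and a pair with no words in common scores 1.0; values in between give a crude ordering of how heavily each essay was revised.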

Women in Computing Welcome Meeting

Monday, August 28, 2017 - 06:00 pm
Faculty Lounge at Swearingen 1A03
Join Women in Computing tonight. We welcome everyone - all genders and majors! The event will be held tonight in the Faculty Lounge at Swearingen 1A03, from 7:00 - 8:30 pm. Today’s main agenda item is an information session on the Grace Hopper Conference. The Department of Computer Science and Engineering will be sending a group of students to the 2017 Grace Hopper Celebration of Women in Computing, which will take place Wednesday, October 4 through Friday, October 6, in Orlando, Florida (http://ghc.anitaborg.org/). The application deadline is tonight. We will talk about the application and selection process. The club has invited some previous attendees to come to the meeting and share their experiences.

Ward One App Presentation

Tuesday, April 18, 2017 - 04:30 pm
Booker T. Washington Auditorium (1400 Wheat Street)
It’s that time of year. Students enrolled in the cross-College Critical Interactives class (CSCE 571, FAMS 511, MART 591, FAMS 710, CSCE 790) have been developing the 2017 version of the Ward One App, which they will present and demonstrate on Tuesday, April 18, 2017, from 4:30 to 5:30 pm in Booker T. Washington Auditorium. We invite you to join us for the event. The presentation and demonstration will be followed by Q&A.

Backers and Hackers Demo Day

Thursday, April 13, 2017 - 06:00 pm
Sonoco Pavilion at the Darla Moore School of Business
The Entrepreneurship Club at USC hosts an event called Backers & Hackers, a program that brings together computer science and business students, along with Columbia’s entrepreneurial community, to transform app ideas into working mobile applications. Last year at our demo day, we showcased 13 mobile apps and had 100 attendees, 15 sponsors, and 4 investors. Backers & Hackers Demo Day will take place at the Sonoco Pavilion at the Darla Moore School of Business on April 13th at 6:00 PM. Food will be provided. https://tockify.com/incubator/detail/113/1492120800000

Robustness Evaluation for Phylogenetic Reconstruction methods and Evolutionary Models Reconstruction of Tumor Progression

Thursday, April 6, 2017 - 02:40 pm
3A75, Swearingen
DISSERTATION DEFENSE
Department of Computer Science and Engineering, University of South Carolina
Author: Jun Zhou
Advisor: Jijun Tang
Date: April 6th
Time: 2:40 – 4:00 pm
Place: 3A75, Swearingen
Abstract
Over millions of years of evolutionary history, the order and content of genomes have been changed by rearrangements, duplications, and losses. There has always been a strong interest in finding out what happened and what can happen in the evolutionary process. Thanks to advances in various technologies, the information available about genomes is increasing exponentially, which makes it possible to address the problem. The problem has proven so interesting that a great number of algorithms, following different kinds of principles, have been rigorously developed over the past decades in attempts to tackle it. However, difficulties and limits in performance and capacity, as well as low consistency, largely prevent us from confidently stating that the problem is solved. To know the detailed evolutionary history, we need to infer the phylogeny of the evolutionary history (Big Phylogeny Problem) and also infer the information at the internal nodes (Small Phylogeny Problem). The work presented in this thesis focuses on assessing methods designed for attacking the Small Phylogeny Problem and on the design of algorithms and models for inferring genome evolution history from FISH data for cancer.
During recent decades, a number of evolutionary models and related algorithms have been designed to infer ancestral genome sequences or gene orders. Because the true scenario of the ancestral genomes is difficult to know, tools are needed to test the robustness of the adjacencies found by various methods. For the Big Phylogeny Problem, previous work has tested bootstrapping, jackknifing, and isolating to assess the confidence of inferred branches and found them to be good resampling tools for the corresponding phylogenetic inference methods. However, until now no systematic work has been done to tackle this problem for small phylogeny. We tested the earlier resampling schemes and a new method, inversion, on different ancestral genome reconstruction methods and showed that different resampling methods are appropriate for their corresponding methods.
Cancer is known for its heterogeneity, which develops through an evolutionary process driven by mutations in tumor cells. Rapid, simultaneous linear and branching evolution has been observed and analyzed in earlier research. Such a process can be modeled by a phylogenetic tree using different methods. Previous phylogenetic research used various kinds of datasets, such as FISH data, genome sequences, and gene orders. FISH data is quite clean because it comes from single cells, and it has been shown to be sufficient to infer the evolutionary process of cancer development. RSMT was shown to be a good model for phylogenetic analysis using FISH cell count pattern data, but it needs efficient heuristics because it is an NP-hard problem. To attack this problem, we proposed an iterative approach to approximate solutions to the Steiner tree in the small phylogeny tree. It is shown to give better results compared to the earlier method on both real and simulated data.
In this thesis, we continued the investigation into designing new methods to better approximate the evolutionary process of tumors and applied our method to other kinds of data, such as information obtained using high-throughput technology. Our thesis work can be divided into two parts. First, we designed new algorithms that give the same parsimony tree as the exact method in most situations and modified them to serve as a general phylogeny-building tool. Second, we applied our methods to different kinds of data, such as copy number variation information inferred from next-generation sequencing technology, and predicted key changes during evolution.
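For readers unfamiliar with the resampling schemes named in the abstract, here is a minimal Python sketch of generic bootstrap and jackknife resampling. The function names, the `keep_fraction` parameter, and the idea of resampling a flat list of `items` are illustrative assumptions; the abstract does not specify what unit the dissertation resamples or how support is computed afterwards.

```python
import random


def bootstrap_sample(items, rng):
    """Bootstrap: draw len(items) elements with replacement."""
    return [rng.choice(items) for _ in items]


def jackknife_sample(items, keep_fraction, rng):
    """Jackknife: keep a random subset without replacement, preserving order.

    keep_fraction is a hypothetical parameter (e.g. 0.5 keeps half the items).
    """
    keep = max(1, round(len(items) * keep_fraction))
    kept_indices = sorted(rng.sample(range(len(items)), keep))
    return [items[i] for i in kept_indices]
```

In a typical use, a reconstruction method would be rerun on many such replicates, and a feature (for example an inferred adjacency) that recurs in most replicates would be considered well supported.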

Improving Peptide Identification by Considering Ordering Amino Acid Usage

Wednesday, April 5, 2017 - 01:00 pm
Swearingen, 3D05
Thesis Defense
Author: Ahmed Al-Qari
Advisor: Dr. John Rose
Abstract
Proteomics has made major progress in recent years following the sequencing of the genomes of a substantial number of organisms. A typical method for identifying peptides uses a database of peptides identified using tandem mass spectrometry (MS/MS). The accurate mass and elution time (AMT) profile of the peptides that need to be identified is compared with this database. Restricting the search to those peptides detectable by MS reduces processing time and, more importantly, increases accuracy. In addition, there are significant impacts for clinical studies. Proteotypic peptides are those peptides in a protein sequence that are most likely to be confidently observed by current MS-based proteomics methods. There has been rapid improvement in the prediction of proteotypic peptides for AMT studies based on amino acid properties such as amino acid content, polarity, charge, and hydrophobicity using a support vector machine (SVM) classification approach. Our goal is to improve proteotypic peptide prediction. We describe the development of a classifier that considers amino acid usage and has achieved a classification sensitivity of 90% and a specificity of 81% on the Yersinia pestis proteome (using 3-AAU). Using the Ordered Amino Acid Usage (AAU) feature, we were able to identify a different set of peptides that was not identified by the 35 peptide features used by STEP (Webb-Robertson, 2010)[2]. This means that the Ordered Amino Acid Usage (AAU) feature could complement the other features used by STEP to improve identification accuracy. Building on this success, we used the 35 amino acid features of STEP (Webb-Robertson, 2010)[2] to complement the Ordered Amino Acid Usage (AAU) feature in order to enhance the overall accuracy.
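To make the quantities in the abstract concrete, the sketch below shows one plausible reading of an ordered amino acid usage feature and the standard definitions of sensitivity and specificity. Treating "3-AAU" as counts of overlapping runs of three consecutive residues is an assumption for illustration, as are the function names; the abstract does not define the feature precisely.

```python
from collections import Counter


def ordered_usage(peptide: str, k: int = 3) -> Counter:
    """Count overlapping length-k residue runs in a peptide sequence.

    Assumption: this k-mer count is only a guess at what an ordered
    amino acid usage feature might look like.
    """
    return Counter(peptide[i:i + k] for i in range(len(peptide) - k + 1))


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)
```

For instance, a classifier that recovers 90 of 100 truly proteotypic peptides and correctly rejects 81 of 100 others would have a sensitivity of 0.90 and a specificity of 0.81; these counts are made up purely to show the arithmetic.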