COLLOQUIUM Department of Computer Science and Engineering University of South Carolina How to Use Kolmogorov Complexity to Perform (Almost) Content-free Phylogenetics John Rogers Department of Computer Science DePaul University Date: September 13, 2007 Time: 1530-1630 Place: Swearingen 3A75 (Joint CSE/EE Conference Room) Abstract This talk will include a presentation of the results of a paper by M.Li et al. [M. Li, X. Chen, X. Li, B. Ma, P.M.B. Vitanyi, “The Similarity Metric,” IEEE Trans. Inform. Th., 50:12(2004), 3250-3264], where Kolmogorov complexity is used to define the relative distance between two finite strings. Because so much information is represented by strings, Li et al.’s similarity metric finds applications in phylogenetics, linguistic taxonomy, studies of chain letters, and detecting plagiarism in students' C++ programs. Some background on Kolmogorov complexity will be provided along the way, followed by an explanation of how the theory of the similarity metric can be put into practice. Because determining the Kolmogorov complexity of a string is equivalent to solving the halting problem, certain concessions must be made to compute an approximation of the metric but, as we shall see, these concessions still yield a powerful and largely accurate technique that, among other applications, groups species based on their mitochondrial DNA and places natural languages into families based on documents from those languages. John Rogers is an Associate Professor in the Computer Science Department at DePaul University. His research interests include structural complexity, quantum computation, and, more recently, bioinformatics. Dr. Rogers obtained the Ph.D. degree in Computer Science at the University of Chicago with a dissertation entitled “Isomorphisms, separability, and one-way functions” in June 1995.