Friday, March 4, 2022 - 02:20 pm

Abstract

Data silos are a major problem in biomedical research. Our data commons and ecosystems framework helps break down data silos and empower translational research. To do this, data must be harmonized through standardized workflows, references, and annotations, and/or mapped to standardized ontologies. I helped lead the design and implementation of a petabyte-scale “omics” data harmonization system used in the National Cancer Institute’s Genomic Data Commons. I present some background on data commons, describe our automation system, and highlight other work my colleagues and I are doing in the Center for Translational Data Science at the University of Chicago. Our work showcases the importance of interdisciplinary collaboration and exciting opportunities at the intersection of computer science, data science, and biomedical research.

 

Bio

Kyle Hernandez, Ph.D. is a Research Associate Professor of Medicine in the Section of Biomedical Data Science, a Co-PI of the VA Data Commons, and a Manager of Bioinformatics in the Center for Translational Data Science at the University of Chicago. He was a key contributor to the design and development of the large-scale workflow automation system used in the National Cancer Institute’s Genomic Data Commons (GDC), as well as to the development of many of the workflows it runs. His research interests include workflow engines, the genetic architecture of complex phenotypes, the integration of multiple data types, reproducibility, and, more recently, the complex issues of EHR data curation and extraction. He earned his Ph.D. in Ecology, Evolution, and Population Biology at Purdue University and was an NSF Postdoctoral Fellow at the University of Texas. He joined the University of Chicago in 2013 and has been affiliated with the Center since 2016.

 

Location:

In person: Swearingen Engineering Center, Room 2A31

Virtual: MS Teams