Shannon Quinn (4th-year graduate student in the joint Carnegie Mellon-University of Pittsburgh Ph.D. program in computational biology) will give a talk titled "Distributed Spectral Graph Methods for Analyzing Large-Scale Unstructured Biomedical Data" on Monday, March 17, 2014, from 3:30 p.m. to 4:30 p.m. in Room 328, Boyd GSRC. Refreshments will be served at 3:00 p.m. in Room 409, Boyd GSRC.
The explosive growth of data in biological research, and countless other fields of study, has fueled investigation into distributed pattern recognition techniques. Some of these techniques involve inference over undirected graphs that are too large for analysis on a single machine (millions to billions of vertices). Graphs are a convenient way of organizing this data, and the eigenvalues and eigenvectors of a graph's matrix representations (such as the adjacency matrix or the Laplacian) can yield important insights into the evolution of the network and the diffusion of information across it. First, we examine smaller networks built from motion parameters derived by recognizing and quantifying the motion depicted in digital videos of ciliary biopsies. Cilia are microscopic hairlike structures that beat in synchronized patterns to move particulates and nutrients in the throat, lungs, kidneys, and brain. Automated recognition of the specific type of ciliary motion is an important problem in clinical diagnostics. Second, we investigate how we could perform this analysis on much larger and more diverse networks through the use of distributed graph algorithms, using the open source Apache Mahout library as a particular example. Finally, we look at these analytics in practice using the Oak Ridge Biosurveillance Toolkit (ORBiT) to monitor a multitude of web services, such as Facebook, Twitter, and Instagram, in order to observe the dynamics of public health in real time, identify health-related threats as they emerge, and discover novel spatio-temporal correlations in the evolution and spread of disease.
Shannon is a 4th-year graduate student in the joint Carnegie Mellon-University of Pittsburgh Ph.D. program in computational biology. His research interests include machine learning and computer vision at scale, with emphasis on computational bioimaging for diagnostics and prognostics. He is active in the open source community as a committer for the Apache Mahout project, and has incorporated this work into his research through the development of scalable clustering algorithms. His ongoing collaboration with Oak Ridge National Laboratory in the creation of a biosurveillance framework forms the core of his thesis and future work.
***Shannon Quinn is a Computer Science/Cellular Biology faculty candidate.