Current Students and Alumni
I supervise senior theses and doctoral advising for the Claremont Consortium in several areas of applied mathematics and statistics. Please feel free to contact me with any questions.
Undergraduate Students
- Jonathon Briskman, Claremont McKenna College, Randomized Linear Algebra
- Evan Casey, Claremont McKenna College, Distributed recommender systems
- Nathan Falk, Claremont McKenna College, Probability and Economics
- Aparna Sarkar, Pomona College, Compressed Sensing and Human Vision
- Zachary Siegel, Pomona College, Dictionary Learning (co-advised)
Graduate Students
- Casey Johnson, Claremont Graduate University, Topics in Signal Processing
- Anna Ma, Claremont Graduate University, Topic modeling & compressed sensing
- Tina Woolf, Claremont Graduate University, Topics in Signal Processing
- Ran Zhao, Claremont Graduate University, Randomized Linear Algebra
Postdoc
- Guangliang Chen, Claremont McKenna College, postdoctoral researcher
Alumni
- Nathan Lenssen, Claremont McKenna College (2013), Audio Signal Processing and Forecasting [Best Thesis Award 2013]
- Morgan Mayer-Jochimsen, Scripps College (2013), Public Healthcare Evaluation Techniques
- Jing Wen, Pomona College (2013), Spectral Clustering Methods in Finance
Sample thesis topics
- Dimension Reduction -- prerequisites: Math 60, Math 151 or 052 recommended
Is a fast MRI possible? Data in the modern world like the measurements that create an MRI image is so voluminous that storing, viewing, and analyzing it is often extremely difficult both physically and computationally. Dimension reduction addresses this issue by projecting high dimensional data onto a low dimensional space in such a way that the geometry of the data is preserved. This technology can be used in digital photography, medical imaging, computational biology and many more applications. In this project, we will harness the power of randomized dimension reduction to accelerate methods in applications which require high computational costs.
- Randomized Numerical Linear Algebra -- prerequisites: Math 60, Math 151
Overdetermined systems of linear equations appear in many applications ranging from computer tomography (CT) to quantization. Often, the systems are so large that they cannot even be loaded into memory. Iterative methods have been developed to solve such systems by accessing only row of the matrix at a time, making them efficient in practice. The deterministic versions of such algorithms suffer from slow convergence in some cases. However, when the methods are randomized, convergence is provably exponential in expectation. Although it is well known that randomization is extremely useful, it remains an open question to identify which choices of random distributions yield the best convergence. In this project, students would explore the type of random distributions that lead to the best performance. This can be studied both experimentally and analytically.
- Clustering -- prerequisites: Math 60
Clustering methods address the problem of identifying similar observations within a large set of data. For example, from a large collection of medical scans and patient information, can one automatically detect a group of sick patients from the group of healthy? From an employee database, can one identify employess who are likely to leave the company from those who will stay? These are just several examples of the many applications of clustering. Spectral clustering methods can be used to solve such problems using linear algebraic techniques. Students interested in this project can test and analyze spectral clustering methods in a particular application of interest, and/or analyze and improve methods which speed up the clustering process. The latter is an especially important issue for large datasets since traditional spectral clustering methods run in cubic time.
- Principal Component Analysis -- prerequisites: Math 60, programming experience recommended
Principal Component Analysis (PCA) is one of the most widely used statistical tools for data analysis with applications in data compression, image processing, and bioinformatics. As the name suggests, given a large number of variables and observations, PCA detects the variables which explain most of the data, called principal components. Mathematically, given any data matrix, classical PCA finds the best low-rank approximation to that matrix by solving an optimization problem. This problem can be solved using the singular value decomposition (SVD), and still performs well when the matrix is corrupted by small Gaussian errors. However, PCA is extremely sensitive to errors of large magnitude, even when the number of errors in the data is small. This poses a serious problem since in most recent applications large errors commonly occur due to malicious users of a system, failure of equipment, and natural artifacts. Interested students should be comfortable with linear algebra. Students who participate in this project would learn various topics in computational analysis, probability, and statistics, with the goal of using and improving upon new robust PCA methods.