# CMC Math/CS Senior Thesis Topics

The CMC math and CS faculty represent a wide range of research areas, including algebraic topology and knot theory, functional, harmonic, and complex analysis, probability and statistics, numerical analysis, PDEs, compressed sensing, mathematical finance, number theory, discrete geometry, programming languages, and database systems. We supervise senior theses in a wide range of subjects. Below are descriptions of some sample senior thesis topics.

## Sample project descriptions

• Lenny Fukshansky  (number theory, discrete geometry)

Frobenius problem and discrete geometry
- prerequisites: MATH 131 and 171

Suppose you have an infinite supply of coins of N distinct relatively prime positive denominations a_1,..., a_N. It is not hard to see that there exists a maximal amount of change that cannot be given with these coins, and every amount larger than that can be given. This maximal amount is called the Frobenius number of the N-tuple (a_1,...,a_N) and is denoted by g(a_1,...,a_N). The famous Frobenius problem asks to determine the Frobenius number, given N and the N-tuple a_1,...,a_N. A question of this sort first appeared in the lectures of the legendary 19-th century mathematician Georg Frobenius, which is why this problem was named after him. This beautiful combinatorial problem has a surprisingly large number of connections and applications to different branches of mathematics. It turns out that solving the Frobenius problem is very hard from the computational complexity stand-point; in particular, one cannot expect a general formula for the Frobenius number to exist. A large number of mathematicians have investigated the Frobenius problem from algorithmic, number theoretic, geometric, and probabilistic perspectives, and the literature on the subject is vast. In this project, we will concetrate on a fascinating geometric interpretation of the Frobenius number in terms of the covering radius, an important notion coming from discrete geometry - the area of mathematics which studies properties of discrete sets of objects in the space. We will see how this geometric view point leads to a deep understanding of the combinatorial structure.

Well-rounded lattices and optimization problems - prerequisites: MATH 131 and 171

A lattice in the Euclidean space is a discrete co-compact subgroup: it can be viewed as a discrete analogue of a vector space. Lattices are very important objects in number theory, and are also heavily used for applications in discrete optimization and digital communication problems. A particularly important class of lattices, possessing some crucial symmetry properties are the well-rounded lattices. Investigation of the geometric properties and distribution of well-rounded lattices is an important problem, related to a variety of classical discrete optimization topics as well as applied questions, such as performance optimization for networks of radio transmitters. In this project, we will study some basics of the theory of lattices with a view towards well-rounded lattices, their properties, distribution, and connections to optimization problems.
• Deanna Needell (compressed sensing, randomized algorithms, functional analysis, computational mathematics, probability, statistics)

Dimension Reduction -- prerequisites: Math 60, Math 151 or 052 recommended

Is a fast MRI possible? Data in the modern world like the measurements that create an MRI image is so voluminous that storing, viewing, and analyzing it is often extremely difficult both physically and computationally. Dimension reduction addresses this issue by projecting high dimensional data onto a low dimensional space in such a a way that the geometry of the data is preserved. This technology can be used in digital photography, medical imaging, computational biology and many more applications. In this project, we will harness the power of randomized dimension reduction to accelerate methods in applications which require high computational costs.

Randomized Numerical Linear Algebra -- prerequisites: Math 60, Math 151

Overdetermined systems of linear equations appear in many applications ranging from computer tomography (CT) to quantization.  Often, the systems are so large that they cannot even be loaded into memory.  Iterative methods have been developed to solve such systems by accessing only row of the matrix at a time, making them efficient in practice.  The deterministic versions of such algorithms suffer from slow convergence in some cases.  However, when the methods are randomized, convergence is provably exponential in expectation.  Although it is well known that randomization is extremely useful, it remains an open question to identify which choices of random distributions yield the best convergence.  In this project, students would explore the type of random distributions that lead to the best performance.  This can be studied both experimentally and analytically.

Clustering -- prerequisites: Math 60

Clustering methods address the problem of identifying similar observations within a large set of data.  For example, from a large collection of medical scans and patient information, can one automatically detect a group of sick patients from the group of healthy?  From an employee database, can one identify employess who are likely to leave the company from those who will stay?  These are just several examples of the many applications of clustering.  Spectral clustering methods can be used to solve such problems using linear algebraic techniques.  Students interested in this project can test and analyze spectral clustering methods in a particular application of interest, and/or analyze and improve methods which speed up the clustering process.  The latter is an especially important issue for large datasets since traditional spectral clustering methods run in cubic time.

Principal Component Analysis -- prerequisites: Math 60, programming experience recommended

Principal Component Analysis (PCA) is one of the most widely used statistical tools for data analysis with applications in data compression, image processing, and bioinformatics. As the name suggests, given a large number of variables and observations, PCA detects the variables which explain most of the data, called principal components. Mathematically, given any data matrix, classical PCA finds the best low-rank approximation to that matrix by solving an optimization problem. This problem can be solved using the singular value decomposition (SVD), and still performs well when the matrix is corrupted by small Gaussian errors. However, PCA is extremely sensitive to errors of large magnitude, even when the number of errors in the data is small. This poses a serious problem since in most recent applications large errors commonly occur due to malicious users of a system, failure of equipment, and natural artifacts. Interested students should be comfortable with linear algebra. Students who participate in this project would learn various topics in computational analysis, probability, and statistics, with the goal of using and improving upon new robust PCA methods.