Quarterly Theory Workshop: Computational Challenges in Machine Learning
About the Series
- Location: ITW Lecture Hall (1st floor), Ford Motor Company Engineering Design Center (map), Northwestern U, 2133 Sheridan Rd, Evanston, IL 60208.
- Transit: Noyes St. Purple Line (map).
- Parking: Validation for North Campus Parking Garage (map) available at workshop.
- Registration: none necessary.
Talk Schedule on Wednesday Feb 22nd.
8:45 am — 9:15 am: Breakfast
9:15 am – 10:00am: Sanjoy Dasgupta: Embedding and projection
approaches to fast nearest neighbor search.
10:05 am — 10:50am : Moritz Hardt: Gradient Descent Learns Linear
10:50 am – 11:15 am : Break
11:15 am — Noon: Santosh Vempala: Robust Estimation of Mean and Covariance.
12:05pm – 12:50pm : Nathan Srebro: Non-Convex Local Search: Beyond
12:50pm – 2:00 pm : Lunch
Titles and Abstracts
Title: Embedding and projection approaches to fast nearest neighbor search.
Abstract: Nearest neighbor (NN) search is one of the simplest and most enduring methods of statistical estimation. Its efficacy depends upon (1) finding nearest neighbors rapidly and (2) using a distance function that is well-suited to the domain. We present three algorithmic results that address these issues.1. Randomized data structures for exact NN search in Euclidean space
Two of the most popular approaches to fast NN search are locality-sensitive hashing and randomized forests. They are both effective in practice, but only the former has been analyzed carefully. We focus on randomized forests, which subsume hashing schemes but are applied in a somewhat different, data-dependent manner. We show that the probability that they fail to find the nearest neighbor in Euclidean space, for any data set and any query point, is directly related to a simple potential function that captures the difficulty of the point configuration. We then bound this potential function in several situations of interest, for instance when the data lie in a space of bounded doubling dimension.
2. Embeddings into Hilbert space that monotonically transform distances
Most work on NN search has focused on Euclidean distance, while other important distance functions, like L1 distance or KL divergence, have received much less attention. We observe that although many of these distances cannot be embedded into Euclidean space, monotonic transformations of them can. For instance, L1 is not embeddable in L2,
but the square root of L1 distance is — and this is good enough for NN purposes. We show how this can be exploited algorithmically.
3. A randomized tree structure for metric spaces
Finally, we introduce the “two-vantage” tree, a randomized tree data structure for metric spaces. Forests of such trees are seen to be extremely effective for exact nearest neighbor search.
This is joint work with Kaushik Sinha, Xinan Wang, and Zhen Zhai.
Bio: Sanjoy Dasgupta is a Professor in the Department of Computer Science and Engineering at UC San Diego. He received his PhD from Berkeley in 2000, and spent two years at AT&T Research Labs before joining UCSD. His area of research is algorithmic statistics, with a focus on unsupervised and minimally supervised learning. He is the author of a textbook, “Algorithms” (with Christos Papadimitriou and Umesh Vazirani), that appeared in 2006.
Speaker: Moritz Hardt (Google)
Title: Gradient Descent Learns Linear Dynamical Systems
optimizer of the maximum likelihood objective of an unknown linear
time-invariant dynamical system from a sequence of noisy observations
generated by the system. Even though objective function is non-convex,
we provide polynomial running time and sample complexity bounds under
strong but natural assumptions. Linear systems identification has been
studied for many decades, yet, to the best of our knowledge, these are
the first polynomial guarantees for the problem we consider.Joint work with Tengyu Ma and Ben Recht.Short bio: Moritz Hardt is a senior research scientist at Google Brain. After obtaining
a PhD in computer science from Princeton University in 2011, he worked
at IBM Research Almaden on algorithmic principles of machine learning.
Then he moved to Google to join the Foundations of Applied Machine
Learning group where his mission is to build guiding theory and scalable
algorithms that make the practice of machine learning more reliable,
transparent, and effective.Speaker: Santosh Vempala (Georgia Tech)
Title: Robust Estimation of Mean and Covariance
Abstract: We consider the fundamental statistical problem of robustly estimating the mean and covariance of a distribution from i.i.d. samples in n-dimensional space, i.e. in the presence of arbitrary (malicious) noise, with no assumption on the noise. Classical solutions (Tukey point, geometric median) are either intractable to compute efficiently or produce estimates whose error scales as a power of the dimension. In this talk, we present efficient and practical algorithms to compute estimates for the mean, covariance and operator norm of the covariance matrix, for a wide class of distributions. We show that the estimate of the algorithm is higher than the information-theoretic lower bound by a factor of at most the square-root of the logarithm of the dimension. This gives polynomial-time solutions to some basic questions studied in robust statistics. One of the applications is agnostic SVD.
Joint work with Kevin Lai and Anup B. Rao
Speaker: Nathan Srebro (TTI Chicago)
Title: Non-Convex Local Search: Beyond Global Optimality