Junior Theorists Workshop 2025

Synopsis:

The Chicago Junior Theorists Workshop 2025 is being held jointly by Northwestern University and Toyota Technological Institute at Chicago on December 8th (Monday) and December 9th, 2025 (Tuesday). The workshop will go from 9am on Monday to late afternoon on Tuesday.

The focus of this workshop will be on junior researchers in all areas of theoretical computer science. The workshop is supported by ACM SIGACT, IDEAL, NITMB and SkAI institutes, in addition to TTIC and Northwestern. You can register for the workshop here: https://forms.gle/i2XZxSMBRZMATkes7 by December 4th (Thursday), 2025.  

Logistics

  • Date: Monday, December 8th (TTIC day) & Tuesday, December 9th (Northwestern day @ SkAI/NITMB, downtown Chicago)
  • Organizers: Pooja Kulkarni, Aniket Murhekar, Jeff Xu
  • Contact Email: chicagojtwteam@gmail.com, or contact one of the organizers above.
  • Location for Day 1 (Monday): The TTIC day will be hosted at TTIC, on the 5th floor of 6045 S Kenwood Ave, Chicago, IL 60637.
  • Location for Day 2 (Tuesday): The Northwestern day will be hosted at the SkAI Institute and NITMB, both located on the 35th floor of the Hancock Center, 875 N. Michigan Avenue. Please enter on the Chestnut Street side of the building, 172 E. Chestnut Street.

Registration link: https://forms.gle/i2XZxSMBRZMATkes7. Please register by December 4th (Thursday), 2025, for access to the Hancock building on Tuesday (Dec 9th).

Speaker list:

  • John Bostanci (Columbia)
  • Giannis Fikioris (Cornell)
  • Peyman Jabbarzade (U. Maryland)
  • Rhea Jain (UIUC)
  • Esty Kelman (BU/MIT)
  • Yael Kirkpatrick (MIT)
  • Yaowei Long (U Michigan)
  • Cassandra Marcussen (Harvard)
  • Shivam Nadimpalli (MIT)
  • Pranav Nuti (U Chicago)
  • Kalen Patton (Georgia Tech)
  • Prasanna Ramakrishnan (Stanford)
  • Janani Sundaresan (U Waterloo)
  • Zihan Zhang (Ohio State University)
  • Kai Zhe Zheng (MIT)
  • Idan Attias (UIC/TTIC)
  • He Jia (Northwestern)
  • Liren Shan (TTIC)
  • Dravyansh Sharma (TTIC)
  • Vaidehi Srinivas (Northwestern)
  • Chenhao Zhang (Northwestern)

Tentative Schedule (Central Time)

Monday, Dec 8 at TTIC

9:00-9:20 – Breakfast 

9:20-9:30 – Welcome Remarks 

9:30-10:00 – Giannis Fikioris (Cornell) on Learning in Budgeted Auctions with Spacing Objectives

10:00-10:30 – Prasanna Ramakrishnan (Stanford) on How to Appease a Voter Majority

10:30-11:00 – Coffee Break 

11:00-11:30 – Kalen Patton (Georgia Tech) on Online Allocation with Concave, Diminishing-Returns Objectives

11:30-12:00 – Pranav Nuti (U Chicago) on Prophet Inequalities with Cancellation Costs

12:00–1:00 – Lunch

1:00-2:30 – Local Session (TTIC)

  • Liren Shan on Dynamic Algorithm for Explainable k-medians Clustering under Lp Norm
  • Dravy Sharma on On Learning Verifiers for Chain-of-Thought Reasoning
  • Idan Attias on Positive Distribution Shift as a Framework for Understanding Tractable Learning
  • Mahdi Haghifam on TBA

2:30-3:00 – Coffee Break 

3:00-3:30 – Cassandra Marcussen (Harvard) on Quality Control on Random Graphs in Sublinear Time

3:30-4:00 – Shivam Nadimpalli (MIT) on Super-resolution without assumptions

4:00-4:30 – Coffee Break 

4:30-5:00 – Yael Kirkpatrick (MIT) on Harnessing Matrix Multiplication for Additive APSP Approximation

5:00-5:30 – Janani Sundaresan (U Waterloo) on Distributed Triangle Detection is Hard in Few Rounds

 

Tuesday, Dec 9 at Northwestern (NITMB/SKAI)

9:00-9:20 – Breakfast 

9:20-9:30 – Welcome Remarks

9:30-10:00 – John Bostanci (Columbia) on Separating QMA from QCMA with a classical oracle

10:00-10:30 – Esty Kelman (BU/MIT) on Algebraic property testing with online adversaries

10:30-11:00 – Coffee Break 

11:00-11:30 – Peyman Jabbarzade (U Maryland) on Better-Than-2 Approximation for Steiner Forest

11:30-noon – Rhea Jain (UIUC) on A Polylogarithmic Approximation for Directed Steiner Forest in Planar Digraphs

Noon-1 pm – Lunch 

1:00pm-2:00pm – Local Session (Northwestern)

  • Vaidehi Srinivas on Online Conformal Prediction with Efficiency Guarantees
  • Chenhao Zhang on Regulation of Algorithmic Collusion from Data
  • He Jia on Moment Methods and Low-Degree Polynomials Fail to Predict Robust Subspace Recovery

2:00pm – 2:30 – Coffee Break 

2:30-3:00 – Kai Zheng (MIT) on Probabilistically Checkable Proofs and their Applications: From Theory to Practice

3:00-3:30 – Zihan Zhang (Ohio State) on Explicit Folded Reed–Solomon and Multiplicity Codes Achieve Relaxed Generalized Singleton Bounds

3:30 – 4:00pm – Yaowei Long (U Michigan) on Fault-Tolerant Distance Oracles and Labeling Schemes with Constant Approximation

Titles and Abstracts

Speaker:   Giannis Fikioris

Title: Learning in Budgeted Auctions with Spacing Objectives

Abstract: In many repeated auction settings, participants care not only about how frequently they win but also about how their winnings are distributed over time.

This problem arises in various practical domains where avoiding congested demand is crucial and spacing is important, such as advertising campaigns.

We initiate the study of repeated auctions with preferences for spacing of wins over time.

We introduce a simple model of this phenomenon, where the value of a win is given by any concave function of the time since the last win.

We study the optimal policies for this setting in repeated second-price auctions and offer learning algorithms for the bidders that achieve $\tilde O(\sqrt{T})$ regret.

 

Speaker:   Cassandra Marcussen

Title: Quality Control on Random Graphs in Sublinear Time

Abstract: Many algorithms are designed to perform well on random instances. However, when running such an algorithm on a specific input, can we trust its output? We define a new class of problems whose formulation allows algorithms to benefit from the randomness of instances while remaining reliable to use on any given input. We call these “quality control” problems, specified by a distribution D and a real-valued quality function Q. We say that an input x is “high quality” if Q(x) is approximately 1, and assume that samples from D are high quality with high probability. The task is to accept x if it is drawn from D and reject x if Q(x) is far from 1.

 

We study this problem in the sublinear setting, where inputs are graphs and the algorithm has access to adjacency matrix queries. Focusing on the case where D is a (sparse) Erdős–Rényi random graph model and Q is proportional to the number of k-cliques in a graph, we show that quality control can be solved superpolynomially faster than the related task of approximating Q in worst-case graphs in the sublinear access model. 

 

Joint work with Ronitt Rubinfeld (MIT) and Madhu Sudan (Harvard).
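To make the notion of "quality" concrete, here is a hypothetical toy illustration (not the paper's sublinear algorithm, which avoids reading the whole graph): sample an Erdős–Rényi graph, count triangles (k = 3) by brute force, and normalize by the expected count under G(n, p), so that samples from the model have quality close to 1.

```python
import itertools
import random

def er_graph(n, p, rng):
    # Sample an Erdos-Renyi G(n, p): include each pair {i, j} independently w.p. p.
    return {(i, j) for i, j in itertools.combinations(range(n), 2) if rng.random() < p}

def triangle_quality(edges, n, p):
    # Q = (# triangles) / E[# triangles], where E = C(n, 3) * p^3 under G(n, p).
    tri = sum(1 for a, b, c in itertools.combinations(range(n), 3)
              if (a, b) in edges and (b, c) in edges and (a, c) in edges)
    expected = (n * (n - 1) * (n - 2) / 6) * p ** 3
    return tri / expected

rng = random.Random(0)
q = triangle_quality(er_graph(60, 0.3, rng), 60, 0.3)  # close to 1 for a model sample
```

A graph drawn from G(n, p) concentrates around Q = 1, while, say, a triangle-free graph on the same vertex set would have Q = 0 and be rejected.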

 

Speaker:  Peyman Jabbarzade

Title: Better-Than-2 Approximation for Steiner Forest

Abstract: The Steiner Forest problem, also known as the Generalized Steiner Tree problem, is a fundamental optimization problem on edge-weighted graphs where, given a set of vertex pairs, the goal is to select a minimum-cost subgraph such that each pair is connected. 

This problem generalizes the Steiner Tree problem, first introduced in 1811, for which the best known approximation factor is 1.39 by [Byrka, Grandoni, Rothvoß, and Sanità, 2010].

The celebrated work of [Agrawal, Klein, and Ravi, 1989], along with refinements by [Goemans and Williamson, 1992], established a 2-approximation for Steiner Forest over 35 years ago. Despite the long-standing importance of this problem, breaking the approximation factor of 2 has remained a major challenge.

In this talk, I present a novel deterministic algorithm that breaks the approximation barrier of 2 for this fundamental problem. As a key component of our approach, we also introduce a novel dual-based local search algorithm for the Steiner Tree problem with an approximation guarantee of 1.943, which is of independent interest.

 

Speaker:  Yael Kirkpatrick

Title: Harnessing Matrix Multiplication for Additive APSP Approximation

Abstract: All-Pairs Shortest Paths (APSP) is a foundational problem in theoretical computer science. Approximating APSP in undirected unweighted graphs has been studied for many years, beginning with the work of Dor, Halperin and Zwick [SICOMP'01]. The combinatorial additive approximation algorithms in that paper remained the state of the art for two decades, and it was unclear whether algebraic tools from the world of matrix multiplication had the potential to speed up this problem.

In recent years, however, we have been able to harness these algebraic tools and give increasingly faster algorithms for computing additive approximation to APSP. In this talk I will present two techniques for ordering or partitioning the graph that allow for the use of algebraic tools and produce faster algorithms for approximating APSP.

 

Speaker:  Rhea Jain

Title: A Polylogarithmic Approximation for Directed Steiner Forest in Planar Digraphs

Abstract: Steiner Forest is a fundamental problem in network design: given an edge-weighted graph and source-sink vertex pairs, the goal is to find a subgraph of minimum cost that contains a path from each source vertex to its corresponding sink. Steiner Forest and its rooted counterpart, Steiner Tree, admit small constant-factor approximation ratios in undirected graphs. On the other hand, the corresponding problems on directed graphs have strong lower bounds on their approximability. In fact, no polylogarithmic approximation algorithm is known for Directed Steiner Tree, and unless P = NP, a polylogarithmic approximation is not possible for Directed Steiner Forest.

Recent results demonstrate that these problems are significantly more tractable when the input directed graph is planar! Friggstad and Mousavi [ICALP '23] recently obtained a simple and elegant polynomial-time O(log k)-approximation for Directed Steiner Tree in planar digraphs, where k is the number of demand pairs. Inspired by this work, we show that Directed Steiner Forest admits a polylogarithmic approximation in planar digraphs. In this talk, I will give an overview of this algorithm and the key structural properties of planarity that are leveraged in these works. I will also highlight some related work on other network design problems in planar digraphs, including some recent extensions to more complex settings.

 

Speaker:  Prasanna Ramakrishnan

Title: How to Appease a Voter Majority

Abstract: In 1785, Condorcet established a frustrating property of elections and majority rule: it is possible that, no matter which candidate you pick as the winner, a majority of voters will prefer someone else. You might have the brilliant idea of picking a small set of winners instead of just one, but how do you avoid the nightmare scenario where a majority of the voters prefer some other candidate over all the ones you picked? How many candidates suffice to appease a majority of the voters? In this talk, we will explore this question. Along the way, we will roll some dice — both because the analysis involves randomness and because of a connection to the curious phenomenon of intransitive dice, which has delighted recreational and professional mathematicians alike ever since Martin Gardner popularized it in 1970.

 

Based on joint work with Moses Charikar, Alexandra Lassota, Adrian Vetta, and Kangning Wang.
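As a small aside, the intransitivity phenomenon mentioned in the abstract is easy to verify directly. The snippet below checks the classic Efron dice (a standard textbook example, not taken from the talk), where each die beats the next with probability exactly 2/3 in a cycle, so no die is "best."

```python
from fractions import Fraction
from itertools import product

def p_beats(x, y):
    # Exact probability that a roll of die x strictly exceeds a roll of die y.
    wins = sum(1 for a, b in product(x, y) if a > b)
    return Fraction(wins, len(x) * len(y))

# Efron's four intransitive dice (face values).
A = (0, 0, 4, 4, 4, 4)
B = (3, 3, 3, 3, 3, 3)
C = (2, 2, 2, 2, 6, 6)
D = (1, 1, 1, 5, 5, 5)

# A beats B beats C beats D beats A, each with probability 2/3.
cycle = [p_beats(A, B), p_beats(B, C), p_beats(C, D), p_beats(D, A)]
```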

 

Speaker: Yaowei Long

Title: Fault-Tolerant Distance Oracles and Labeling Schemes with Constant Approximation

Abstract: A fault-tolerant distance labeling scheme assigns a label to each vertex and edge of an undirected weighted graph $G$ with $n$ vertices so that, for any edge set $F$ of size $|F| \leq f$, one can approximate the distance between $p$ and $q$ in $G \setminus F$ by reading only the labels of $F \cup \{p,q\}$. For any $k$, we present a deterministic polynomial-time scheme with $O(k^{4})$ approximation and $\tilde{O}(f^{4}n^{1/k})$ label size. This is the first scheme to achieve a constant approximation while handling any number of edge faults $f$, resolving the open problem posed by Dory and Parter [DP21]. All previous schemes provided only a linear-in-$f$ approximation [DP21, LPS25].

Our labeling scheme directly improves the state of the art in the simpler setting of distance sensitivity oracles. Even for just $f = \Theta(\log n)$ faults, all previous oracles either have super-linear query time, linear-in-$f$ approximation [CLPR12], or an exponentially worse $2^{\mathrm{poly}(k)}$ approximation dependency on $k$ [HLS24].

 

Speaker:  John Bostanci

Title: Separating QMA from QCMA with a classical oracle

Abstract: We construct a classical oracle proving that, in a relativized setting, the set of languages decidable by an efficient quantum verifier with a quantum witness (QMA) is strictly bigger than those decidable with access only to a classical witness (QCMA). The separating classical oracle we construct is for a decision problem we coin spectral Forrelation — the oracle describes two subsets of the boolean hypercube, and the computational task is to decide if there exists a quantum state whose standard basis measurement distribution is well supported on one subset while its Fourier basis measurement distribution is well supported on the other subset. This is equivalent to estimating the spectral norm of a “Forrelation” matrix between two sets that are accessible through membership queries.

Our lower bound derives from a simple observation that a query algorithm with a classical witness can be run multiple times to generate many samples from a distribution, while a quantum witness is a “use once” object. This observation allows us to reduce proving a QCMA lower bound to proving a sampling hardness result which does not simultaneously prove a QMA lower bound. To prove said sampling hardness result for QCMA, we observe that quantum access to the oracle can be compressed by expressing the problem in terms of bosons — a novel “second quantization” perspective on compressed oracle techniques, which may be of independent interest. Using this compressed perspective on the sampling problem, we prove the sampling hardness result, completing the proof.

Joint work with Jonas Haferkamp, Chinmay Nirkhe, and Mark Zhandry.

 

Speaker: Kalen Patton

Title: Online Allocation with Concave, Diminishing-Returns Objectives

Abstract: Online resource allocation problems are central challenges in economics and computer science, modeling situations in which n items arriving one at a time must each be immediately allocated among m agents. In such problems, our objective is to maximize a monotone reward function f(x) over the allocation vector x, which describes the amount of each item given to each agent. In settings where f is concave and DR-submodular (i.e. has monotone decreasing gradient), several lines of work over the past two decades have had great success designing constant-competitive algorithms, including the foundational work of Mehta et al. (2005) on the Adwords problem and many follow-ups. Notably, while a greedy algorithm is 1/2-competitive in such settings, these works have shown that one can often obtain a competitive ratio of (1-1/e) in a variety of settings when items are divisible (i.e. allowing fractional allocations). However, prior works have thus far used a variety of problem-specific techniques, leaving open the general question: Does a (1-1/e)-competitive fractional algorithm always exist for online resource allocation problems with concave, DR-submodular objectives?

 

In this work, we answer this question affirmatively, thereby unifying and generalizing prior results for special cases. Our algorithm is one which makes continuous greedy allocations with respect to an auxiliary objective U(x). Using the online primal-dual method, we show that if U satisfies a “balanced” property with respect to f, then one can bound the competitiveness of such an algorithm. Our crucial observation is that there is a simple expression for U which has this balanced property for any f, yielding our general (1-1/e)-competitive algorithm.

 

Speaker: Kai Zheng

Title: Probabilistically Checkable Proofs and their Applications: From Theory to Practice

Abstract: Probabilistically Checkable Proofs (PCPs) are a remarkable kind of proof that can be verified by inspecting only a constant number of randomly chosen locations—sometimes as few as two! Since their discovery in the 1990s, PCPs have become central to theoretical computer science, with far-reaching applications in both theory and practice. In this talk, I will give an overview of some of my previous results on PCPs and highlight their relevance to these applications, namely in the areas of hardness of approximation and cryptographic SNARKs.

 

Speaker:  Zihan Zhang

Title: Explicit Folded Reed–Solomon and Multiplicity Codes Achieve Relaxed Generalized Singleton Bounds

Abstract: List-decodable codes play a crucial role in both theoretical and practical applications. By loosening the constraint of unique decoding — requiring only that the original codeword be among a small set of candidates — they can withstand nearly twice as many errors. This enhanced error resilience is fundamental to applications such as group testing and compressed sensing. Beyond error correction, their well-structured mathematical properties enable broader applications in fields like pseudorandomness, computational complexity, and cryptography.

A central and fundamental question in coding theory is what kinds of codes are optimally list-decodable. Previously, all known constructions either relied on randomness or required significantly larger list sizes. In this talk, I will present recent progress demonstrating the first explicit codes that are list-decodable to capacity with the optimal list size. More concretely, I will show that any “appropriate” folded Reed-Solomon codes and multiplicity codes are almost optimally list-decodable, fully resolving an open problem of Guruswami–Rudra (STOC 2006). I will discuss the high-level ideas behind these constructions and also mention many directions for future research. These results are based on joint work with Yeyuan Chen.

 

Speaker: Shivam Nadimpalli

Title: Super-resolution without assumptions

Abstract: The problem of super-resolution, roughly speaking, is to reconstruct an unknown signal to high accuracy, given (potentially noisy) information about its low-degree Fourier coefficients. Classical results give positive answers, but only under strong modeling assumptions—typically requiring that the signal is a sum of well-separated point sources.

In this work we analyze a very general version of the super-resolution problem by considering *arbitrary* high-dimensional signals over the torus. We impose no separation conditions between point sources, and do not assume that the signal is even a finite combination of point sources. This raises a natural question: what can be said about super-resolution in such a general setting?

We give two sets of results for two complementary notions of reconstruction: (i) reconstruction w.r.t. Wasserstein distance, and (ii) reconstruction with respect to a new notion of “heavy hitter” distance. In both settings, we obtain essentially matching upper and lower bounds on the number of Fourier coefficients required for accurate reconstruction. A key technical ingredient in our analysis is the use of extremal polynomial constructions from theoretical computer science.

Speaker: Janani Sundaresan

Title: Distributed Triangle Detection is Hard in Few Rounds

Abstract: In the CONGEST model, $n$ vertices with information only about their neighbourhoods are allowed to communicate to each other over the edges of the input graph. Communication happens in synchronous rounds with a bandwidth of $O(\log n)$. We prove that detecting a triangle in this model requires $\Omega(\log \log n)$ rounds. Prior to our work, the only lower bound was that at least two rounds are necessary.

It is known that standard communication complexity arguments that have been used to get lower bounds in the CONGEST model in the past are incapable of giving any meaningful multi-round lower bounds for this problem. Our main contribution is a new information-theoretic technique that combines classical round elimination arguments of communication complexity with the point-to-point communication aspects of distributed networks and can be of independent interest.

Joint work with Sepehr Assadi.

 

Speaker: Pranav Nuti

Title: Prophet Inequalities with Cancellation Costs

Abstract: Most of the literature on online algorithms focuses on settings with irrevocable decisions, where once a decision is made upon the arrival of an input, it cannot be changed. A canonical example is the classic prophet inequality problem, where a sequence of independent random variables with known distributions is revealed one by one, and a decision-maker must decide when to stop and accept the current variable in order to maximize the expected value of their irrevocable pick. We study "prophet inequalities with cancellations" under a linear cancellation cost model. In this model, after accepting a random variable, one may later discard it to accept another variable, paying a cancellation cost of f times the value of the discarded variable, where f > 0 is a given parameter. The goal is to maximize the expected net reward, which is the value of the final accepted variable minus the total cancellation cost. In this talk, I will explain some of the ideas behind our main result, which is a simple and polynomial-time policy that achieves the optimal competitive ratio for this problem. Importantly, our policy is order-agnostic, meaning that it only needs to know the set of distributions and not the order in which the random variables arrive.
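For background, the classical prophet inequality (without cancellations) already admits a simple single-threshold rule: Samuel-Cahn showed that stopping at the first value exceeding the median of the maximum guarantees half the prophet's expected value. A Monte Carlo sketch of that classical rule (distributions chosen purely for illustration; this is not the talk's cancellation policy):

```python
import random

def threshold_ratio(dists, trials, rng):
    # tau = empirical median of max_i X_i (Samuel-Cahn's threshold choice).
    maxes = sorted(max(d(rng) for d in dists) for _ in range(trials))
    tau = maxes[trials // 2]
    alg = prophet = 0.0
    for _ in range(trials):
        xs = [d(rng) for d in dists]
        prophet += max(xs)                             # prophet sees all values
        alg += next((x for x in xs if x >= tau), 0.0)  # stop at first X_i >= tau
    return alg / prophet  # empirical competitive ratio; >= 1/2 in expectation

rng = random.Random(0)
dists = [lambda r: r.random() for _ in range(20)]      # 20 i.i.d. U[0, 1] values
ratio = threshold_ratio(dists, 4000, rng)              # around 1/2 or better
```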

 

Speaker: Esty Kelman

Title: Algebraic property testing with online adversaries

Abstract: Motivated by the right to be forgotten in regulations like GDPR, Kalemaj, Raskhodnikova, and Varma (ITCS 2022 and Theory of Computing 2023) initiated the study of sublinear algorithms that work with continuously and adversarially degrading access to the input. To capture such situations, KRV proposed the online manipulation-resilient model, where after answering each query of the algorithm, the adversary erases t input values for some parameter t. They studied fundamental property testing problems in this model, including linearity and quadraticity of Boolean functions and sortedness of sequences. In particular, they showed that if t is constant, then both linearity and quadraticity can be tested optimally (with the same asymptotic query complexity as in the standard model without an adversary), while sortedness is not testable in their online model, even when t = 1.

In this talk, we will start by defining the framework of property testing and the problem of linearity testing of multivariate functions over \mathbb{F}_2. We will then present the online manipulation-resilient testing model and discuss how to overcome the new challenges that come with it. We will explain how to extend the classical linearity test by Blum, Luby, and Rubinfeld (Journal of Computer and System Sciences 1993) in a way that allows us to design a simple tester that is optimal for the online model even when t is large. Our tester for the online model also serves as a new simple tester that is optimal in the standard model without adversaries. Finally, we will discuss what is currently known and what the open questions are for other algebraic properties: low-degree testing and homomorphism testing for general groups, as well as locally testable codes.

This talk is based on multiple joint works with Vipul Arora, Omri Ben-Eliezer, Ephraim Linder, Uri Meir, and Sofya Raskhodnikova.
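As background for the linearity-testing discussion, here is a sketch of the classical Blum-Luby-Rubinfeld test in the standard model (not the online-adversary tester from the talk): query f at x, y, and x + y and check additivity. Linear functions over \mathbb{F}_2 always pass, while functions far from linear fail with noticeable probability per repetition.

```python
import random

def blr_test(f, n, reps, rng):
    # BLR test over F_2^n (bitmask encoding): check f(x) ^ f(y) == f(x ^ y)
    # for random x, y. Accept only if every repetition passes.
    for _ in range(reps):
        x = rng.getrandbits(n)
        y = rng.getrandbits(n)
        if f(x) ^ f(y) != f(x ^ y):
            return False
    return True

def parity(a):
    # Linear function f(x) = <a, x> over F_2 (parity of the masked bits).
    return lambda x: bin(a & x).count("1") % 2

rng = random.Random(1)
linear_ok = blr_test(parity(0b1011), 4, 200, rng)    # always passes: True
majority = lambda x: int(bin(x).count("1") > 2)      # far from linear on F_2^5
nonlinear_ok = blr_test(majority, 5, 200, rng)       # rejected almost surely
```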

 

Speaker: He Jia 

Title: Moment Methods and Low-Degree Polynomials Fail to Predict Robust Subspace Recovery

Abstract: The low-degree polynomial framework has been highly successful in predicting computational versus statistical gaps for high-dimensional problems in average-case analysis and machine learning. This success has led to the low-degree conjecture, which posits that this method captures the power and limitations of efficient algorithms for a wide class of high-dimensional statistical problems. We identify a natural and basic statistical estimation problem in $\mathbb{R}^n$ where all the low-degree moments match up to degree $k=\widetilde{O}(\sqrt{\log n})$, yet the low-degree polynomial method up to the same degree fails to predict its computational tractability.

Our problem is a special case of the well-studied robust subspace recovery problem. In contrast to the lower bounds, we give a simple, robust, and efficient algorithm that solves the problem (and highly noisy variants of it), leveraging anti-concentration properties of the construction. Our results suggest that moment-based methods and the low-degree polynomial method fail to capture algorithmic techniques based on anti-concentration, challenging their universality as predictors of computational barriers.

 

Speaker: Vaidehi Srinivas

Title: Online Conformal Prediction with Efficiency Guarantees

Abstract: Conformal prediction is the problem of generating prediction sets (confidence intervals).  Conformal methods are widely used in practice to address unreliability of black box models, and for uncertainty quantification.  We study this problem in a novel online formulation, which explicitly optimizes for efficiency (size of the confidence sets).  We show that this problem admits very different guarantees than online learning, which can be thought of as its unconstrained counterpart.

 

Speaker: Idan Attias

Title: Positive Distribution Shift as a Framework for Understanding Tractable Learning

Abstract: We study a setting where the goal is to learn a target function f(x) with respect to a target distribution D(x), but training is done on i.i.d. samples from a different training distribution D′(x), labeled by the true target f(x). Such a distribution shift (here in the form of covariate shift) is usually viewed negatively, as hurting or making learning harder, and the traditional distribution-shift literature is mostly concerned with limiting or avoiding this negative effect. In contrast, we argue that with a well-chosen D′(x), the shift can be positive and make learning easier – a perspective we call Positive Distribution Shift (PDS). Such a perspective is central to contemporary machine learning, where much of the innovation is in finding good training distributions D′(x), rather than changing the training algorithm. We further argue that the benefit is often computational rather than statistical, and that PDS allows computationally hard problems to become tractable even using standard gradient-based training. We formalize different variants of PDS, show how certain hard classes are easily learnable under PDS, and make connections with membership query learning.

 

Speaker: Liren Shan

Title: Dynamic Algorithm for Explainable k-medians Clustering under Lp Norm

Abstract: We study the problem of explainable k-medians clustering introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian (2020). In this problem, the goal is to construct a threshold decision tree that partitions data into k clusters while minimizing the k-medians objective. These trees are interpretable because each internal node makes a simple decision by thresholding a single feature, allowing users to trace and understand how each point is assigned to a cluster. 

We present the first algorithm for explainable k-medians under the Lp norm for every finite p ≥ 1. We show how to implement our algorithm in a dynamic setting. The dynamic algorithm maintains an explainable clustering under a sequence of insertions and deletions, with small amortized update time and recourse, making it suitable for large-scale and evolving datasets.
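To illustrate the threshold-tree notion from the abstract (a toy data structure for exposition, not the paper's algorithm): each internal node compares a single coordinate to a threshold, so the root-to-leaf path is a human-readable explanation of every cluster assignment.

```python
def assign(point, tree):
    # tree is ("leaf", cluster_id) or ("split", feature, threshold, left, right);
    # walk down from the root: go left iff point[feature] <= threshold.
    while tree[0] == "split":
        _, f, t, left, right = tree
        tree = left if point[f] <= t else right
    return tree[1]

# Two features, three clusters; each decision path is the explanation.
tree = ("split", 0, 0.5,
        ("leaf", 0),
        ("split", 1, 0.5, ("leaf", 1), ("leaf", 2)))

labels = [assign(p, tree) for p in [(0.2, 0.9), (0.8, 0.1), (0.8, 0.9)]]  # [0, 1, 2]
```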

 

Speaker: Dravyansh Sharma

Title: On Learning Verifiers for Chain-of-Thought Reasoning

Abstract: Chain-of-Thought reasoning has emerged as a powerful approach for solving complex mathematical and logical problems. However, it can often veer off track through incorrect or unsubstantiated inferences. Formal mathematical reasoning, which can be checked with a formal verifier, is one approach to addressing this issue. However, currently LLMs are simply not good enough to solve complex problems in a formal way, and even just formalizing an informal problem statement can be challenging. If you have tried to use an LLM to generate a mathematical proof, you have likely observed that they can make unexpected critical mistakes, and have possibly manually checked their "proofs" carefully line by line. Can we automate this tedious process of verifying proofs written in natural language, and thereby improve the reliability of Chain-of-Thought reasoners? Formally, given a problem statement and step-by-step solution in natural language, the aim of the verifier is to output [Yes] if the reasoning steps in the solution are all valid, and [No] otherwise, along with the first incorrect step. In this work we give a formal PAC-learning framework, and propose and analyze several natural verification goals, at different levels of strength. We provide sample complexity upper and lower bounds for learning verifiers satisfying these goals.

 

Speaker: Chenhao Zhang

Title: Regulation of Algorithmic Collusion from Data

Abstract: Algorithms are increasingly being used to price goods and services in competitive markets. Many recent papers have shown that in various settings, some configurations of certain pricing algorithms can find and maintain supra-competitive prices from repeated market interactions. We give a definition of plausible algorithmic non-collusion for pricing algorithms. On one hand, the definition allows sellers to utilize useful side information that may be correlated with supply and demand. On the other hand, it allows a regulator to empirically audit algorithms by applying a statistical test to the data they collect. We provide an analysis of the statistical complexity of such an audit, i.e., how much data is sufficient for the test of non-collusion to be accurate. We further discuss the unrelaxability of the definition even when there is no side information, as well as the role of the regulator’s knowledge of the seller’s cost in the audit.

2025 Chicago Junior Theorists Workshop (Call for Nominations)

We seek nominations of outstanding final-year Ph.D. students and postdocs to attend and present their recent research at the 2025 Chicago Junior Theorists Workshop (jointly in person at Northwestern University and Toyota Technological Institute at Chicago) on December 8-9, 2025 (Monday and Tuesday). The workshop visits broad themes in theoretical computer science over the two days, and we expect to invite fifteen speakers.

Faculty advisors are requested to nominate a student or postdoc (at most one per nominator) by Friday, November 7, 2025 using the link below.  Nominations should include (a) a brief nomination statement and (b) the paper that the nominee would present if selected.   This form should be filled in by the nominator.   Students or junior researchers who would like to be considered should ask their advisor or mentor to submit the nomination.

Nomination Form 

Any previously written letter of recommendation for the nominee is sufficient; it does not need to be explicitly tailored to this call for nominations. Nomination letters will be kept in confidence.

We anticipate receiving more nominations than we can accept and will consider merit, breadth, diversity, and random coins in determining the program. In particular, we regret that the workshop will probably not be able to accommodate all outstanding nominees.

The Junior Theorists Workshop was part of the Quarterly Theory Workshop series at Northwestern. Details about the previous Junior Theorists Workshop can be found here.

Remote teaching on gather.town

By Jason Hartline and Aravindan Vijayaraghavan, Northwestern University.

Screenshot from the podium of our gather.town classroom.

Due to the pandemic, Northwestern computer science courses for the Fall of 2020 were taught remotely. We co-taught our undergraduate Theory of Computation course in a flipped format on gather.town. It was a fantastic experience. Class time was much more interactive than in the Zoom classes we had both taught previously, and the students liked being able to interact with other students, as well as the change of pace from yet another Zoom class. It went well, and we plan to repeat the experience with our winter courses.

The rough details are as follows.

Flipped-class format

The course was taught in the flipped-class format. We had pre-recorded lectures that the students had to watch before each class. Class time, on the other hand, was spent discussing concepts and working through exercises in small groups.

Videos and Exercises

The model for pre-recorded video was three 15-minute videos, though in practice it was more often two 30-minute videos. Each video had accompanying exercises (implemented using the Canvas Quiz feature). Students were advised to interleave the exercise questions with the videos, as some exercises were designed to reinforce concepts and thereby make subsequent videos easier to understand. They were encouraged to work together on these exercises.

Class on Gather.town

The class time was entirely on Gather.town. The classroom was organized in a cabaret seating layout: students virtually sat at four-person tables (though the capacity was not a hard limit). We gave presentations from a podium at the front of the room, and students could ask questions from microphones in the middle of the room. The tables, the podium, and the microphones were all enabled by gather.town video chat.

A replica of our classroom is available for self-guided tours. It is recommended to bring a colleague. This classroom was provided to us for beta-testing by virtualchair.net; similar ones are now available from them, or you can build your own on gather.town. (Full disclosure: Jason Hartline is a cofounder of virtualchair.net.)

The 80-minute class time was split into two parts. The first half of the class comprised a recap of concepts from the videos and related discussion, and the second half had the students working in groups on a homework-style problem.

Class Part I: Discussion

For the class discussion, we started with a slide of 5-6 discussion questions. This slide was screen-shared from a podium in the virtual classroom. As students joined the class they were encouraged to begin discussing these questions with other students at their tables. After ten minutes we led a discussion of the questions at the podium, encouraging students to chime in with answers from their discussion groups. Our virtual classroom had two microphones among the cabaret seats that students could use to address the class. We also encouraged students to bring up any questions they had.

This part of the class was quite interactive; it also gave us a sense of how well the students understood the material and enabled us to reemphasize material accordingly.

Class Part II: Problem Solving

The second half of class was reserved for student problem solving in groups (at their tables). During the problem-solving session we would join tables of students to answer questions, help talk them through issues, and ensure that they were making progress.

Other virtual interactions

The gather.town space had two additional rooms: a study hall and an office hours room. The study hall featured shared whiteboards and was a place where students could meet up for discussions and group work (homework problems were assigned to students in groups of two). The course staff conducted office hours in the office hours room.

Meeting on gather.town was very convenient and meetings of the course staff were also conducted in the virtual office.

Video content in watch parties.

We scheduled video content to be played in watch parties for students to view together the night before class. However, perhaps due to initial technical difficulties, this feature was not utilized by the students.

Difficulties

The following were the main difficulties we encountered. (Configuring the space was fairly easy with the virtualchair.net automation.)

  1. It was slightly awkward that we could not leave our screenshare at the podium at the same time as we joined group discussions at the tables. This could be addressed by logging into gather.town twice and using one login for screensharing and the other for discussions with students.
  2. We did not establish a video-on policy, and we regret it. While we wanted to respect student privacy, students should be fully engaged in discussions, and full engagement warrants having video on. Moreover, we attempted to grade student participation, but it was difficult to know who was talking when many of the students had their video off.
  3. Gather.town does not have a simple mechanism for keeping track of participation of students. Our process was manual and difficult.

Student Feedback

The following quotes from the student course evaluations pertain to the flipped format and remote technology. Students were generally quite positive.

  • “The flipped format worked really well for this material; it was really valuable to be able to discuss practice exercises with our peers during class.”

  • “Gather.town was a fantastic choice and made me really look forward to attending this class online. Getting to talk and solve problems with teammates really helped me consolidate ideas. (And it was also really nice to be able to socialize a bit.) The flipped classroom style worked really well.”

  • “After the first few weeks, this was certainty my best class in terms of adapting to remote; gather.town discussions were really great (so great my table often stayed after class to continue them!)”

  • “I thought that the gather.town format was an excellent decision. Being able to discuss course topics with other students in the class definitely helped me consolidate ideas. The recorded video lectures were high quality, and the professors both did a great job leading discussion in class.”

  • “It was done on gather.town, which was a bit rocky the first few weeks but got better by the end; really enjoyed discussing with other people about the exercises (wish more time was spent on them actually).”

  • “The practice exercises with our peers were really helpful. Most of the video lectures were clear enough, but being able to discuss points of confusion with classmates was a great way to clear up questions.”

Conclusions

Overall it was a fantastic experience that we are looking to refine in subsequent course offerings. Our virtual gather.town space was for our class only, but it would be natural to use the same space for multiple classes offered within the same department and doing so might encourage more student meetings on the platform. Our idea of watch parties for students to watch videos together needs further adjustments and testing.

Northwestern papers at FOCS 2020

The Northwestern CS Theory group had three papers at the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS 2020), which was recently held virtually.

Northwestern Economics PhD student Modibo Camara, CS PhD student Aleck Johnsen, and Prof. Jason Hartline co-authored a paper “Mechanisms for a No-Regret Agent: Beyond the Common Prior“.  The paper analyzes a broad class of Principal-Agent games in economics that normally depend on common knowledge of a precise distribution over an unknown, payoff-relevant input, using instead online learning in repeated game play.  A key idea of the paper, which describes (possibly externally-informed) Agents behaviorally with only a no-regret property, is to extend previous definitions of “regret” to consider counterfactual sequences of the repeated play. Link to the talk.

Northwestern CS PhD students Yingkai Li and Aleck Johnsen, and Prof. Jason Hartline co-authored a paper “Benchmark Design and Prior Independent Optimization“.  The paper introduces a formal approach to the study of benchmark design, as a parameter of worst-case algorithm design to be optimized.  After first incorporating economic properties to justify and measure benchmarks, the paper's main result shows that benchmark design is equivalent to algorithm design when inputs are drawn independently from a common-but-unknown distribution.  Another main result solves a longstanding open question in 2-agent revenue auction design, which further serves as an application of the benchmark design result. Link to the talk.

Prof. Aravindan Vijayaraghavan co-authored a paper “Scheduling Precedence-Constrained Jobs on Related Machines with Communication Delay” with Biswaroop Maiti (Northeastern), Rajmohan Rajaraman (Northeastern), David Stalfa (Northeastern), and Zoya Svitkina (Google). The paper studies the problem of scheduling n precedence-constrained jobs on m uniformly-related machines in the challenging setting where we have to account for communication delays. Communication delay is the amount of time that must pass between the completion of a job on one machine and the start of any successor of that job on a different machine. The paper shows both algorithmic results and lower bounds. This includes the first polylogarithmic factor approximation algorithm for this problem, superconstant integrality gaps, and bounding the advantage of duplication in these settings. Link to the talk.

Seeking nominations for 2020 Junior Theorists Workshop

We are seeking nominations for outstanding final-year Ph.D. students or postdocs to present their recent research at the 2020 Junior Theorists Workshop on December 17 and 18. The workshop will visit broad themes in theoretical computer science and we expect to invite eight speakers. Due to the pandemic, the workshop will be virtual this year.

If you would like to nominate an outstanding student or postdoc from your institution, please submit the paper (URL or pdf) the speaker would likely present and a brief nomination statement here by Nov. 30. If you already have a letter of recommendation for the nominee that includes a discussion of the paper, that would be sufficient. Nomination letters will be kept in confidence. There is no need to check the availability and interest of the nominee before nominating.

We anticipate receiving more nominations than we can accept and will consider merit, breadth, diversity, and random coins in determining the program. In particular, we regret that the workshop will probably not be able to accommodate all outstanding nominees.

The Junior Theorists Workshop is part of our Quarterly Theory Workshop series and is in its third year. Details about last year’s Junior Theorists Workshop and other workshops in the Quarterly Theory Workshop series can be found on the events page.

NSF funds Institute for Data, Econometrics, Algorithms, and Learning

As part of the HDR TRIPODS program, NSF has funded the Institute for Data, Econometrics, Algorithms, and Learning (IDEAL).  The institute is codirected by Prof. Hartline and Prof. Vijayaraghavan, with key programs also organized by Prof. Khuller and Prof. Makarychev.  It is a collaboration between Northwestern, Toyota Technological Institute at Chicago, and the University of Chicago that bridges faculty in CS, Economics, Statistics, Electrical Engineering, and Operations Research.   See the news release by McCormick.

New additions to Theory group

The Northwestern Theory group is excited to announce that we have several folks joining us at the beginning of this academic year! We have Hedyeh Beyhaghi, a postdoc joining from Cornell; Shravas Rao, a postdoc joining from NYU; Sumedha Uniyal, a visiting postdoc from Aalto University, Finland; Pattara Sukprasert, a new grad student transferring from the University of Maryland, College Park; and Saba Ahmadi and Sheng Yang, visiting grad students from the University of Maryland, College Park.

Hedyeh Beyhaghi. Dr. Beyhaghi’s research interests are in Algorithm Design, with an emphasis on Algorithmic Game Theory and Mechanism Design. Her research mainly focuses on Auction Design, Online Stochastic Optimization, and Matching Markets. She obtained her Ph.D. at Cornell University, under the supervision of Eva Tardos.

 

Shravas Rao. Dr. Rao’s research interests are in theoretical computer science, with an emphasis on derandomization and pseudorandomness. He completed his Ph.D. at New York University, advised by Oded Regev.

 

Pattara Sukprasert. Pattara is a third-year Ph.D. student advised by Prof. Samir Khuller. He transferred from the University of Maryland College Park, where he spent his first two Ph.D. years and received a Master’s degree. Before that, he lived mostly in Thailand and got another Master’s degree from Kasetsart University advised by Prof. Jittat Fakcharoenphol. He is interested in approximation algorithms and graph theory, and has worked on network design problems, network flow problems, and some structural graph theory problems. Recently, he has also developed some interests in fast (up to sub-cubic) approximation algorithms.

 

Saba Ahmadi. Saba is a fifth-year PhD student at the University of Maryland, College Park visiting Northwestern, advised by Prof. Samir Khuller. She is mainly interested in designing approximation algorithms. She is also interested in fairness and its relevance to combinatorial optimization. She finds problems at the intersection of AI and combinatorial optimization very interesting; one example is how to achieve diversity in matching markets.

 

Sheng Yang. Sheng is a fifth-year PhD student at the University of Maryland, College Park advised by Prof. Samir Khuller. Currently, he is visiting Northwestern as a pre-doctoral visiting scholar. He is broadly interested in approximation algorithms. He has worked on graph theory topics related to connected dominating sets and induced subgraph counting. Currently, he is mainly working on various scheduling problems, both classical ones and new challenges originating from cloud computing.

Northwestern CS Theory group @ FCRC 2019

The Northwestern CS Theory group cumulatively had eight papers at conferences held as part of the ACM Federated Computing Research Conference (FCRC) 2019, which took place recently in Phoenix, AZ. We had two papers in COLT, four in EC, and one each in SPAA and STOC. Members of the group who attended include Professors Jason Hartline, Samir Khuller, Konstantin Makarychev, and Aravindan Vijayaraghavan, postdoc Xue Chen, PhD students Yiding Feng, Aleck Johnsen, Yingkai Li, and Aravind Reddy. Also in attendance were CS Prof. Jessica Hullman and Econ PhD student Modibo Camara.

Conference on Learning Theory (COLT):

TCS postdoc Xue Chen presented a joint paper “Active Regression via Linear-Sample Sparsification” with Eric Price (UT Austin). This paper gives an efficient algorithm with an optimal sample complexity for the classical problem of linear regression. Its techniques yield improved results for the non-linear sparse Fourier transform setting.

TCS Ph.D. student Yingkai Li presented a joint paper with Yining Wang (CMU) and Yuan Zhou (UIUC). The paper “Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits” obtained almost tight dependence on the time horizon for regret minimization in linear bandits. One of the main results shows that there is an extra sqrt(log T) factor in the lower bound, revealing a regret scaling quite different from classical multi-armed bandits. The proof techniques include variable confidence levels and a careful analysis of layer sizes of SupLinUCB on the upper bound side, and delicately constructed adversarial sequences showing the tightness of elliptical potential lemmas on the lower bound side.

Economics and Computation (EC):

TCS Ph.D. alumnus Sam Taggart (Assist. Prof., Oberlin College) presented a joint paper with TCS Prof. Jason Hartline in the joint session of EC and STOC.  The paper, “Sample Complexity for Non-truthful Mechanisms“, describes a family of winner-pays-bid and all-pay (i.e., non-truthful) mechanisms that can be optimized from past bid data from the same market.  Most mechanisms in practice are non-truthful; the winner-pays-bid format, for example, is especially common.  The literature on sample complexity in mechanism design is well established; however, this paper is the first to consider non-truthful mechanisms.

TCS Prof. Jason Hartline presented a joint paper with TCS Ph.D. student Aleck Johnsen, Dennis Nekipelov (UVA), and Onno Zoeter (Booking.com).  The paper, “Dashboard Mechanisms for Online Marketplaces” gives a theoretical model for how a bidding dashboard can both make it easier for bidders in an online marketplace to optimize their bids and for the designer to optimize the mechanism.  A key idea in the paper is that if bidders optimize their bids for the dashboard, which predicts the outcome that will be obtained for any bid, then the underlying preferences of the bidders can be easily inferred from the bids.  Link to Video.

TCS Ph.D. student Yingkai Li presented a joint paper with TCS Prof. Jason Hartline and TCS Ph.D. student Yiding Feng.  The paper, “Optimal Auctions vs. Anonymous Pricing: Beyond Linear Utility“, introduces an approximation framework which approximately reduces the revenue-maximization analysis of anonymous pricing for agents with non-linear utility to that for agents with linear utility. Applying this framework, constant approximation guarantees for anonymous pricing are shown for agents with public-budget utility, private-budget utility, and (a special case of) risk-averse utility.  A key idea in the paper is to define a parameterization of the regularity property that extends to agents with non-linear utility. Link to Video.

TCS Ph.D. student Chenhao Zhang had a joint paper with Nick Gravin (Shanghai University of Finance and Economics), Yaonan Jin (Columbia University) and Pinyan Lu (Shanghai University of Finance and Economics). The paper “Optimal Budget-Feasible Mechanisms for Additive Valuations” obtained tight approximation guarantees for budget-feasible mechanisms with an additive buyer. The paper proposes two-stage mechanisms that compose price-posting schemes with a pruning mechanism that greedily excludes items with low value-per-cost ratios. A tight 2-approximation against the Knapsack benchmark and a tight 3-approximation against the Fractional-Knapsack benchmark are obtained by the proposed randomized and deterministic mechanisms, respectively. Link to Video.
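To give a flavor of the pruning step described above, here is a minimal sketch of a generic value-per-cost greedy rule. This is only an illustration of the heuristic, not the paper's actual mechanism (which must additionally guarantee truthfulness and budget feasibility); the function name and instance are made up.

```python
def greedy_prune(values, costs, budget):
    """Keep items in decreasing value-per-cost order while they fit the budget.

    Items with low value-per-cost ratios are effectively excluded, in the
    spirit of the pruning stage sketched in the paper summary above.
    """
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / costs[i], reverse=True)
    kept, spent = [], 0
    for i in order:
        if spent + costs[i] <= budget:
            kept.append(i)
            spent += costs[i]
    return kept

# Toy instance: item 2 has the worst ratio and does not fit after the others.
kept = greedy_prune([60, 100, 120], [10, 20, 30], 50)
print(kept)  # the first two items are kept
```

On this instance the rule keeps items 0 and 1 (total cost 30, total value 160) and prunes item 2, whose value-per-cost ratio is lowest.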

Symposium on Parallelism in Algorithms and Architectures (SPAA):

TCS Prof. Samir Khuller had a joint paper with visiting TCS PhD student Sheng Yang (UMD), Mosharaf Chowdhury (UMich), Manish Purohit (Google), and Jie You (UMich). The paper “Near Optimal Coflow Scheduling in Networks” focuses on the coflow scheduling problem, which studies scheduling and data communication both within and across datacenters. The main result is a randomized 2-approximation algorithm, significantly improving prior work both in theory and in practice.

Symposium on the Theory of Computing (STOC):

TCS Prof. Konstantin Makarychev had a paper with Yury Makarychev (TTIC) and Ilya Razenshteyn (Microsoft Research). This paper, “Performance of Johnson-Lindenstrauss Transform for k-Means and k-Medians Clustering“, shows that the cost of the optimal solution for Euclidean k-means or k-medians clustering is preserved up to a factor of (1+ε) under a projection onto a random O(log(k/ε)/ε²)-dimensional subspace. For k-means, this result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for k-medians, it answers a question raised by Kannan.
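The dimension-reduction phenomenon is easy to sanity-check numerically. The following is a minimal sketch, not the paper's construction or proof: it projects synthetic clustered data onto a random Gaussian subspace of dimension on the order of log(k/ε)/ε² and compares the k-means cost of a fixed clustering before and after. All parameter choices (n, d, k, ε, the blob model) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, eps = 500, 200, 5, 0.5

# Synthetic data: k well-spread Gaussian blobs in d dimensions.
centers = rng.normal(scale=10.0, size=(k, d))
labels = rng.integers(k, size=n)
X = centers[labels] + rng.normal(size=(n, d))

def kmeans_cost(Z, lab, k):
    """Sum of squared distances of each point to its cluster's centroid."""
    return sum(((Z[lab == j] - Z[lab == j].mean(axis=0)) ** 2).sum()
               for j in range(k))

# Random Gaussian projection; target dimension scales as log(k/eps)/eps^2.
m = int(np.ceil(np.log(k / eps) / eps ** 2))
G = rng.normal(size=(d, m)) / np.sqrt(m)  # entries N(0, 1/m) preserve norms
Y = X @ G

ratio = kmeans_cost(Y, labels, k) / kmeans_cost(X, labels, k)
print(f"projected {d} -> {m} dims, cost ratio = {ratio:.3f}")
```

On synthetic data like this the cost ratio concentrates near 1, consistent with (though of course far weaker than) the (1+ε) guarantee stated above.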

Two Northwestern papers at FOCS 2017

The Northwestern CS Theory group had two papers at the 58th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2017), which was recently held in Berkeley, CA.

TCS Postdoc Huck Bennett had a joint paper with Alexander Golovnev (Columbia and Yahoo Research) and Noah Stephens-Davidowitz (Princeton). The paper, “On The Quantitative Hardness of CVP,” initiates the study of the fine-grained complexity of lattice problems, a study which is important to the rapidly developing field of lattice-based cryptography. As its main result, the paper shows strong hardness of the Closest Vector Problem (CVP) with certain parameters assuming the Strong Exponential Time Hypothesis (SETH).

TCS Prof. Aravindan Vijayaraghavan had a joint paper with Oded Regev (NYU). The paper, “Learning Mixtures of Well-Separated Gaussians,” studies the classic problem of learning a mixture of k spherical Gaussian distributions. The paper tries to characterize the minimum amount of separation needed between the components to estimate the parameters (means) of the Gaussians, and presents lower bounds and upper bounds towards this end.
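To illustrate why separation matters, here is a hedged toy sketch (not the paper's algorithm): when the component means are far enough apart relative to the noise, even naive farthest-point seeding followed by one nearest-center assignment recovers the means accurately. Every parameter below is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
k, d, n, sep = 4, 50, 2000, 20.0

# True means placed far apart on a sphere of radius `sep`.
means = rng.normal(size=(k, d))
means *= sep / np.linalg.norm(means, axis=1, keepdims=True)

# Sample from the mixture: unit-variance spherical Gaussian components.
labels = rng.integers(k, size=n)
X = means[labels] + rng.normal(size=(n, d))

# Greedy farthest-point seeding: each new seed maximizes its distance
# to the existing seeds, so well-separated clusters get one seed each.
seeds = [X[0]]
for _ in range(k - 1):
    dists = np.min([np.linalg.norm(X - s, axis=1) for s in seeds], axis=0)
    seeds.append(X[np.argmax(dists)])
seeds = np.array(seeds)

# One nearest-center assignment, then estimate each mean as a cluster average.
assign = np.argmin(np.linalg.norm(X[:, None, :] - seeds[None, :, :], axis=2), axis=1)
est = np.array([X[assign == j].mean(axis=0) for j in range(k)])

# Match each true mean to its nearest estimate and measure the worst error.
match = np.argmin(np.linalg.norm(means[:, None, :] - est[None, :, :], axis=2), axis=1)
err = np.linalg.norm(means - est[match], axis=1).max()
print(f"max mean-estimation error: {err:.2f}")
```

With this much separation the estimation error is small; as the separation shrinks toward the noise scale, this naive scheme breaks down, which is exactly the regime the paper's lower and upper bounds address.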


Konstantin Makarychev joins Northwestern CS Theory Group!

The Computer Science Division at Northwestern University welcomes new faculty member Dr. Konstantin (Kostya) Makarychev as an Associate Professor, beginning immediately. Dr. Makarychev’s position is one of the ten new faculty lines in CS which were announced in June 2016.

Dr. Makarychev is a theoretical computer scientist working on approximation algorithms, beyond worst-case analysis, applications of high-dimensional geometry to computer science, and combinatorial optimization for designing efficient algorithms for computationally hard problems.

Dr. Makarychev joins Northwestern from Microsoft Research in Redmond, WA (2012-2016) and IBM Research Labs in Yorktown Heights, NY (2007-2012). Further details of his background can be found on his personal webpage.

Please click here for details, and see the announcement on the Northwestern homepage.
