TCS+ | A carbon-free dissemination of ideas across the globe.

May 25, 2018

TCS+ talk: Wednesday, May 30th — Michael Kearns, UPenn

The next and last TCS+ talk for the Spring will take place this coming Wednesday, May 30th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). Michael Kearns from UPenn will speak about “Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness” (abstract below).

Please make sure you reserve a spot for your group to join us live by signing up on the online form. As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: The most prevalent notions of fairness in machine learning are statistical definitions: they fix a small collection of pre-defined groups, and then ask for parity of some statistic of the classifier across these groups. Constraints of this form are susceptible to intentional or inadvertent “fairness gerrymandering”, in which a classifier appears to be fair on each individual group, but badly violates the fairness constraint on one or more structured subgroups defined over the protected attributes. We propose instead to demand statistical notions of fairness across exponentially (or infinitely) many subgroups, defined by a structured class of functions over the protected attributes. This interpolates between statistical definitions of fairness and recently proposed individual notions of fairness, but raises several computational challenges. It is no longer clear how to audit a fixed classifier to see if it satisfies such a strong definition of fairness.

We prove that the computational problem of auditing subgroup fairness for both equality of false positive rates and statistical parity is equivalent to the problem of weak agnostic learning, which means it is computationally hard in the worst case, even for simple structured subclasses. But it also raises the possibility of applying empirically successful machine learning methods to the problem.

We then derive two algorithms that provably converge to the best fair classifier, given access to oracles which can solve the agnostic learning problem. The algorithms are based on a formulation of subgroup fairness as a two-player zero-sum game between a Learner and an Auditor. Our first algorithm provably converges in a polynomial number of steps. Our second algorithm enjoys only provably asymptotic convergence, but has the merit of simplicity and faster per-step computation. We implement the simpler algorithm using linear regression as a heuristic oracle, and show that we can effectively both audit and learn fair classifiers on real datasets.

Joint work with Seth Neel, Aaron Roth, and Zhiwei Steven Wu.

Leave a comment

May 16, 2018

TCS+ talk: Wednesday, May 23rd — Leonard Schulman, Caltech

The next TCS+ talk will take place this coming Wednesday, May 23rd at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). Leonard Schulman from Caltech will speak about “Explicit Binary Tree Codes with Polylogarithmic Size Alphabet” (abstract below).

Abstract: Tree codes are “real time” or “causal” error-correcting codes. They are known to exist but explicit construction has been a longstanding open problem. We report on progress on this problem.

For every constant delta we give an explicit binary tree code with distance delta and alphabet size poly(log n), where n is the depth of the tree. This is the first improvement over a two-decade-old construction that has an exponentially larger alphabet of size poly(n).

As part of the analysis, we prove a bound on the number of positive integer roots a real polynomial can have in terms of its sparsity with respect to the Newton basis–a result of independent interest.

Joint work with Gil Cohen (Princeton) and Bernhard Haeupler (CMU)

Leave a comment

April 18, 2018

TCS+ talk: Wednesday, April 25th — Danupon Nanongkai, KTH

The next TCS+ talk will take place this coming Wednesday, April 25th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). Danupon Nanongkai from KTH will speak about “Distributed All-Pairs Shortest Paths, Exactly” (abstract below).

Abstract: I will present the ~O(n^{5/4})-time distributed algorithm for computing all-pairs shortest paths exactly by Huang, Nanongkai, and Saranurak (FOCS 2017; https://arxiv.org/abs/1708.03903). The algorithm is fairly simple, and the talk will cover necessary backgrounds. I will also briefly survey recent progresses and some open problems in the field of distributed graph algorithms, where this work lies in.

Leave a comment

April 4, 2018

TCS+ talk: Wednesday, April 11th — Shay Moran, IAS

The next TCS+ talk will take place this coming Wednesday, April 11th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). Shay Moran from the IAS will speak about “On the expressiveness of comparison queries” (abstract below).

Abstract: Comparisons are a classical and well studied algorithmic tool that is used in a variety of contexts and applications. We will discuss two manifestations of the expressiveness of these queries in machine learning and complexity theory (a more detailed overview is given below). Both manifestations are based on the notion of “inference dimension” that can be viewed as another instance of the fruitful link between machine learning and discrete mathematics — a link dating back to the discovery of the VC dimension.

Active classification with comparison queries.

Active learning is a model for semi-supervised learning that captures situations in which unlabeled data is abundant and manually labelling it is expensive. We consider an extension of active learning in which the learning algorithm may ask the annotator to compare the distances of two examples from the boundary of their label-class. For example, in a recommendation system application (say for restaurants), the annotator may be asked whether she liked or disliked a specific restaurant (a label query); or which one of two restaurants did she like more (a comparison query).

We prove that the usage of comparison queries leads to an exponential improvement in the query complexity of some well studied problems. Specifically, for the class of half-spaces, we show that under natural assumptions, such as large margin or bounded bit-description of the input examples, it is possible to reveal all the labels of a sample of size n using approximately O(logn) queries.

Nearly optimal linear decision trees for k-SUM and related problems.

We use the above mentioned techniques to construct linear decision trees for a variety of decision problems in combinatorics and discrete geometry. For example, for any constant k, we construct linear decision trees that solve the k-SUM problem on n elements using O(n log n) linear queries. Moreover, the queries we use are comparison queries, which compare the sums of two k-subsets; when viewed as linear queries, comparison queries are 2k-sparse and have only {−1,+1} coefficients.

We give similar constructions for sorting sumsets A+B and for solving the SUBSET-SUM problem, both with optimal number of queries, up to poly-logarithmic terms.

Based on joint works with Daniel Kane, Shachar Lovett, and Jiapeng Zhang.

Leave a comment

March 21, 2018

TCS+ talk: Wednesday, March 28th — Artur Czumaj, University of Warwick

The next TCS+ talk will take place this coming Wednesday, March 28th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). Artur Czumaj from the University of Warwick will speak about “Round Compression for Parallel Matching Algorithms” (abstract below).

Abstract: For over a decade now we have been witnessing the success of massive parallel computation (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to accurately capture the nature of large-scale computation. In particular, compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises in this context is though: can we leverage this additional power to obtain even faster parallel algorithms?

A prominent example here is the fundamental graph problem of finding maximum matching. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in O(log n) rounds. However, the exact complexity of this problem in the MPC framework is still far from understood. Lattanzi et al. showed that if each machine has n^{1+Ω(1)} memory, this problem can also be solved 2-approximately in a constant number of rounds. These techniques, as well as the approaches developed in the follow up work, seem though to get stuck in a fundamental way at roughly O(log n) rounds once we enter the near-linear memory regime. It is thus entirely possible that in this regime, which captures in particular the case of sparse graph computations, the best MPC round complexity matches what one can already get in the PRAM model, without the need to take advantage of the extra local computation power.

In this talk, we finally show how to refute that perplexing possibility. That is, we break the above O(log n) round complexity bound even in the case of slightly sublinear memory per machine. In fact, our improvement here is almost exponential: we are able to deliver a (2+ϵ) approximation to maximum matching, for any fixed constant ϵ>0, in O((loglog n)^2) rounds.

This is a joint work with Jakub Łącki, Aleksander Mądry, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski.

Leave a comment

March 7, 2018

TCS+ talk: Wednesday, March 14th — Nima Anari, Stanford

The next TCS+ talk will take place this coming Wednesday, March 14th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 18:00 Central European Time, 17:00 UTC). Nima Anari from Stanford will speak about “Planar Graph Perfect Matching is in NC” (abstract below).

Abstract: Is matching in NC? In other words, is there a deterministic fast parallel algorithm for it? This has been an open question for over three decades, ever since the discovery of Random NC matching algorithms. Within this question, the case of planar graphs had remained an enigma: On one hand, counting the number of perfect matchings is generally believed to be harder than finding one (the former is #P-complete and the latter is in P), and on the other, for planar graphs, counting has long been known to be in NC whereas finding one has resisted a solution.

The case of bipartite planar graphs was solved by Miller and Naor in 1989 via a flow-based algorithm. In 2000, Mahajan and Varadarajan gave an elegant way of using counting matchings to finding one, hence giving a different NC algorithm.

However, non-bipartite planar graphs still didn’t yield: the stumbling block being odd tight cuts. Interestingly, these are also a key to the solution: a balanced odd tight cut leads to a straight-forward divide and conquer NC algorithm. The remaining task is to find such a cut in NC. This requires several algorithmic ideas, such as finding a point in the interior of the minimum weight face of the perfect matching polytope and uncrossing odd tight cuts.

Paper available at: https://arxiv.org/pdf/1709.07822.pdf

Joint work with Vijay Vazirani.

Leave a comment

March 1, 2018

First three talks of the Spring!

Hi everyone,

we have posted the first three talks of the spring semester.

Avi Wigderson (IAS): Optimization, Complexity and Math (through the lens of one problem and one algorithm)
Dor Minzer (Tel Aviv University): 2-to-2 Games via expansion on the Grassmann Graph
Sanjam Garg (Berkeley): Identity-Based Encryption from the Diffie-Hellman Assumption

We are still deciding who to host next, so stay tuned!

Leave a comment

February 21, 2018

TCS+ talk: Wednesday, February 28th — Sanjam Garg, UC Berkeley

The next TCS+ talk will take place this coming Wednesday, February 28th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). Sanjam Garg from UC Berkeley will speak about “Identity-Based Encryption from the Diffie-Hellman Assumption” (abstract below).

Abstract: In this talk, I will describe new constructions of identity-based encryption based on the hardness of the Diffie-Hellman (without using groups with pairings) Problem. Previously, constructions based on this assumption were believed to be impossible. Our construction is based on new techniques that bypass the known impossibility results using garbled circuits that make a non-black-box use of the underlying cryptographic primitives.

(Based on joint work with Nico Döttling.)

Leave a comment

February 7, 2018

TCS+ talk: Wednesday, February 14th — Dor Minzer, Tel-Aviv University

The next TCS+ talk will take place this coming Wednesday, February 14th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). Dor Minzer from Tel-Aviv University will speak about “2-to-2 Games via expansion on the Grassmann Graph” (abstract below).

Abstract: A fundamental goal in the theory of PCPs asks whether a special type of PCP, namely 2-to-2 Games, exists. This is a variant of Khot’s well-known Unique Games conjecture.

In this talk we will discuss a recent line of study establishing the 2-to-2 games conjecture, albeit with imperfect completeness.
At the heart of the approach lies an object called the Grassmann Graph, and a certain linearity test on it.
This leads to the study of edge expansion in this graph, and in particular, the study of (small) sets of vertices in the Grassmann Graph, whose edge expansion is bounded away from 1.

Based on joint works with Irit Dinur, Subhash Khot, Guy Kindler and Muli Safra

Leave a comment

January 24, 2018

TCS+ talk: Wednesday, January 31st — Avi Wigderson, IAS

The first TCS+ talk of the new year will take place this coming Wednesday, January 31st at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 18:00 UTC). We’re honored to have Avi Wigderson from IAS as our speaker; Avi will speak about “Optimization, Complexity and Math (through the lens of one problem and one algorithm)” (abstract below).

Abstract: In this lecture, we introduce and motivate the main characters in this plot:
– Singularity of symbolic matrices: a basic problem in both computational complexity.
– Alternating Minimization: a basic heuristic in non-convex optimization.

I will explain how variants of this algorithm are applied to variants of this problem, how they are analyzed, and how the analysis gives rise to problems in and connections between a surprisingly diverse set of mathematical areas, including quantum information theory, non-commutative algebra and invariant theory, and analysis. Time permitting, we will discuss challenges this work raises in invariant theory and non-convex optimization.

Leave a comment