TCS+ talk: Wednesday, February 20th, Sepehr Assadi, Princeton
The next TCS+ talk will take place this coming Wednesday, February 20th at
1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European
Time, 18:00 UTC). Sepehr Assadi from Princeton University will speak about “A Simple Sublinear-Time Algorithm for Counting Arbitrary Subgraphs via Edge Sampling” (abstract below).
Please make sure you reserve a spot for your group to join us live by signing up on the online form. As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.
Abstract: In the subgraph counting problem, we are given a (large) graph $G$ and a (small) graph $H$ (e.g., a triangle), and the goal is to estimate the number of occurrences of $H$ in $G$. Our focus in this talk is on designing sublinear-time algorithms for approximately computing the number of occurrences of $H$ in $G$ in the setting where the algorithm is given query access to $G$. This problem has been studied in several recent works, which primarily focused on specific families of graphs $H$ such as triangles, cliques, and stars. However, not much is known about approximate counting of arbitrary graphs $H$ in the literature. This is in sharp contrast to the closely related subgraph enumeration problem, in which the goal is to list all copies of the subgraph $H$ in $G$. The AGM bound shows that the maximum number of occurrences of any arbitrary subgraph $H$ in a graph $G$ with $m$ edges is $O(m^{\rho(H)})$, where $\rho(H)$ is the fractional edge cover number of $H$, and enumeration algorithms with matching runtime are known for every $H$.
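As a quick worked illustration of the bound above (the edge weights $x_e$ are notation introduced here for the example, not taken from the abstract): the fractional edge cover number of $H$ is the minimum total weight of an assignment $x_e \ge 0$ to the edges of $H$ in which every vertex of $H$ is incident to edges of total weight at least 1. For a triangle on vertices $a, b, c$ this gives:

```latex
\[
\rho(K_3) \;=\; \min \Bigl\{\, x_{ab} + x_{bc} + x_{ca} \;:\;
  x_{ab} + x_{ca} \ge 1,\;
  x_{ab} + x_{bc} \ge 1,\;
  x_{bc} + x_{ca} \ge 1,\;
  x \ge 0 \,\Bigr\} \;=\; \tfrac{3}{2}.
\]
```

The value $3/2$ is attained by putting weight $1/2$ on each edge, and summing the three vertex constraints shows that no assignment does better, since $2(x_{ab} + x_{bc} + x_{ca}) \ge 3$. The AGM bound therefore says that a graph with $m$ edges contains $O(m^{3/2})$ triangles.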
In this talk, we bridge this gap between the subgraph counting and subgraph enumeration problems and present a simple sublinear-time algorithm that estimates the number of occurrences of any arbitrary graph $H$ in $G$, denoted by $\#H$, to within a $(1 \pm \varepsilon)$-approximation factor with high probability in $O\left(\frac{m^{\rho(H)}}{\#H}\right) \cdot \mathrm{poly}(\log n, 1/\varepsilon)$ time. Our algorithm is allowed the standard set of queries for general graphs, namely degree queries, pair queries and neighbor queries, plus an additional edge-sample query that returns an edge chosen uniformly at random. The performance of our algorithm matches that of the algorithms of Eden et al. [FOCS 2015, STOC 2018] for counting triangles and cliques, and extends them to all choices of subgraph $H$ under the additional assumption of edge-sample queries. Our results also apply to the more general database join size estimation problem, and for this slightly more general problem they achieve optimal bounds for every choice of $H$.
Joint work with Michael Kapralov and Sanjeev Khanna.
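To make the query model in the abstract concrete, here is a minimal Python sketch of a basic edge-sampling estimator for the triangle case only. It is not the algorithm presented in the talk, and it makes no attempt to reach the stated running time. It only shows how the four query types (edge-sample, degree, neighbor, pair) can combine into an unbiased estimate. The function names `build_adjacency` and `estimate_triangles` are hypothetical, and the sketch assumes a simple undirected graph given as a non-empty list of edges.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Return (neighbor lists, neighbor sets) for an undirected simple graph."""
    adj_list = defaultdict(list)
    adj_set = defaultdict(set)
    for u, v in edges:
        adj_list[u].append(v)
        adj_list[v].append(u)
        adj_set[u].add(v)
        adj_set[v].add(u)
    return adj_list, adj_set


def estimate_triangles(edges, samples, seed=None):
    """Unbiased triangle-count estimate from `samples` random edge samples.

    Illustrative only: assumes `edges` is a non-empty list of distinct
    undirected edges (u, v) with u != v, and samples >= 1.
    """
    rng = random.Random(seed)
    adj_list, adj_set = build_adjacency(edges)
    m = len(edges)
    total = 0.0
    for _ in range(samples):
        u, v = rng.choice(edges)                 # edge-sample query
        if len(adj_list[u]) > len(adj_list[v]):  # two degree queries
            u, v = v, u                          # u: endpoint of smaller degree
        w = rng.choice(adj_list[u])              # neighbor query
        if w in adj_set[v]:                      # pair query: does (u, v, w) close?
            # Reweight by the inverse sampling probability 1 / (m * deg(u)).
            total += m * len(adj_list[u])
    # Each triangle is reachable from each of its three edges, so divide by 3.
    return total / (3 * samples)
```

For instance, on the complete graph with 6 vertices, which has exactly 20 triangles, `estimate_triangles` should return a value close to 20 once the number of samples is large, and it returns 0 on any triangle-free graph.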