In this thesis we prove intractability results for some well studied problems in computational learning and combinatorial optimization.
Minimizing and learning DNF expressions. We study the problem of finding the minimum size DNF formula for a function f : {0, 1}d [special characters omitted] {0, 1} given its truth table. We show that unless NP ⊆ DTIME([special characters omitted]), there is no polynomial time algorithm that approximates this problem to within factor d1−ϵ where ϵ > 0 is an arbitrarily small constant. Our result essentially matches the known O(d) approximation for the problem.
We also study the learnability of small size DNF formulas. We show that assuming NP [special characters omitted] RP, for arbitrarily small constant ϵ > 0 and any fixed positive integer t, a two term DNF cannot be PAC-learnt in polynomial time by a t term DNF to within ½ + ϵ accuracy. Under the same complexity assumption, we show that for arbitrarily small constants μ, ϵ > 0 and any fixed positive integer t, an AND function (i.e. a single term DNF) cannot be PAC-learnt in polynomial time under adversarial μ-noise (also known as agnostic learning) by a t-CNF to within ½ + ϵ accuracy. Our results improve upon the previously known hardness for these problems [1, 30, 31].
Learning intersection of two halfspaces. We show that unless NP= RP, it is hard to PAC-learn intersection of two halfspaces in [special characters omitted] using a hypothesis which is a function of up to ℓ linear threshold functions for any integer ℓ to within accuracy of ½ + ϵ for any constant ϵ > 0. Specifically, we show that for every integer ℓ and an arbitrarily small constant ϵ > 0, unless NP = RP, no polynomial time algorithm can distinguish whether there is an intersection of two halfspaces that correctly classifies a given set of labeled points in [special characters omitted], or whether any function of ℓ linear threshold functions can correctly classify at most ½ + ϵ fraction of the points. Our result is optimal up to an arbitrarily small constant ϵ > 0, and improves upon the previous NP-hardness for this problem [1].
Reconstructing multivariate polynomials over [special characters omitted][2]. We study the polynomial reconstruction problem for low-degree multivariate polynomials over [special characters omitted][2]. In this problem, we are given a set of points xi ∈ [special characters omitted][2]n and target values ζ i ∈ [special characters omitted][2] for each of these points, with the strong promise promise that there is a linear polynomial over [special characters omitted][2] that agrees with at least 1 − ϵ fraction of the point-value pairs. Our goal is to find a degree d polynomial that has good agreement with the set of point-value pairs. We show that it is NP-hard to find a polynomial that agrees with more than 1 − 2−d + δ fraction of the pairs for any constants ϵ, δ > 0 and positive integer d. This extends the previously known hardness of approximation (or even NP-completeness) for the case when d = 1, which follows from a celebrated result of Håstad [38]. In the setting of Computational Learning, our result shows the hardness of agnostic learning of parities, where the learner is allowed a low-degree polynomial over [special characters omitted][2] as a hypothesis.
SDP integrality gaps with Local ℓ1-embeddability . We construct integrality gap instances for SDP relaxation of the M
AXIMUM
C
UT
and the S
PARSEST
C
UT
problems. If the triangle inequality constraints are added to the SDP, then the SDP vectors naturally define an
n-point negative type metric where
n is the number of vertices in the problem instance. Our gap-instances satisfy a stronger constraint that every sub-metric on
t =
O((log log log
n)[special characters omitted]) points is isometrically embeddable into ℓ
1. The local ℓ
1-embeddability constraints are implied when the basic SDP relaxation is augmented with
t rounds of the Sherali-Adams LP-relaxation [75].
For the M
AXIMUM
C
UT
problem, we obtain an optimal gap of [special characters omitted] − ϵ, where
αGW is the Goemans-Williamson constant [33] and ϵ > 0 is an arbitrarily small constant. For the S
PARSEST
C
UT
problem, we obtain a gap of Ω((log log log
n)[special characters omitted]). The latter result can be rephrased as a construction of an
n-point negative type metric such that every
t-point sub-metric is isometrically ℓ
1-embeddable, but embedding the whole metric into ℓ
1 incurs distortion Ω((log log log
n)[special characters omitted]).
Integrality gap for U
NIFORM
S
PARSEST
C
UT
. Arora, Rao and Vazirani [7] showed that the standard semi-definite programming relaxation of the U
NIFORM
S
PARSEST
C
UT
problem with the
triangle inequality constraints has an integrality gap of
O([special characters omitted]). They conjectured that the gap is bounded from above by a constant. In this study, we disprove this conjecture (referred to as the ARV-Conjecture) by constructing an Ω(log log
n) integrality gap instance. Khot and Vishnoi [54] had earlier disproved the
non-uniform version of the ARV-Conjecture.