Sparse PCA: Algorithms, Adversarial Perturbations and Certificates

with Tommaso d'Orsi, Gleb Novikov, Pravesh Kothari. FOCS 2020.


(to appear)


We study efficient algorithms for Sparse PCA in standard statistical models (spiked covariance in its Wishart form). Our goal is to achieve optimal recovery guarantees while being resilient to small perturbations. Despite a long history of prior work, including explicit studies of perturbation resilience, known algorithmic guarantees for Sparse PCA are fragile and break down under small adversarial perturbations.
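As a point of reference, the spiked covariance model in its Wishart form can be sketched as follows. The parameter names (`beta` for the signal strength, `k` for the sparsity) and the choice of ±1/√k entries for the planted vector are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sample_spiked_wishart(n, d, k, beta, rng=None):
    """Draw n samples y_i ~ N(0, Id + beta * u u^T), where u is a k-sparse
    unit vector (the spiked covariance / Wishart model for Sparse PCA).
    Returns the n x d sample matrix Y and the planted direction u."""
    rng = np.random.default_rng(rng)
    # Illustrative choice: k-sparse unit vector with +-1/sqrt(k) entries
    # on a uniformly random support.
    u = np.zeros(d)
    support = rng.choice(d, size=k, replace=False)
    u[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    # Equivalent sampling: y_i = sqrt(beta) * g_i * u + z_i with
    # g_i ~ N(0, 1) and z_i ~ N(0, Id), so Cov(y_i) = Id + beta * u u^T.
    g = rng.standard_normal(n)
    Z = rng.standard_normal((n, d))
    Y = np.sqrt(beta) * np.outer(g, u) + Z
    return Y, u
```

The estimation task is to recover (the support of) `u` from `Y`; an adversary may additionally perturb each sample by a small amount.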

Our key insight is a basic connection between perturbation resilience and certifying algorithms, i.e., algorithms based on certificates of upper bounds on sparse eigenvalues of random matrices. In contrast to other techniques, such certifying algorithms, including the brute-force maximum-likelihood estimator, are automatically robust against small adversarial perturbations.
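The central quantity here is the k-sparse maximal eigenvalue of a symmetric matrix, which the brute-force estimator computes exactly by enumerating supports. A minimal sketch (function name mine; exponential in the dimension, which is exactly what the paper's efficient certificates avoid):

```python
import itertools
import numpy as np

def sparse_eigenvalue(M, k):
    """Exact k-sparse maximal eigenvalue of a symmetric matrix M:
    the maximum, over all size-k subsets S of coordinates, of the top
    eigenvalue of the principal submatrix M[S, S]. Runs in time
    exponential in d; efficient certificates give upper bounds instead."""
    d = M.shape[0]
    best = -np.inf
    for S in itertools.combinations(range(d), k):
        best = max(best, np.linalg.eigvalsh(M[np.ix_(S, S)]).max())
    return best
```

Robustness comes for free for estimators phrased in terms of this quantity: a perturbation of the input of small spectral norm changes every principal submatrix, and hence the k-sparse eigenvalue, by at most that norm.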

We use this connection to obtain the first polynomial-time algorithms for this problem that are resilient against adversarial additive perturbations, by obtaining new efficient certificates for upper bounds on sparse eigenvalues of random matrices. Our algorithms are based either on basic semidefinite programming or on its low-degree sum-of-squares strengthening, depending on the parameter regime. Their guarantees either match or approach the best known guarantees of fragile algorithms in terms of the sparsity of the unknown vector, the number of samples, and the ambient dimension.
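For concreteness, one standard semidefinite relaxation of the k-sparse eigenvalue of a symmetric matrix $A \in \mathbb{R}^{d\times d}$ (in the spirit of the basic SDP referred to above; the precise programs used in the paper may differ) is

```latex
\max_{X \in \mathbb{R}^{d\times d}} \;\; \langle A, X\rangle
\quad \text{s.t.} \quad
X \succeq 0, \qquad
\operatorname{Tr} X = 1, \qquad
\sum_{i,j} |X_{ij}| \le k .
```

For any k-sparse unit vector $x$, the matrix $X = xx^{\top}$ is feasible, since $\sum_{i,j}|x_i x_j| = (\sum_i |x_i|)^2 \le k$ by Cauchy–Schwarz; hence the optimal value of the relaxation is a certified upper bound on the k-sparse maximal eigenvalue of $A$.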

To complement our algorithmic results, we prove matching lower bounds on the gap between fragile and robust polynomial-time algorithms in a natural computational model based on low-degree polynomials (closely related to the pseudo-calibration technique for sum-of-squares lower bounds) that is known to capture the best known guarantees for related statistical estimation problems.

Beyond these issues of perturbation resilience, our analysis also leads to new algorithms for the fragile setting, whose guarantees improve over the best previous results in some parameter regimes (e.g., when the sample size is polynomially smaller than the dimension).


  • high-dimensional estimation
  • sum of squares
  • lower bounds