A Quantum Approximation Scheme for k-Means

2308.08167

Published 5/27/2024 by Ragesh Jaiswal

🌿

Abstract

We give a quantum approximation scheme (i.e., $(1 + varepsilon)$-approximation for every $varepsilon > 0$) for the classical $k$-means clustering problem in the QRAM model with a running time that has only polylogarithmic dependence on the number of data points. More specifically, given a dataset $V$ with $N$ points in $mathbb{R}^d$ stored in QRAM data structure, our quantum algorithm runs in time $tilde{O} left( 2^{tilde{O}(frac{k}{varepsilon})} eta^2 dright)$ and with high probability outputs a set $C$ of $k$ centers such that $cost(V, C) leq (1+varepsilon) cdot cost(V, C_{OPT})$. Here $C_{OPT}$ denotes the optimal $k$-centers, $cost(.)$ denotes the standard $k$-means cost function (i.e., the sum of the squared distance of points to the closest center), and $eta$ is the aspect ratio (i.e., the ratio of maximum distance to minimum distance). This is the first quantum algorithm with a polylogarithmic running time that gives a provable approximation guarantee of $(1+varepsilon)$ for the $k$-means problem. Also, unlike previous works on unsupervised learning, our quantum algorithm does not require quantum linear algebra subroutines and has a running time independent of parameters (e.g., condition number) that appear in such procedures.

Create account to get full access

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

🌿

QUACK: Quantum Aligned Centroid Kernel

Kilian Tscharke, Sebastian Issel, Pascal Debus

Quantum computing (QC) seems to show potential for application in machine learning (ML). In particular quantum kernel methods (QKM) exhibit promising properties for use in supervised ML tasks. However, a major disadvantage of kernel methods is their unfavorable quadratic scaling with the number of training samples. Together with the limits imposed by currently available quantum hardware (NISQ devices) with their low qubit coherence times, small number of qubits, and high error rates, the use of QC in ML at an industrially relevant scale is currently impossible. As a small step in improving the potential applications of QKMs, we introduce QUACK, a quantum kernel algorithm whose time complexity scales linear with the number of samples during training, and independent of the number of training samples in the inference stage. In the training process, only the kernel entries for the samples and the centers of the classes are calculated, i.e. the maximum shape of the kernel for n samples and c classes is (n, c). During training, the parameters of the quantum kernel and the positions of the centroids are optimized iteratively. In the inference stage, for every new sample the circuit is only evaluated for every centroid, i.e. c times. We show that the QUACK algorithm nevertheless provides satisfactory results and can perform at a similar level as classical kernel methods with quadratic scaling during training. In addition, our (simulated) algorithm is able to handle high-dimensional datasets such as MNIST with 784 features without any dimensionality reduction.

5/2/2024

cs.LG

🛠️

Quantum Algorithms and Lower Bounds for Finite-Sum Optimization

Yexin Zhang, Chenyi Zhang, Cong Fang, Liwei Wang, Tongyang Li

Finite-sum optimization has wide applications in machine learning, covering important problems such as support vector machines, regression, etc. In this paper, we initiate the study of solving finite-sum optimization problems by quantum computing. Specifically, let $f_1,ldots,f_ncolonmathbb{R}^dtomathbb{R}$ be $ell$-smooth convex functions and $psicolonmathbb{R}^dtomathbb{R}$ be a $mu$-strongly convex proximal function. The goal is to find an $epsilon$-optimal point for $F(mathbf{x})=frac{1}{n}sum_{i=1}^n f_i(mathbf{x})+psi(mathbf{x})$. We give a quantum algorithm with complexity $tilde{O}big(n+sqrt{d}+sqrt{ell/mu}big(n^{1/3}d^{1/3}+n^{-2/3}d^{5/6}big)big)$, improving the classical tight bound $tilde{Theta}big(n+sqrt{nell/mu}big)$. We also prove a quantum lower bound $tilde{Omega}(n+n^{3/4}(ell/mu)^{1/4})$ when $d$ is large enough. Both our quantum upper and lower bounds can extend to the cases where $psi$ is not necessarily strongly convex, or each $f_i$ is Lipschitz but not necessarily smooth. In addition, when $F$ is nonconvex, our quantum algorithm can find an $epsilon$-critial point using $tilde{O}(n+ell(d^{1/3}n^{1/3}+sqrt{d})/epsilon^2)$ queries.

6/6/2024

cs.DS cs.LG

Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

Jan-Nico Zaech, Martin Danelljan, Tolga Birdal, Luc Van Gool

Adiabatic quantum computing (AQC) is a promising approach for discrete and often NP-hard optimization problems. Current AQCs allow to implement problems of research interest, which has sparked the development of quantum representations for many computer vision tasks. Despite requiring multiple measurements from the noisy AQC, current approaches only utilize the best measurement, discarding information contained in the remaining ones. In this work, we explore the potential of using this information for probabilistic balanced k-means clustering. Instead of discarding non-optimal solutions, we propose to use them to compute calibrated posterior probabilities with little additional compute cost. This allows us to identify ambiguous solutions and data points, which we demonstrate on a D-Wave AQC on synthetic tasks and real visual data.

5/2/2024

cs.LG cs.AI cs.CV

📈

Simple algorithms to test and learn local Hamiltonians

Francisco Escudero Guti'errez

We consider the problems of testing and learning an $n$-qubit $k$-local Hamiltonian from queries to its evolution operator with respect the 2-norm of the Pauli spectrum, or equivalently, the normalized Frobenius norm. For testing whether a Hamiltonian is $epsilon_1$-close to $k$-local or $epsilon_2$-far from $k$-local, we show that $O(1/(epsilon_2-epsilon_1)^{8})$ queries suffice. This solves two questions posed in a recent work by Bluhm, Caro and Oufkir. For learning up to error $epsilon$, we show that $exp(O(k^2+klog(1/epsilon)))$ queries suffice. Our proofs are simple, concise and based on Pauli-analytic techniques.

4/10/2024

cs.CC cs.DS cs.IT cs.LG