Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization

Read original: arXiv:2405.16805 - Published 5/28/2024 by Ruizhong Qiu, Hanghang Tong

Overview

Explains the submission and formatting instructions for the International Conference on Machine Learning (ICML) 2023
Covers electronic submission, including paper formatting, LaTeX guidelines, and deadlines
Provides information on the review process and presentation requirements

Plain English Explanation

This paper outlines the guidelines and requirements for submitting papers to the International Conference on Machine Learning (ICML) 2023. The key points are:

Electronic Submission: Papers must be submitted electronically through the conference website. This includes providing metadata about the paper, such as the title, author names, and keywords.
Formatting: Papers must be formatted according to specific guidelines, including page limits, font sizes, and margin requirements. The paper should be prepared using LaTeX, with instructions provided for formatting the document.
Deadlines: There are strict deadlines for submitting papers, which are important to follow to ensure the paper is considered for the conference.
Review Process: Submitted papers will go through a peer-review process, where experts in the field will evaluate the technical merits and novelty of the research.
Presentation: Authors of accepted papers will be required to present their work at the conference, either through a talk or a poster session.

Overall, the goal of these instructions is to ensure a consistent and fair review process, as well as a high-quality conference experience for all attendees.

Technical Explanation

The submission and formatting instructions cover several key aspects of the ICML 2023 conference:

Electronic Submission: Authors must submit their papers electronically through the conference website. This involves providing metadata about the paper, such as the title, author names, and keywords. The paper itself must be uploaded as a PDF file.

Formatting: Papers must adhere to specific formatting guidelines, including a page limit of 9 pages (not including references), a font size of 10 or 11 points, and one-inch margins. The paper must be prepared using LaTeX, with instructions provided for formatting the document, including the use of specific LaTeX packages and templates.

Deadlines: There are strict deadlines for submitting papers, which vary based on the type of submission (e.g., regular papers, short papers, or abstracts). It is important for authors to be aware of and meet these deadlines to ensure their paper is considered for the conference.

Review Process: Submitted papers will go through a peer-review process, where experts in the field will evaluate the technical merits and novelty of the research. The review process is double-blind, meaning that the identities of the authors and reviewers are not revealed to each other.

Presentation: Authors of accepted papers will be required to present their work at the conference, either through a talk or a poster session. Specific guidelines for the presentation format and duration are provided.

Critical Analysis

The submission and formatting instructions provide a clear and comprehensive set of guidelines for authors submitting to ICML 2023. The strict formatting and deadline requirements are typical for major conferences in the field of machine learning and are designed to ensure a fair and consistent review process.

One potential limitation of the instructions is the reliance on LaTeX for paper preparation. While LaTeX is a widely used typesetting system in the scientific community, some authors may be more comfortable using other tools, such as Microsoft Word or Google Docs. The instructions could be improved by providing guidance on how to convert documents prepared in other formats to the required LaTeX format.

Additionally, the instructions do not provide much detail on the review process itself, such as the criteria used by reviewers or the timeline for receiving feedback. This information could be helpful for authors to better understand the expectations and plan their submission accordingly.

Overall, the submission and formatting instructions are well-organized and cover the key aspects of the ICML 2023 conference. With a clear understanding of these guidelines, authors can focus on preparing high-quality research papers that have the best chance of being accepted and presented at the prestigious ICML conference.

Conclusion

The submission and formatting instructions for the International Conference on Machine Learning (ICML) 2023 provide a detailed roadmap for authors to prepare and submit their research papers. The guidelines cover the electronic submission process, formatting requirements, deadlines, the review process, and presentation expectations.

By adhering to these instructions, authors can ensure that their papers are properly formatted and submitted on time, which improves the chances of their work being accepted and presented at the conference. The clear and comprehensive nature of the instructions reflects ICML's commitment to maintaining high standards and providing a fair evaluation of the submitted research.

Overall, the submission and formatting instructions are an essential resource for anyone interested in participating in the ICML 2023 conference and contributing to the advancement of the field of machine learning.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization

Ruizhong Qiu, Hanghang Tong

We study nonconvex zeroth-order optimization (ZOO) in a high-dimensional space $\mathbb{R}^d$ for functions with approximately $s$-sparse gradients. To reduce the dependence on the dimensionality $d$ in the query complexity, high-dimensional ZOO methods seek to leverage gradient sparsity to design gradient estimators. The previous best method needs $O\big(s\log\frac{d}{s}\big)$ queries per step to achieve $O\big(\frac{1}{T}\big)$ rate of convergence w.r.t. the number $T$ of steps. In this paper, we propose *Gradient Compressed Sensing* (GraCe), a query-efficient and accurate estimator for sparse gradients that uses only $O\big(s\log\log\frac{d}{s}\big)$ queries per step and still achieves $O\big(\frac{1}{T}\big)$ rate of convergence. To our best knowledge, we are the first to achieve a *double-logarithmic* dependence on $d$ in the query complexity under weaker assumptions. Our proposed GraCe generalizes the Indyk–Price–Woodruff (IPW) algorithm in compressed sensing from linear measurements to nonlinear functions. Furthermore, since the IPW algorithm is purely theoretical due to its impractically large constant, we improve the IPW algorithm via our *dependent random partition* technique together with our corresponding novel analysis and successfully reduce the constant by a factor of nearly 4300. Our GraCe is not only theoretically query-efficient but also achieves strong empirical performance. We benchmark our GraCe against 12 existing ZOO methods with 10000-dimensional functions and demonstrate that GraCe significantly outperforms existing methods.

5/28/2024
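
To make "queries per step" in the abstract above concrete, here is a minimal sketch of the naive baseline that sparsity-aware estimators try to beat: a coordinate-wise forward-difference estimator, which spends d + 1 function queries per gradient estimate. This is not GraCe and not the authors' code. The names `make_counted`, `coordinate_gradient`, and `sparse_quadratic` are hypothetical, and the test function is an assumed toy whose gradient has only two nonzero entries.

```python
def make_counted(f):
    """Wrap f so that every evaluation (query) is counted."""
    state = {"queries": 0}

    def counted(x):
        state["queries"] += 1
        return f(x)

    return counted, state


def coordinate_gradient(f, x, h=1e-6):
    """Forward-difference gradient estimate that uses len(x) + 1 queries."""
    base = f(x)  # one query at the current point
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h  # perturb a single coordinate
        grad.append((f(shifted) - base) / h)  # one query per coordinate
    return grad


def sparse_quadratic(x):
    # Hypothetical toy objective: depends only on coordinates 0 and 3,
    # so its gradient is 2-sparse in any dimension.
    return x[0] ** 2 + 3.0 * x[3]


d = 1000
f, counter = make_counted(sparse_quadratic)
g = coordinate_gradient(f, [1.0] * d)
nonzero = [i for i, gi in enumerate(g) if abs(gi) > 1e-3]
print(counter["queries"], nonzero)
```

The baseline pays for every coordinate even though only two of them carry information. That linear dependence on d is what the abstract's $O\big(s\log\log\frac{d}{s}\big)$ bound avoids.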

Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms

Bin Gu, Xiyuan Wei, Hualin Zhang, Yi Chang, Heng Huang

Zeroth-order (ZO) optimization is one key technique for machine learning problems where gradient calculation is expensive or impossible. Several variance reduced ZO proximal algorithms have been proposed to speed up ZO optimization for non-smooth problems, and all of them opted for the coordinated ZO estimator against the random ZO estimator when approximating the true gradient, since the former is more accurate. While the random ZO estimator introduces bigger error and makes convergence analysis more challenging compared to coordinated ZO estimator, it requires only $\mathcal{O}(1)$ computation, which is significantly less than $\mathcal{O}(d)$ computation of the coordinated ZO estimator, with $d$ being dimension of the problem space. To take advantage of the computationally efficient nature of the random ZO estimator, we first propose a ZO objective decrease (ZOOD) property which can incorporate two different types of errors in the upper bound of convergence rate. Next, we propose two generic reduction frameworks for ZO optimization which can automatically derive the convergence results for convex and non-convex problems respectively, as long as the convergence rate for the inner solver satisfies the ZOOD property. With the application of two reduction frameworks on our proposed ZOR-ProxSVRG and ZOR-ProxSAGA, two variance reduced ZO proximal algorithms with fully random ZO estimators, we improve the state-of-the-art function query complexities from $\mathcal{O}\left(\min\left\{\frac{dn^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\right\}\right)$ to $\tilde{\mathcal{O}}\left(\frac{n+d}{\epsilon^2}\right)$ under $d > n^{\frac{1}{2}}$ for non-convex problems, and from $\mathcal{O}\left(\frac{d}{\epsilon^2}\right)$ to $\tilde{\mathcal{O}}\left(n\log\frac{1}{\epsilon}+\frac{d}{\epsilon}\right)$ for convex problems.

10/4/2024
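
The trade-off described in the abstract above, where a random ZO estimator costs $\mathcal{O}(1)$ queries but is noisier, can be illustrated with a small sketch. It assumes a Gaussian random-direction estimator, a common textbook form. The abstract does not say which random estimator the authors use, so this is an illustration, not their method. The function names and the linear test objective are hypothetical.

```python
import random


def random_direction_gradient(f, x, mu=1e-4, rng=random):
    """One-sample estimate: 2 queries regardless of the dimension."""
    u = [rng.gauss(0.0, 1.0) for _ in x]  # random Gaussian direction
    shifted = [xi + mu * ui for xi, ui in zip(x, u)]
    slope = (f(shifted) - f(x)) / mu  # directional finite difference
    return [slope * ui for ui in u]


def averaged_estimate(f, x, samples, rng):
    """Average many noisy one-sample estimates to reduce the variance."""
    total = [0.0] * len(x)
    for _ in range(samples):
        g = random_direction_gradient(f, x, rng=rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / samples for t in total]


# Hypothetical linear objective whose true gradient is A everywhere.
A = [1.0, -2.0, 0.5]


def linear(x):
    return sum(a * xi for a, xi in zip(A, x))


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = averaged_estimate(linear, [0.0, 0.0, 0.0], 20000, rng)
print(estimate)
```

A single call is cheap but its output is dominated by noise. Only the average over many samples approaches the true gradient, which is why convergence analysis for random estimators has to account for the extra error.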

Gradient-Free Method for Heavily Constrained Nonconvex Optimization

Wanli Shi, Hongchang Gao, Bin Gu

Zeroth-order (ZO) method has been shown to be a powerful method for solving the optimization problem where explicit expression of the gradients is difficult or infeasible to obtain. Recently, due to the practical value of the constrained problems, a lot of ZO Frank-Wolfe or projected ZO methods have been proposed. However, in many applications, we may have a very large number of nonconvex white/black-box constraints, which makes the existing zeroth-order methods extremely inefficient (or even not working) since they need to inquire function value of all the constraints and project the solution to the complicated feasible set. In this paper, to solve the nonconvex problem with a large number of white/black-box constraints, we proposed a doubly stochastic zeroth-order gradient method (DSZOG) with momentum method and adaptive step size. Theoretically, we prove DSZOG can converge to the $\epsilon$-stationary point of the constrained problem. Experimental results in two applications demonstrate the superiority of our method in terms of training time and accuracy compared with other ZO methods for the constrained problem.

9/4/2024

Compressed Sensing: A Discrete Optimization Approach

Dimitris Bertsimas, Nicholas A. G. Johnson

We study the Compressed Sensing (CS) problem, which is the problem of finding the most sparse vector that satisfies a set of linear measurements up to some numerical tolerance. We introduce an $\ell_2$ regularized formulation of CS which we reformulate as a mixed integer second order cone program. We derive a second order cone relaxation of this problem and show that under mild conditions on the regularization parameter, the resulting relaxation is equivalent to the well studied basis pursuit denoising problem. We present a semidefinite relaxation that strengthens the second order cone relaxation and develop a custom branch-and-bound algorithm that leverages our second order cone relaxation to solve small-scale instances of CS to certifiable optimality. When compared against solutions produced by three state of the art benchmark methods on synthetic data, our numerical results show that our approach produces solutions that are on average $6.22\%$ more sparse. When compared only against the experiment-wise best performing benchmark method on synthetic data, our approach produces solutions that are on average $3.10\%$ more sparse. On real world ECG data, for a given $\ell_2$ reconstruction error our approach produces solutions that are on average $9.95\%$ more sparse than benchmark methods ($3.88\%$ more sparse if only compared against the best performing benchmark), while for a given sparsity level our approach produces solutions that have on average $10.77\%$ lower reconstruction error than benchmark methods ($1.42\%$ lower error if only compared against the best performing benchmark). When used as a component of a multi-label classification algorithm, our approach achieves greater classification accuracy than benchmark compressed sensing methods. This improved accuracy comes at the cost of an increase in computation time by several orders of magnitude.

7/15/2024