The phase diagram of kernel interpolation in large dimensions

Read original: arXiv:2404.12597 - Published 4/22/2024 by Haobo Zhang, Weihao Lu, Qian Lin

Overview

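This paper studies how well kernel interpolation generalizes in large dimensions, where the sample size $n$ grows polynomially with the dimension $d$ (i.e., $n \asymp d^{\gamma}$ for some $\gamma > 0$).
Focusing on the inner product kernel on the sphere, the authors characterize the exact order of both the variance and the bias of kernel interpolation under source conditions $s \geq 0$.
The result is an $(s, \gamma)$-phase diagram that marks the regions where kernel interpolation is minimax optimal, sub-optimal, or inconsistent.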

Plain English Explanation

Kernel interpolation is a kernel method that fits the training data exactly, with no regularization. This paper asks when such a method still predicts well on new data in large dimensions, meaning that the number of samples $n$ grows like a power of the dimension $d$ ($n \asymp d^{\gamma}$ for some $\gamma > 0$).

The authors describe this as one of the most interesting problems in the recent renaissance of kernel regression, because it may help explain the "benign overfitting phenomenon" reported in the neural networks literature, where a model fits its training data exactly and still generalizes.

The answer depends on two quantities: the exponent $\gamma$, which sets how fast the sample size grows relative to the dimension, and the source condition $s \geq 0$ imposed on the target function. The paper summarizes the outcome as a phase diagram: for each pair $(s, \gamma)$, it states whether kernel interpolation is minimax optimal, sub-optimal, or inconsistent.

Technical Explanation

The paper considers kernel interpolation with an inner product kernel on the sphere, in the large-dimensional regime $n \asymp d^{\gamma}$ for some $\gamma > 0$.

Under source conditions $s \geq 0$, the authors fully characterize the exact order of both the variance and the bias of the kernel interpolation estimator.

Combining the two orders yields the $(s, \gamma)$-phase diagram of large-dimensional kernel interpolation: the $(s, \gamma)$-plane is divided into regions where kernel interpolation is minimax optimal, sub-optimal, and inconsistent. The explicit rates and region boundaries are given in the original paper.

Critical Analysis

According to its abstract, the paper determines the exact order of both the variance and the bias of kernel interpolation, which is what allows it to draw a complete phase diagram rather than an isolated bound.

However, the results are stated for the inner product kernel on the sphere and for the polynomial regime $n \asymp d^{\gamma}$. The abstract does not say whether the same phase diagram holds for other kernels, other input domains, or other scalings of $n$ and $d$.

The link to neural networks is also tentative: the authors write that the analysis "may help us understand" benign overfitting, not that it explains it.

Overall, the paper gives a precise picture of when kernel interpolation is minimax optimal, sub-optimal, or inconsistent in the setting it studies. How far that picture carries beyond this setting is left open.

Conclusion

This paper characterizes the generalization ability of kernel interpolation in large dimensions, where $n \asymp d^{\gamma}$ for some $\gamma > 0$. For the inner product kernel on the sphere, it determines the exact order of both the variance and the bias under source conditions $s \geq 0$.

The resulting $(s, \gamma)$-phase diagram identifies where kernel interpolation is minimax optimal, sub-optimal, and inconsistent. The authors present this as a step toward understanding the benign overfitting phenomenon reported in the neural networks literature.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

The phase diagram of kernel interpolation in large dimensions

Haobo Zhang, Weihao Lu, Qian Lin

The generalization ability of kernel interpolation in large dimensions (i.e., $n \asymp d^{\gamma}$ for some $\gamma>0$) might be one of the most interesting problems in the recent renaissance of kernel regression, since it may help us understand the 'benign overfitting phenomenon' reported in the neural networks literature. Focusing on the inner product kernel on the sphere, we fully characterized the exact order of both the variance and bias of large-dimensional kernel interpolation under various source conditions $s\geq 0$. Consequently, we obtained the $(s,\gamma)$-phase diagram of large-dimensional kernel interpolation, i.e., we determined the regions in $(s,\gamma)$-plane where the kernel interpolation is minimax optimal, sub-optimal and inconsistent.

4/22/2024
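
The setting in the abstract above can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not the authors' experiment: the kernel $k(x, y) = \exp(\langle x, y \rangle)$, the linear target function, the noise level, and the values of `d` and `gamma` are all choices made here for demonstration. The sketch draws $n \approx d^{\gamma}$ points on the sphere, fits the ridgeless interpolant $\hat f(x) = k(x, X) K^{-1} y$, and checks that it reproduces the noisy training labels.

```python
import numpy as np

def sample_sphere(n, d, rng):
    """Draw n points uniformly from the unit sphere S^d in R^(d+1)."""
    x = rng.standard_normal((n, d + 1))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def inner_product_kernel(a, b):
    """Assumed illustrative inner product kernel k(x, y) = exp(<x, y>)."""
    return np.exp(a @ b.T)

def kernel_interpolant(x_train, y_train):
    """Return f(x) = k(x, X) K^{-1} y, the interpolant with no ridge term."""
    gram = inner_product_kernel(x_train, x_train)
    coef = np.linalg.solve(gram, y_train)
    return lambda x: inner_product_kernel(x, x_train) @ coef

def target(x):
    """Hypothetical target function: the first coordinate of x."""
    return x[:, 0]

rng = np.random.default_rng(0)
d, gamma = 20, 1.5            # assumed values for illustration
n = int(d ** gamma)           # sample size of order d^gamma

x_train = sample_sphere(n, d, rng)
y_train = target(x_train) + 0.1 * rng.standard_normal(n)  # noisy labels

f_hat = kernel_interpolant(x_train, y_train)

# Interpolation: the fit passes through the noisy labels up to round-off.
train_residual = np.max(np.abs(f_hat(x_train) - y_train))

# Generalization is measured against the noiseless target on fresh points.
x_test = sample_sphere(1000, d, rng)
test_mse = np.mean((f_hat(x_test) - target(x_test)) ** 2)

print(f"n = {n}, max train residual = {train_residual:.2e}, test MSE = {test_mse:.4f}")
```

Repeating the run over a grid of `gamma` values and smoother or rougher targets would give an empirical counterpart to the $(s, \gamma)$-plane, although a toy experiment at this scale should not be expected to reproduce the asymptotic regions.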

Optimal Rate of Kernel Regression in Large Dimensions

Weihao Lu, Haobo Zhang, Yicheng Li, Manyun Xu, Qian Lin

We perform a study on kernel regression for large-dimensional data (where the sample size $n$ is polynomially depending on the dimension $d$ of the samples, i.e., $n\asymp d^{\gamma}$ for some $\gamma >0$). We first build a general tool to characterize the upper bound and the minimax lower bound of kernel regression for large dimensional data through the Mendelson complexity $\varepsilon_{n}^{2}$ and the metric entropy $\bar{\varepsilon}_{n}^{2}$ respectively. When the target function falls into the RKHS associated with a (general) inner product model defined on $\mathbb{S}^{d}$, we utilize the new tool to show that the minimax rate of the excess risk of kernel regression is $n^{-1/2}$ when $n\asymp d^{\gamma}$ for $\gamma =2, 4, 6, 8, \cdots$. We then further determine the optimal rate of the excess risk of kernel regression for all the $\gamma>0$ and find that the curve of optimal rate varying along $\gamma$ exhibits several new phenomena including the multiple descent behavior and the periodic plateau behavior. As an application, for the neural tangent kernel (NTK), we also provide a similar explicit description of the curve of optimal rate. As a direct corollary, we know these claims hold for wide neural networks as well.

7/1/2024

Kernel Density Estimators in Large Dimensions

Giulio Biroli, Marc Mézard

This paper studies Kernel density estimation for a high-dimensional distribution $\rho(x)$. Traditional approaches have focused on the limit of large number of data points $n$ and fixed dimension $d$. We analyze instead the regime where both the number $n$ of data points $y_i$ and their dimensionality $d$ grow with a fixed ratio $\alpha=(\log n)/d$. Our study reveals three distinct statistical regimes for the kernel-based estimate of the density $\hat\rho_h^{\mathcal{D}}(x)=\frac{1}{n h^d}\sum_{i=1}^n K\left(\frac{x-y_i}{h}\right)$, depending on the bandwidth $h$: a classical regime for large bandwidth where the Central Limit Theorem (CLT) holds, which is akin to the one found in traditional approaches. Below a certain value of the bandwidth, $h_{CLT}(\alpha)$, we find that the CLT breaks down. The statistics of $\hat\rho_h^{\mathcal{D}}(x)$ for a fixed $x$ drawn from $\rho(x)$ is given by a heavy-tailed distribution (an alpha-stable distribution). In particular below a value $h_G(\alpha)$, we find that $\hat\rho_h^{\mathcal{D}}(x)$ is governed by extreme value statistics: only a few points in the database matter and give the dominant contribution to the density estimator. We provide a detailed analysis for high-dimensional multivariate Gaussian data. We show that the optimal bandwidth threshold based on Kullback-Leibler divergence lies in the new statistical regime identified in this paper. Our findings reveal limitations of classical approaches, show the relevance of these new statistical regimes, and offer new insights for Kernel density estimation in high-dimensional settings.

8/19/2024
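
The estimator written in the abstract above, $\hat\rho_h^{\mathcal{D}}(x)=\frac{1}{n h^d}\sum_{i=1}^n K\left(\frac{x-y_i}{h}\right)$, can be evaluated directly. The sketch below is a minimal illustration, not the authors' code: it assumes a standard Gaussian kernel $K$, works in log space because $h^d$ underflows quickly in high dimension, and uses arbitrary values of `n`, `d`, and `h`. The helper name `log_kde` and the `top_share` diagnostic (the fraction of the sum contributed by the single nearest data point) are introduced here for illustration.

```python
import numpy as np

def log_kde(x, data, h):
    """Log of (1 / (n h^d)) * sum_i K((x - y_i) / h) with a standard Gaussian K.

    Also returns the share of the sum carried by its largest term.
    """
    n, d = data.shape
    sq_dist = np.sum((data - x) ** 2, axis=1)
    # log K((x - y_i) / h) for each data point y_i
    log_terms = -0.5 * sq_dist / h**2 - 0.5 * d * np.log(2.0 * np.pi)
    # log-sum-exp, shifted by the largest term for numerical stability
    top = np.max(log_terms)
    log_sum = top + np.log(np.sum(np.exp(log_terms - top)))
    log_density = log_sum - np.log(n) - d * np.log(h)
    top_share = float(np.exp(top - log_sum))
    return log_density, top_share

rng = np.random.default_rng(0)
n, d = 2000, 50                        # assumed toy sizes
data = rng.standard_normal((n, d))     # Gaussian data points y_i
x = rng.standard_normal(d)             # query point drawn from the same law

for h in (3.0, 0.5):
    log_density, top_share = log_kde(x, data, h)
    print(f"h = {h}: log density = {log_density:.2f}, largest-term share = {top_share:.3f}")
```

For a fixed query point, the largest-term share can only grow as `h` shrinks, which gives a rough numerical feel for the statement that at small bandwidth only a few points give the dominant contribution. The thresholds $h_{CLT}(\alpha)$ and $h_G(\alpha)$ themselves are derived in the paper and are not computed here.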

Scaling and renormalization in high-dimensional regression

Alexander Atanasov, Jacob A. Zavatone-Veth, Cengiz Pehlevan

This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models using the basic tools of random matrix theory and free probability. We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning. Analytic formulas for the training and generalization errors are obtained in a few lines of algebra directly from the properties of the $S$-transform of free probability. This allows for a straightforward identification of the sources of power-law scaling in model performance. We compute the generalization error of a broad class of random feature models. We find that in all models, the $S$-transform corresponds to the train-test generalization gap, and yields an analogue of the generalized-cross-validation estimator. Using these techniques, we derive fine-grained bias-variance decompositions for a very general class of random feature models with structured covariates. These novel results allow us to discover a scaling regime for random feature models where the variance due to the features limits performance in the overparameterized setting. We also demonstrate how anisotropic weight structure in random feature models can limit performance and lead to nontrivial exponents for finite-width corrections in the overparameterized setting. Our results extend and provide a unifying perspective on earlier models of neural scaling laws.

6/27/2024