Attention Beats Linear for Fast Implicit Neural Representation Generation

Read original: arXiv:2407.15355 - Published 7/23/2024 by Shuyi Zhang, Ke Liu, Jingjun Gu, Xiaoxu Cai, Zhihua Wang, Jiajun Bu, Haishuai Wang

Overview

The provided text is the "Author Guidelines for ECCV Submission", which outlines the requirements for submitting a paper to the European Conference on Computer Vision (ECCV).
It covers the initial submission process, formatting guidelines, and other important information for authors.

Plain English Explanation

The ECCV is a prestigious academic conference that focuses on computer vision research. The author guidelines explain the steps required to submit a paper for consideration at the conference.

Some of the key points include:

Initial Submission: Authors must follow specific formatting rules when submitting their paper, such as using a standard template and adhering to page limits.
Reviewing Process: Papers will be peer-reviewed by experts in the field, and authors may be asked to revise their work based on reviewer feedback.
Final Submission: After the reviewing process, authors of accepted papers must submit a final version that incorporates any required changes.
Presentation: Authors of accepted papers will be expected to present their work at the conference.

The guidelines aim to ensure a fair and consistent review process, as well as a high-quality conference program for attendees.

Technical Explanation

The author guidelines cover the following key elements:

Initial Submission:

Authors must use the official ECCV paper template, which specifies formatting requirements like page limits, font sizes, and margin sizes.
Submissions should be made through the conference's online submission system.
Papers must be in PDF format and cannot exceed the specified page limit.

Reviewing Process:

Papers will undergo a double-blind peer review, where the identities of authors and reviewers are concealed.
Reviewers will evaluate submissions based on criteria like technical quality, novelty, and potential impact.
Authors may be required to address reviewer comments and submit a revised version of their paper.

Final Submission:

After the review process, authors of accepted papers must submit a final version that incorporates any required changes.
Final papers must adhere to the same formatting guidelines as the initial submission.

Presentation:

Authors of accepted papers will be expected to present their work at the ECCV conference, either through an oral presentation or a poster session.
Presentation guidelines and schedules will be provided to authors of accepted papers.

Critical Analysis

The author guidelines appear to be thorough and well-designed to ensure a fair and efficient review process for the ECCV conference. The formatting requirements and submission guidelines are clear and detailed, which should help authors prepare their papers effectively.

However, the guidelines do not provide much insight into the specific criteria used by reviewers to evaluate submissions. While the general evaluation criteria (technical quality, novelty, impact) are mentioned, more detail on how these factors are assessed could be useful for authors.

Additionally, the guidelines do not address potential issues that could arise during the review process, such as conflicts of interest or bias. Providing some information on how such issues are handled could further improve the transparency and fairness of the review process.

Conclusion

The ECCV author guidelines outline the necessary steps and requirements for submitting a paper to the prestigious computer vision conference. By clearly communicating the formatting, submission, and presentation expectations, the guidelines aim to facilitate a rigorous and consistent review process that results in a high-quality program for attendees.

While the guidelines are comprehensive, additional details on the review criteria and process management could further strengthen the submission and review experience for authors. Overall, the guidelines provide a solid foundation for authors seeking to participate in the ECCV conference.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Attention Beats Linear for Fast Implicit Neural Representation Generation

Shuyi Zhang, Ke Liu, Jingjun Gu, Xiaoxu Cai, Zhihua Wang, Jiajun Bu, Haishuai Wang

Implicit Neural Representation (INR) has gained increasing popularity as a data representation method, serving as a prerequisite for innovative generation models. Unlike gradient-based methods, which exhibit lower efficiency in inference, the adoption of hyper-network for generating parameters in Multi-Layer Perceptrons (MLP), responsible for executing INR functions, has surfaced as a promising and efficient alternative. However, as a global continuous function, MLP is challenging in modeling highly discontinuous signals, resulting in slow convergence during the training phase and inaccurate reconstruction performance. Moreover, MLP requires massive representation parameters, which implies inefficiencies in data representation. In this paper, we propose a novel Attention-based Localized INR (ANR) composed of a localized attention layer (LAL) and a global MLP that integrates coordinate features with data features and converts them to meaningful outputs. Subsequently, we design an instance representation framework that delivers a transformer-like hyper-network to represent data instances as a compact representation vector. With instance-specific representation vector and instance-agnostic ANR parameters, the target signals are well reconstructed as a continuous function. We further address aliasing artifacts with variational coordinates when obtaining the super-resolution inference results. Extensive experimentation across four datasets showcases the notable efficacy of our ANR method, e.g. enhancing the PSNR value from 37.95dB to 47.25dB on the CelebA dataset. Code is released at https://github.com/Roninton/ANR.

7/23/2024
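
To put the PSNR figures quoted in the ANR abstract above in perspective, the following minimal sketch uses the standard PSNR definition, 10 * log10(MAX^2 / MSE). The helper names `psnr` and `mse_ratio_for_gain` are illustrative and do not come from the paper or its code release, and the sketch assumes signal values scaled to [0, 1].

```python
import math


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    Assumption: values are scaled so that `max_value` is the peak signal value.
    """
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(psnr_before_db, psnr_after_db):
    """Factor by which the MSE shrinks for a given PSNR gain (same peak value)."""
    return 10.0 ** ((psnr_after_db - psnr_before_db) / 10.0)


# PSNR values quoted in the abstract for the CelebA dataset.
ratio = mse_ratio_for_gain(37.95, 47.25)
print(f"MSE reduction factor: {ratio:.2f}")
```

Under this definition, the reported gain of 9.30 dB corresponds to roughly an 8.5-fold reduction in mean squared error, assuming the same peak value in both measurements.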

Conv-INR: Convolutional Implicit Neural Representation for Multimodal Visual Signals

Zhicheng Cai

Implicit neural representation (INR) has recently emerged as a promising paradigm for signal representations. Typically, INR is parameterized by a multilayer perceptron (MLP) which takes the coordinates as the inputs and generates corresponding attributes of a signal. However, MLP-based INRs face two critical issues: i) individually considering each coordinate while ignoring the connections; ii) suffering from the spectral bias thus failing to learn high-frequency components. Because target visual signals usually exhibit strong local structures and neighborhood dependencies, and high-frequency components are significant in these signals, these issues harm the representational capacity of INRs. This paper proposes Conv-INR, the first INR model fully based on convolution. Due to the inherent attributes of convolution, Conv-INR can simultaneously consider adjacent coordinates and learn high-frequency components effectively. Compared to existing MLP-based INRs, Conv-INR has better representational capacity and trainability without requiring primary function expansion. We conduct extensive experiments on four tasks, including image fitting, CT/MRI reconstruction, and novel view synthesis; in all of them, Conv-INR significantly surpasses existing MLP-based INRs, validating its effectiveness. Finally, we propose three reparameterization methods that can further enhance the performance of the vanilla Conv-INR without introducing any extra inference cost.

6/7/2024

Improved Implicit Neural Representation with Fourier Reparameterized Training

Kexuan Shi, Xingyu Zhou, Shuhang Gu

Implicit Neural Representation (INR) as a mighty representation paradigm has achieved success in various computer vision tasks recently. Due to the low-frequency bias issue of the vanilla multi-layer perceptron (MLP), existing methods have investigated advanced techniques, such as positional encoding and periodic activation functions, to improve the accuracy of INR. In this paper, we connect the network training bias with the reparameterization technique and theoretically prove that weight reparameterization could provide us a chance to alleviate the spectral bias of MLP. Based on our theoretical analysis, we propose a Fourier reparameterization method which learns a coefficient matrix of fixed Fourier bases to compose the weights of MLP. We evaluate the proposed Fourier reparameterization method on different INR tasks with various MLP architectures, including vanilla MLP, MLP with positional encoding and MLP with advanced activation function, etc. The superior approximation results on different MLP architectures clearly validate the advantage of our proposed method. Armed with our Fourier reparameterization method, better INR with more textures and fewer artifacts can be learned from the training data.

7/8/2024
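
The idea in the abstract above, composing MLP weights from a learned coefficient matrix and fixed Fourier bases, can be sketched as a matrix product W = coefficients x basis. This is a minimal sketch under assumptions, not the authors' implementation: the basis construction (cosine and sine rows sampled on a uniform grid), the frequency set, and the names `fourier_basis`, `compose_weights`, and `coefficients` are all hypothetical.

```python
import math


def fourier_basis(frequencies, in_features):
    """Build a fixed (non-trainable) basis with 2 * len(frequencies) rows.

    Assumption: each row samples cos(w * t) or sin(w * t) at `in_features`
    points t spread uniformly over [0, 2*pi). The paper may sample differently.
    """
    grid = [2.0 * math.pi * j / in_features for j in range(in_features)]
    rows = []
    for w in frequencies:
        rows.append([math.cos(w * t) for t in grid])
        rows.append([math.sin(w * t) for t in grid])
    return rows


def compose_weights(coefficients, basis):
    """Compose a layer weight matrix W = coefficients @ basis.

    `coefficients` (out_features x num_bases) is the part that would be
    trained; `basis` (num_bases x in_features) stays fixed.
    """
    num_bases = len(basis)
    in_features = len(basis[0])
    if any(len(row) != num_bases for row in coefficients):
        raise ValueError("each coefficient row needs one entry per basis row")
    return [
        [sum(row[k] * basis[k][j] for k in range(num_bases)) for j in range(in_features)]
        for row in coefficients
    ]


basis = fourier_basis(frequencies=[1, 2], in_features=8)  # 4 x 8, fixed
coefficients = [
    [1.0, 0.0, 0.0, 0.0],  # selects the cos(1 * t) row unchanged
    [0.0, 0.5, 0.5, 0.0],  # mixes sin(1 * t) and cos(2 * t)
]
weights = compose_weights(coefficients, basis)  # 2 x 8 weight matrix
```

In a training setup along these lines, only `coefficients` would receive gradient updates, while the composed `weights` are used in the usual forward pass of the MLP layer.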

Towards a Sampling Theory for Implicit Neural Representations

Mahrokh Najaf, Gregory Ongie

Implicit neural representations (INRs) have emerged as a powerful tool for solving inverse problems in computer vision and computational imaging. INRs represent images as continuous domain functions realized by a neural network taking spatial coordinates as inputs. However, unlike traditional pixel representations, little is known about the sample complexity of estimating images using INRs in the context of linear inverse problems. Towards this end, we study the sampling requirements for recovery of a continuous domain image from its low-pass Fourier coefficients by fitting a single hidden-layer INR with ReLU activation and a Fourier features layer using a generalized form of weight decay regularization. Our key insight is to relate minimizers of this non-convex parameter space optimization problem to minimizers of a convex penalty defined over an infinite-dimensional space of measures. We identify a sufficient number of samples for which an image realized by a width-1 INR is exactly recoverable by solving the INR training problem, and give a conjecture for the general width-$W$ case. To validate our theory, we empirically assess the probability of achieving exact recovery of images realized by low-width single hidden-layer INRs, and illustrate the performance of INR on super-resolution recovery of more realistic continuous domain phantom images.

5/29/2024
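
The architecture described in the abstract above, a Fourier features layer followed by a single ReLU hidden layer, can be sketched as a plain function from a 2D coordinate to an intensity. This is an illustrative sketch only: the integer frequency set, the parameter layout, and the names `fourier_features` and `shallow_inr` are assumptions rather than the authors' formulation, and neither bias terms nor training are shown.

```python
import math


def fourier_features(x, y, max_freq):
    """Map a coordinate to cos/sin features over a low-pass set of integer
    frequencies (kx, ky) with |kx|, |ky| <= max_freq (an assumed layout)."""
    feats = []
    for kx in range(-max_freq, max_freq + 1):
        for ky in range(-max_freq, max_freq + 1):
            phase = 2.0 * math.pi * (kx * x + ky * y)
            feats.append(math.cos(phase))
            feats.append(math.sin(phase))
    return feats


def shallow_inr(x, y, hidden_weights, output_weights, max_freq):
    """Evaluate sum_i a_i * relu(w_i . features(x, y)) at one coordinate."""
    feats = fourier_features(x, y, max_freq)
    value = 0.0
    for w, a in zip(hidden_weights, output_weights):
        if len(w) != len(feats):
            raise ValueError("hidden weight length must match feature count")
        pre_activation = sum(wi * fi for wi, fi in zip(w, feats))
        value += a * max(pre_activation, 0.0)  # ReLU
    return value


# A width-1 network whose single unit reads only cos(2*pi*x).
max_freq = 1
side = 2 * max_freq + 1
num_features = 2 * side * side                          # 18 features
pair_index = (1 + max_freq) * side + (0 + max_freq)     # frequency (kx=1, ky=0)
unit = [0.0] * num_features
unit[2 * pair_index] = 1.0                              # cosine slot of that pair

at_origin = shallow_inr(0.0, 0.0, [unit], [0.5], max_freq)
at_half = shallow_inr(0.5, 0.0, [unit], [0.5], max_freq)
```

Because the network is an ordinary function of `(x, y)`, it can be queried at any continuous coordinate, not only at pixel centers, which is what makes the representation "implicit".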