GRASP-GCN: Graph-Shape Prioritization for Neural Architecture Search under Distribution Shifts

Read original: arXiv:2405.06994 - Published 5/14/2024 by Sofia Casarin, Oswald Lanz, Sergio Escalera

Overview

Neural Architecture Search (NAS) methods can create networks that outperform human-designed ones
Conventional NAS methods are computationally expensive, as they need to be rerun for each new dataset
This paper focuses on improving prediction performance of predictor-based NAS algorithms when dealing with data distribution shifts

Plain English Explanation

Neural networks are powerful machine learning models that can excel at a variety of tasks. Neural Architecture Search (NAS) is a technique that can automatically design these neural networks, often producing models that outperform ones designed by human experts.

However, traditional NAS methods have a major downside: they are computationally expensive, because the entire search process needs to be rerun from scratch for each new dataset. This makes them impractical for many real-world applications.

This paper presents a new approach to address this issue. The researchers focus on a specific type of NAS algorithm called "predictor-based" algorithms. These algorithms try to predict the performance of a neural network design without fully training it, which can save a lot of time.
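The general idea behind predictor-based search can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the (depth, width) architecture encoding, and the toy scoring functions are all hypothetical stand-ins chosen only to show where the savings come from. A cheap predictor ranks every candidate, and the expensive training step runs only on a short list.

```python
from typing import Callable, List, Sequence, Tuple

Arch = Tuple[int, int]  # hypothetical encoding: (depth, width)


def predictor_guided_search(
    candidates: Sequence[Arch],
    predict_score: Callable[[Arch], float],
    train_and_evaluate: Callable[[Arch], float],
    top_k: int = 2,
) -> Tuple[Arch, float]:
    """Rank all candidates with a cheap predictor, then train only the top_k."""
    ranked = sorted(candidates, key=predict_score, reverse=True)
    shortlist = ranked[:top_k]
    results = [(arch, train_and_evaluate(arch)) for arch in shortlist]
    return max(results, key=lambda pair: pair[1])


# Toy stand-ins (assumptions, not real training code).
candidates: List[Arch] = [(2, 16), (4, 32), (8, 64), (16, 4)]
trained: List[Arch] = []  # records which candidates paid the expensive step


def toy_predictor(arch: Arch) -> float:
    return float(arch[0] * arch[1])  # stand-in for a learned predictor


def toy_train(arch: Arch) -> float:
    trained.append(arch)
    return 0.5 + 0.0005 * arch[0] * arch[1]  # stand-in for real accuracy


best_arch, best_acc = predictor_guided_search(candidates, toy_predictor, toy_train)
print(best_arch, len(trained))
```

In this toy run only two of the four candidates are ever trained. The quality of the final choice therefore depends on how well the predictor ranks architectures, which is the property the paper tries to preserve when the dataset changes.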

The key innovation in this paper is a way to improve the generalization of these predictor-based algorithms, so they can work well even when the data distribution changes between datasets. The researchers create a special "benchmark" dataset composed of neural networks trained on multiple datasets, and then train a graph convolutional network that can better predict network performance under distribution shifts.

This new "GRASP-GCN" model is able to outperform previous state-of-the-art predictors, especially when the data distribution shifts between datasets. This is an important step towards making NAS methods more practical and widely usable.

Technical Explanation

The paper proposes a new approach for improving the performance of predictor-based Neural Architecture Search (NAS) algorithms, particularly when dealing with data distribution shifts between datasets.

The researchers first create a small NAS benchmark composed of neural networks trained on four different datasets. They build it by exploiting the Kronecker product on a randomly wired search space.
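The summary does not describe how exactly the Kronecker product is applied to the search space, so the following is only a sketch of the operation itself, under the assumption that network wirings are represented as adjacency matrices. The matrices `small_graph` and `pattern` are hypothetical examples. The Kronecker product replaces every entry of the first matrix with a scaled copy of the second, so a 2-node graph combined with a 2-node pattern yields a 4-node graph.

```python
from typing import List

Matrix = List[List[int]]


def kronecker_product(a: Matrix, b: Matrix) -> Matrix:
    """Return the Kronecker product of two matrices given as nested lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [
            a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
            for j in range(len(a[0]) * cols_b)
        ]
        for i in range(len(a) * rows_b)
    ]


# Hypothetical 2-node graph: node 0 feeds node 1.
small_graph = [[0, 1],
               [0, 0]]
# Hypothetical 2-node pattern (upper triangular, with self-connections).
pattern = [[1, 1],
           [0, 1]]

large_graph = kronecker_product(small_graph, pattern)
for row in large_graph:
    print(row)
```

The result is a 4x4 adjacency matrix whose block structure mirrors `small_graph`: the only non-zero block is the upper-right one, and that block is a copy of `pattern`.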

To improve the generalization abilities of the predictor, the paper introduces GRASP-GCN, a ranking Graph Convolutional Network that takes the shape of the neural network layers as additional input. GRASP-GCN is trained using the "not-at-convergence" accuracies of the networks in the benchmark, that is, accuracies measured before training has fully converged.
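To make the two ingredients concrete (layer shapes as extra node features, and a ranking objective), here is a minimal sketch in plain Python. It is not the GRASP-GCN architecture: the single row-normalized graph convolution, the mean-pool readout, the hinge-style pairwise loss, and every name and number below are assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj: Matrix) -> Matrix:
    """Add self-loops and row-normalize: D^-1 (A + I)."""
    n = len(adj)
    loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in loops]


def gcn_layer(adj: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


def score_architecture(adj, op_features, layer_shapes, weights, readout) -> float:
    """Concatenate op features with layer shapes per node, convolve, mean-pool, score."""
    features = [ops + shape for ops, shape in zip(op_features, layer_shapes)]
    hidden = gcn_layer(adj, features, weights)
    pooled = [sum(col) / len(hidden) for col in zip(*hidden)]
    return sum(p * r for p, r in zip(pooled, readout))


def pairwise_ranking_loss(scores, accuracies, margin: float = 0.1) -> float:
    """Hinge loss over pairs: the more accurate network should score higher."""
    losses = [
        max(0.0, margin - (scores[i] - scores[j]))
        for i in range(len(scores))
        for j in range(len(scores))
        if accuracies[i] > accuracies[j]
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical 2-node network; adj[i][j] = 1 means node i aggregates from node j.
adj = [[0.0, 1.0], [0.0, 0.0]]
op_features = [[1.0, 0.0], [0.0, 1.0]]    # one-hot operation type per node
layer_shapes = [[0.5, 1.0], [1.0, 0.5]]   # e.g. normalized (channels, spatial size)
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.0], [0.0, 0.5]]  # 4 inputs -> 2 hidden
readout = [1.0, 1.0]

score = score_architecture(adj, op_features, layer_shapes, weights, readout)
loss_good = pairwise_ranking_loss([0.9, 0.2], [0.8, 0.6])  # ranking agrees
loss_bad = pairwise_ranking_loss([0.2, 0.9], [0.8, 0.6])   # ranking is inverted
print(score, loss_good, loss_bad)
```

A ranking loss only asks the predictor to order networks correctly, not to predict their exact accuracy. That is one plausible reason early, not-at-convergence accuracies can serve as training targets: the targets only need to preserve relative order.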

Experiments show that GRASP-GCN is able to outperform previous state-of-the-art predictors by 3.3% on the CIFAR-10 dataset. More importantly, it also demonstrates improved generalization abilities when dealing with data distribution shifts.
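The summary does not say which metric the 3.3% figure refers to. Ranking predictors are often evaluated with a rank correlation such as Kendall's tau, so the sketch below shows how that measure is computed. Treat the choice of metric as an assumption rather than the paper's stated protocol. The predicted scores and accuracies are made-up numbers.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(predicted)), 2):
        agreement = (predicted[i] - predicted[j]) * (actual[i] - actual[j])
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1
    n_pairs = len(predicted) * (len(predicted) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up predictor scores and measured accuracies for four networks.
predicted_scores = [0.1, 0.4, 0.3, 0.9]
true_accuracies = [0.60, 0.70, 0.72, 0.80]

tau = kendall_tau(predicted_scores, true_accuracies)
print(round(tau, 3))
```

A value of 1 means the predictor orders every pair of networks correctly, 0 means no better than chance, and -1 means the order is fully inverted. In this toy example one of the six pairs is ordered incorrectly.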

Critical Analysis

The paper presents a novel approach for improving the generalization of predictor-based NAS algorithms, which is an important step towards making these methods more practical and widely applicable.

One key limitation is that the proposed benchmark dataset is relatively small, containing only networks trained on four datasets. While this demonstrates the concept, it remains to be seen how well the approach would scale to a larger, more diverse set of datasets and network architectures.

Additionally, the paper does not provide much insight into why the proposed GRASP-GCN model is effective at dealing with distribution shifts. Further analysis of the model's inner workings and design choices would help build a deeper understanding of the approach.

It would also be valuable to see how GRASP-GCN compares to other techniques for improving the robustness of graph neural networks, such as conformal prediction or improved graph pooling. Combining these approaches could potentially lead to even stronger performance.

Overall, this paper presents a promising direction for making predictor-based NAS more practical and widely applicable. Further research building on these ideas could have significant impacts on the field of automated neural network design.

Conclusion

This paper introduces a novel approach for improving the performance of predictor-based Neural Architecture Search (NAS) algorithms, particularly when dealing with data distribution shifts.

The key contributions are:

Creating a small NAS benchmark dataset composed of networks trained on multiple datasets
Proposing GRASP-GCN, a Graph Convolutional Network that can better predict network performance under distribution shifts

Experiments show that GRASP-GCN outperforms previous state-of-the-art predictors, improving on them by 3.3% on CIFAR-10 and demonstrating better generalization under data distribution shifts.

This work represents an important step towards making NAS methods more practical and applicable in real-world scenarios where data distributions may vary. Further research building on these ideas could significantly advance the field of automated neural network design.

Related Papers

GRASP-GCN: Graph-Shape Prioritization for Neural Architecture Search under Distribution Shifts

Sofia Casarin, Oswald Lanz, Sergio Escalera

Neural Architecture Search (NAS) methods have been shown to output networks that largely outperform human-designed networks. However, conventional NAS methods have mostly tackled the single-dataset scenario, incurring a large computational cost as the procedure has to be run from scratch for every new dataset. In this work, we focus on predictor-based algorithms and propose a simple and efficient way of improving their prediction performance when dealing with data distribution shifts. We exploit the Kronecker product on the randomly wired search space and create a small NAS benchmark composed of networks trained over four different datasets. To improve the generalization abilities, we propose GRASP-GCN, a ranking Graph Convolutional Network that takes as additional input the shape of the layers of the neural networks. GRASP-GCN is trained with the not-at-convergence accuracies, and improves the state of the art by 3.3% for CIFAR-10 while also increasing the generalization abilities under data distribution shift.

5/14/2024

Causal-Aware Graph Neural Architecture Search under Distribution Shifts

Peiwen Li, Xin Wang, Zeyang Zhang, Yijian Qin, Ziwei Zhang, Jialong Wang, Yang Li, Wenwu Zhu

Graph NAS has emerged as a promising approach for autonomously designing GNN architectures by leveraging the correlations between graphs and architectures. Existing methods fail to generalize under distribution shifts that are ubiquitous in real-world graph scenarios, mainly because the graph-architecture correlations they exploit might be spurious and varying across distributions. We propose to handle the distribution shifts in the graph architecture search process by discovering and exploiting the causal relationship between graphs and architectures to search for the optimal architectures that can generalize under distribution shifts. The problem remains unexplored, with the following challenges: how to discover the causal graph-architecture relationship that has stable predictive abilities across distributions, and how to handle distribution shifts with the discovered causal graph-architecture relationship to search for generalized graph architectures. To address these challenges, we propose Causal-aware Graph Neural Architecture Search (CARNAS), which is able to capture the causal graph-architecture relationship during the architecture search process and discover the generalized graph architecture under distribution shifts. Specifically, we propose Disentangled Causal Subgraph Identification to capture the causal subgraphs that have stable prediction abilities across distributions. Then, we propose Graph Embedding Intervention to intervene on causal subgraphs within the latent space, ensuring that these subgraphs encapsulate essential features for prediction while excluding non-causal elements. Additionally, we propose Invariant Architecture Customization to reinforce the causal invariant nature of the causal subgraphs, which are utilized to tailor generalized graph architectures. Extensive experiments demonstrate that CARNAS achieves advanced out-of-distribution generalization ability.

5/28/2024

Graph is all you need? Lightweight data-agnostic neural architecture search without training

Zhenhan Huang, Tejaswini Pedapati, Pin-Yu Chen, Chunheng Jiang, Jianxi Gao

Neural architecture search (NAS) enables the automatic design of neural network models. However, training the candidates generated by the search algorithm for performance evaluation incurs considerable computational overhead. Our method, dubbed nasgraph, remarkably reduces the computational costs by converting neural architectures to graphs and using the average degree, a graph measure, as the proxy in lieu of the evaluation metric. Our training-free NAS method is data-agnostic and light-weight. It can find the best architecture among 200 randomly sampled architectures from NAS-Bench201 in 217 CPU seconds. Besides, our method is able to achieve competitive performance on various datasets including NASBench-101, NASBench-201, and NDS search spaces. We also demonstrate that nasgraph generalizes to more challenging tasks on Micro TransNAS-Bench-101.

5/3/2024

Towards Lightweight Graph Neural Network Search with Curriculum Graph Sparsification

Beini Xie, Heng Chang, Ziwei Zhang, Zeyang Zhang, Simin Wu, Xin Wang, Yuan Meng, Wenwu Zhu

Graph Neural Architecture Search (GNAS) has achieved superior performance on various graph-structured tasks. However, existing GNAS studies overlook the applications of GNAS in resource-constrained scenarios. This paper proposes to design a joint graph data and architecture mechanism, which identifies important sub-architectures via the valuable graph data. To search for optimal lightweight Graph Neural Networks (GNNs), we propose a Lightweight Graph Neural Architecture Search with Graph SparsIfication and Network Pruning (GASSIP) method. In particular, GASSIP comprises an operation-pruned architecture search module to enable efficient lightweight GNN search. Meanwhile, we design a novel curriculum graph data sparsification module with an architecture-aware edge-removing difficulty measurement to help select optimal sub-architectures. With the aid of two differentiable masks, we iteratively optimize these two modules to efficiently search for the optimal lightweight architecture. Extensive experiments on five benchmarks demonstrate the effectiveness of GASSIP. In particular, our method achieves on-par or even higher node classification performance with half or fewer model parameters in the searched GNNs and a sparser graph.

6/26/2024