Data-Efficient Learning with Neural Programs

2406.06246

Published 6/11/2024 by Alaina Solko-Breslin, Seewon Choi, Ziyang Li, Neelay Velingker, Rajeev Alur, Mayur Naik, Eric Wong

cs.LG

Abstract

Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites neural programs and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiable logic programming language, techniques from neurosymbolic learning are applicable, but in general, the learning for neural programs requires estimating the gradients of black-box components. We present an algorithm for learning neural programs, called ISED, that only relies on input-output samples of black-box components. For evaluation, we introduce new benchmarks that involve calls to modern LLMs such as GPT-4 and also consider benchmarks from the neurosymbolic learning literature. Our evaluation shows that for the latter benchmarks, ISED has comparable performance to state-of-the-art neurosymbolic frameworks. For the former, we use adaptations of prior work on gradient approximations of black-box components as a baseline, and show that ISED achieves comparable accuracy but in a more data- and sample-efficient manner.

Overview

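This paper studies "neural programs": composites of a DNN followed by a program written in a traditional programming language or an API call to an LLM.
It introduces ISED, an algorithm that learns the DNN parameters from end-to-end input-output labels and relies only on input-output samples of the black-box component.
On benchmarks from the neurosymbolic learning literature, ISED performs comparably to state-of-the-art neurosymbolic frameworks; on new benchmarks that call LLMs such as GPT-4, it matches the accuracy of black-box gradient-approximation baselines while being more data- and sample-efficient.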

Plain English Explanation

Neural networks are powerful machine learning models that can learn to perform a wide variety of tasks. However, they often require large amounts of training data to achieve good performance. In contrast, humans can often learn new concepts and skills from just a few examples.

The paper's starting point is that many tasks split naturally into two parts: a neural network that interprets the raw input, and an ordinary program (or a call to an LLM) that computes the answer from the network's output. The authors call this combination a neural program. The program part is already written, so only the network has to be learned, and the training labels describe only the final output of the whole pipeline.

For example, imagine a system that must add two numbers given as raw inputs the network has to interpret (images, for instance). In a neural program, the addition itself is ordinary code. The network only has to learn to recognize each number, and it is trained from examples that pair the two inputs with their sum, without labels for the individual numbers.
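In the abstract's framing, the addition example is a DNN followed by an ordinary program. The sketch below illustrates only that composition. `toy_dnn`, `make_neural_program`, and the ten-score "images" are hypothetical stand-ins invented for illustration, not anything from the paper, and no learning happens here.

```python
from typing import Callable, List, Sequence


def argmax(probs: Sequence[float]) -> int:
    """Index of the largest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


def toy_dnn(image: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a trained digit classifier.

    The 'image' is just ten non-negative scores, normalized into a
    probability distribution over the digits 0..9.
    """
    total = sum(image)
    return [v / total for v in image]


def add(a: int, b: int) -> int:
    """The program part: plain code that is written, not learned."""
    return a + b


def make_neural_program(dnn: Callable, program: Callable) -> Callable:
    """Compose a DNN with a program: inputs -> symbols -> program output."""
    def run(inputs: Sequence[Sequence[float]]) -> int:
        symbols = [argmax(dnn(x)) for x in inputs]
        return program(*symbols)
    return run


def fake_image(digit: int) -> List[float]:
    """Build a toy 'image' whose strongest score sits at `digit`."""
    scores = [0.1] * 10
    scores[digit] = 5.0
    return scores


sum_program = make_neural_program(toy_dnn, add)
result = sum_program([fake_image(3), fake_image(4)])
print(result)  # 7
```

A training example for this composite would be the pair of inputs together with the label 7; the individual digits 3 and 4 would never be given.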

The authors evaluate ISED, their learning algorithm, on two groups of benchmarks: new ones whose program part calls a modern LLM such as GPT-4, and existing ones from the neurosymbolic learning literature. According to the abstract, ISED is comparable to state-of-the-art neurosymbolic frameworks on the existing benchmarks, and on the new ones it reaches accuracy comparable to gradient-approximation baselines in a more data- and sample-efficient manner.

Technical Explanation

A neural program, as the paper defines it, is a DNN followed by a program written in a traditional programming language or an API call to an LLM. The learning problem is to fit the DNN parameters when the training data consist only of end-to-end input-output labels for the composite, with no separate supervision for the DNN's intermediate outputs.
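One way to write down the learning problem stated in the abstract, using notation that is an assumption of this summary and not taken from the paper ($M_\theta$ for the DNN, $P$ for the program, $\ell$ for a loss on final outputs):

```latex
% Assumed notation: M_\theta is the DNN, P the (possibly black-box) program,
% \ell a loss on final outputs, and (x_i, y_i) end-to-end labeled examples.
\hat{y}_i = P\bigl(M_\theta(x_i)\bigr),
\qquad
\theta^{*} = \arg\min_{\theta} \sum_{i=1}^{N} \ell\bigl(\hat{y}_i,\, y_i\bigr)
```

Applying the chain rule to this objective would need the derivative of $P$ with respect to its input, which is unavailable when $P$ is ordinary code or an LLM call. That is the gradient the abstract says must be estimated.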

When the program is written in a differentiable logic programming language, existing neurosymbolic learning techniques apply. In general, however, the program is a black box, so training requires estimating gradients through it. ISED, the algorithm the paper proposes, does this using only input-output samples of the black-box component.
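The abstract notes that, for a black-box program, gradients have to be estimated, and that the baselines adapt prior gradient-approximation methods. The sketch below shows one generic estimator of that kind, the score-function (REINFORCE) estimator. It is not ISED, whose details this summary does not describe. The "DNN" is reduced to a bare vector of logits over five symbols, and `black_box_program`, the target, and the hyperparameters are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def black_box_program(digit):
    """Opaque to the learner: only its inputs and outputs are observed."""
    return digit * digit + 1


def train(logits, target, steps=500, lr=1.0, seed=0):
    """Score-function updates using only samples of the black box.

    Each step samples a symbol from the current distribution, runs the
    program on it, and rewards the sample if the output matches the
    end-to-end label. The gradient of log p(sample) with respect to the
    logits is onehot(sample) - probs.
    """
    rng = random.Random(seed)
    logits = list(logits)
    n = len(logits)
    for _ in range(steps):
        probs = softmax(logits)
        sample = rng.choices(range(n), weights=probs)[0]
        reward = 1.0 if black_box_program(sample) == target else 0.0
        for k in range(n):
            grad_log_p = (1.0 if k == sample else 0.0) - probs[k]
            logits[k] += lr * reward * grad_log_p
    return logits


initial_logits = [0.0] * 5                         # uniform over symbols 0..4
trained_logits = train(initial_logits, target=10)  # only symbol 3 maps to 10

p_before = softmax(initial_logits)[3]
p_after = softmax(trained_logits)[3]
print(f"p(symbol=3) before: {p_before:.3f}, after: {p_after:.3f}")
```

With a 0/1 reward, an update happens only when a sampled symbol reproduces the label, so many samples are wasted early on. This is the sample-efficiency concern the abstract raises about such baselines.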

Experiments cover benchmarks from the neurosymbolic learning literature and new benchmarks that involve calls to LLMs such as GPT-4. On the former, ISED performs comparably to state-of-the-art neurosymbolic frameworks. On the latter, the baseline is an adaptation of prior work on gradient approximation for black-box components, and ISED reaches comparable accuracy in a more data- and sample-efficient manner.

Critical Analysis

The authors acknowledge several limitations and areas for future work. One key challenge is scaling Neural Programs to handle more complex programs and tasks. The current architecture may struggle with highly nested or recursive program structures.

Additionally, the paper does not extensively explore the interpretability and explainability of the learned Neural Programs. While the modular structure offers potential benefits, more work is needed to ensure the programs are transparent and understandable to human users.

Overall, this paper represents an exciting step towards more data-efficient and interpretable machine learning. By combining the strengths of neural networks and programs, Neural Programs offer a promising path forward. However, significant research is still needed to fully realize the potential of this approach.

Conclusion

This paper introduces neural programs, composites of a DNN and a program written in a traditional programming language or an API call to an LLM, together with ISED, an algorithm for learning the DNN parameters from end-to-end input-output labels.

ISED relies only on input-output samples of the black-box component, so it does not require the program to be differentiable. The evaluation reports performance comparable to state-of-the-art neurosymbolic frameworks on existing benchmarks, and accuracy comparable to gradient-approximation baselines, with better data and sample efficiency, on new LLM-based benchmarks.

While the paper highlights several promising directions, more research is needed to address the challenges of scaling Neural Programs to handle more complex tasks and improving the interpretability of the learned program structures. Nevertheless, this work represents an important step towards the development of more data-efficient and interpretable machine learning systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Cross-Modality Program Representation Learning for Electronic Design Automation with High-Level Synthesis

Zongyue Qin, Yunsheng Bai, Atefeh Sohrabizadeh, Zijian Ding, Ziniu Hu, Yizhou Sun, Jason Cong

In recent years, domain-specific accelerators (DSAs) have gained popularity for applications such as deep learning and autonomous driving. To facilitate DSA designs, programmers use high-level synthesis (HLS) to compile a high-level description written in C/C++ into a design with low-level hardware description languages that eventually synthesize DSAs on circuits. However, creating a high-quality HLS design still demands significant domain knowledge, particularly in microarchitecture decisions expressed as pragmas. Thus, it is desirable to automate such decisions with the help of machine learning for predicting the quality of HLS designs, requiring a deeper understanding of the program that consists of original code and pragmas. Naturally, these programs can be considered as sequence data. In addition, these programs can be compiled and converted into a control data flow graph (CDFG). But existing works either fail to leverage both modalities or combine the two in shallow or coarse ways. We propose ProgSG, a model that allows interaction between the source code sequence modality and the graph modality in a deep and fine-grained way. To alleviate the scarcity of labeled designs, a pre-training method is proposed based on a suite of compiler's data flow analysis tasks. Experimental results show that ProgSG reduces the RMSE of design performance predictions by up to 22%, and identifies designs with an average of 1.10× and 1.26× (up to 8.17× and 13.31×) performance improvement in design space exploration (DSE) task compared to HARP and AutoDSE, respectively.

6/17/2024

cs.LG cs.AI cs.AR

HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis

Shraddha Barke, Emmanuel Anaya Gonzalez, Saketh Ram Kasibatla, Taylor Berg-Kirkpatrick, Nadia Polikarpova

Many structured prediction and reasoning tasks can be framed as program synthesis problems, where the goal is to generate a program in a domain-specific language (DSL) that transforms input data into the desired output. Unfortunately, purely neural approaches, such as large language models (LLMs), often fail to produce fully correct programs in unfamiliar DSLs, while purely symbolic methods based on combinatorial search scale poorly to complex problems. Motivated by these limitations, we introduce a hybrid approach, where LLM completions for a given task are used to learn a task-specific, context-free surrogate model, which is then used to guide program synthesis. We evaluate this hybrid approach on three domains, and show that it outperforms both unguided search and direct sampling from LLMs, as well as existing program synthesizers.

5/28/2024

cs.PL cs.AI

Learning to Infer Generative Template Programs for Visual Concepts

R. Kenny Jones, Siddhartha Chaudhuri, Daniel Ritchie

People grasp flexible visual concepts from a few examples. We explore a neurosymbolic system that learns how to infer programs that capture visual concepts in a domain-general fashion. We introduce Template Programs: programmatic expressions from a domain-specific language that specify structural and parametric patterns common to an input concept. Our framework supports multiple concept-related tasks, including few-shot generation and co-segmentation through parsing. We develop a learning paradigm that allows us to train networks that infer Template Programs directly from visual datasets that contain concept groupings. We run experiments across multiple visual domains: 2D layouts, Omniglot characters, and 3D shapes. We find that our method outperforms task-specific alternatives, and performs competitively against domain-specific approaches for the limited domains where they exist.

6/11/2024

cs.CV cs.AI cs.GR cs.LG

Imperative Learning: A Self-supervised Neural-Symbolic Learning Framework for Robot Autonomy

Chen Wang, Kaiyi Ji, Junyi Geng, Zhongqiang Ren, Taimeng Fu, Fan Yang, Yifan Guo, Haonan He, Xiangyu Chen, Zitong Zhan, Qiwei Du, Shaoshu Su, Bowen Li, Yuheng Qiu, Yi Du, Qihang Li, Yifan Yang, Xiao Lin, Zhipeng Zhao

Data-driven methods such as reinforcement and imitation learning have achieved remarkable success in robot autonomy. However, their data-centric nature still hinders them from generalizing well to ever-changing environments. Moreover, collecting large datasets for robotic tasks is often impractical and expensive. To overcome these challenges, we introduce a new self-supervised neural-symbolic (NeSy) computational framework, imperative learning (IL), for robot autonomy, leveraging the generalization abilities of symbolic reasoning. The framework of IL consists of three primary components: a neural module, a reasoning engine, and a memory system. We formulate IL as a special bilevel optimization (BLO), which enables reciprocal learning over the three modules. This overcomes the label-intensive obstacles associated with data-driven approaches and takes advantage of symbolic reasoning concerning logical reasoning, physical principles, geometric analysis, etc. We discuss several optimization techniques for IL and verify their effectiveness in five distinct robot autonomy tasks including path planning, rule induction, optimal control, visual odometry, and multi-robot routing. Through various experiments, we show that IL can significantly enhance robot autonomy capabilities and we anticipate that it will catalyze further research across diverse domains.

6/26/2024

cs.RO cs.AI cs.CV cs.LG