Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

Read original: arXiv:2407.00121 - Published 7/2/2024 by Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Sadhana Kumaravel, Matthew Stallone, Rameswar Panda, Yara Rizk, GP Bhargav, Maxwell Crouse, Chulaka Gunasekara and 16 others

Overview

This paper introduces GRANITE-20B-FUNCTIONCALLING, an open model released under an Apache 2.0 license that learns function calling through multi-task training on granular tasks.
Function calling here means identifying, calling, and interacting with external tools and application program interfaces (APIs), which lets large language models (LLMs) go beyond text generation and act as agents.
The paper evaluates the model on multiple out-of-domain datasets against more than 15 proprietary and open models. It reports the best performance among open models on the Berkeley Function Calling Leaderboard and fourth place overall.

Plain English Explanation

The researchers have developed a machine learning model, GRANITE-20B-FUNCTIONCALLING, that can perform function calling. Function calling is the ability of a language model to pick an external tool or API, such as a calculator, a Python interpreter, or a database lookup, and produce a correctly formed call to it in order to accomplish a task.

Large language models (LLMs) are good at generating human-like text, but open models have lagged behind proprietary ones such as GPT, Claude, and Gemini at function calling. The Granite model addresses this with multi-task learning: it is trained on a variety of smaller, more granular tasks that together make up function calling.

By breaking function calling into smaller components and training on each, the researchers teach the model to call the right tools with the right arguments. Given a user request and a set of available functions, the model can decide which functions are needed, fill in their parameters, and turn the results into a response.
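
To make the idea concrete, here is a minimal sketch of what a function call looks like in practice. It assumes the model emits its call as a JSON object with "name" and "arguments" fields. That format, the get_weather tool, and the dispatch helper are illustrative assumptions, not details taken from the paper.

```python
import json

# Hypothetical tool: a stand-in for any external function or API.
def get_weather(city: str, unit: str = "celsius") -> str:
    return f"Weather for {city} in {unit}"

# Registry mapping tool names to callables (names are illustrative).
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse a JSON function call emitted by a model and execute it."""
    call = json.loads(model_output)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

# A function call as a model might emit it (the format is an assumption).
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch(model_output)
print(result)
```

The model only produces the text of the call. The surrounding application is what looks up the tool and runs it.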

The paper's experiments show the model outperforming other open models on function calling benchmarks. This suggests it could be useful in agentic systems, where a model needs current or domain-specific information from databases and knowledge sources, or needs to hand off work that a tool performs more reliably.

Technical Explanation

GRANITE-20B-FUNCTIONCALLING enables function calling in a large language model (LLM) through multi-task learning of granular tasks. This builds on previous work on LLM-based function calling and LLM-as-compiler techniques.

The key idea is to break function calling into seven granular tasks: Nested Function Calling, Function Chaining, Parallel Functions, Function Name Detection, Parameter-Value Pair Detection, Next-Best Function, and Response Generation. The model is trained on all of them together in a multi-task setup, so that it learns both the individual skills, such as naming the right function or extracting its parameter values, and how they combine.
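
One way to picture multi-task training on granular tasks is to derive several training examples from a single annotated sample, each with its own instruction and target. The sketch below is an assumption about how such data could be laid out, not the paper's actual format: the instruction wording, the field names, and the full_function_call task label are hypothetical, while Function Name Detection and Parameter-Value Pair Detection are two of the tasks named in the paper's abstract.

```python
import json
from typing import Dict, List


def build_task_examples(
    query: str, tools: List[dict], gold_calls: List[dict]
) -> List[Dict[str, str]]:
    """Derive several granular training examples from one annotated sample."""
    # Shared context: the serialized tool specifications plus the user query.
    context = f"Tools: {json.dumps(tools, sort_keys=True)}\nQuery: {query}"

    names = [call["name"] for call in gold_calls]
    arguments = [call["arguments"] for call in gold_calls]

    return [
        {   # Target contains only the function names.
            "task": "function_name_detection",
            "input": context + "\nInstruction: list the function names to call.",
            "target": json.dumps(names),
        },
        {   # Target contains only the parameter-value pairs.
            "task": "parameter_value_pair_detection",
            "input": context + "\nInstruction: list the parameter-value pairs.",
            "target": json.dumps(arguments, sort_keys=True),
        },
        {   # Target contains the complete calls (hypothetical task label).
            "task": "full_function_call",
            "input": context + "\nInstruction: produce the complete function calls.",
            "target": json.dumps(gold_calls, sort_keys=True),
        },
    ]


tools = [{"name": "get_weather", "parameters": {"city": "string"}}]
gold = [{"name": "get_weather", "arguments": {"city": "Paris"}}]
examples = build_task_examples("What is the weather in Paris?", tools, gold)
for example in examples:
    print(example["task"], "->", example["target"])
```

Mixing examples like these in one training set lets a single model see the same annotation from several angles, which is the general intuition behind multi-task learning.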

The evaluation covers seven out-of-domain datasets and compares the model against more than 15 proprietary and open models. The model ranks first among open models and fourth overall on the Berkeley Function Calling Leaderboard, and the authors attribute its generalizability to the diversity of tasks and datasets used in training.
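
Evaluating function calling usually comes down to comparing a predicted call with a reference call. The sketch below shows one simple way to do that, scoring the function name and the parameter-value pairs separately. It is an illustrative scorer under assumed data shapes, not the metric used in the paper or on the Berkeley Function Calling Leaderboard.

```python
import json
from typing import Dict, Set, Tuple


def _arg_pairs(call: dict) -> Set[Tuple[str, str]]:
    """Turn a call's arguments into a set of comparable (name, value) pairs."""
    return {
        (key, json.dumps(value, sort_keys=True))
        for key, value in call.get("arguments", {}).items()
    }


def score_call(pred: dict, gold: dict) -> Dict[str, object]:
    """Score one predicted call against one reference call."""
    pred_pairs, gold_pairs = _arg_pairs(pred), _arg_pairs(gold)
    matched = len(pred_pairs & gold_pairs)
    precision = matched / len(pred_pairs) if pred_pairs else 0.0
    recall = matched / len(gold_pairs) if gold_pairs else 0.0

    if not pred_pairs and not gold_pairs:
        f1 = 1.0  # Both calls take no arguments, so they agree trivially.
    elif precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)

    name_match = pred.get("name") == gold.get("name")
    return {
        "name_match": name_match,
        "param_f1": f1,
        "exact_match": name_match and pred_pairs == gold_pairs,
    }


gold_call = {"name": "get_weather", "arguments": {"city": "Paris", "unit": "celsius"}}
pred_call = {"name": "get_weather", "arguments": {"city": "Paris", "unit": "kelvin"}}
scores = score_call(pred_call, gold_call)
print(scores)
```

Separating the name score from the parameter score makes it visible whether a model picked the wrong tool or picked the right tool with wrong arguments.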

No separate execution component is described in the paper's abstract: the model generates the function calls, and the external tools or APIs, e.g., a Python interpreter or calculator, carry them out.

Critical Analysis

The Granite-Function Calling Model is a meaningful step toward open LLMs that do more than generate text. By teaching the model to perform function calling reliably, the researchers make it a stronger candidate for agentic systems that depend on external tools and APIs.

However, some limitations and open questions remain. The model's performance likely depends on the quality and coverage of its training data, and it remains to be seen how well it generalizes to more complex or domain-specific tools and APIs.

Additionally, the paper does not explore the potential security implications of an LLM-based system that can invoke external tools and APIs, including code interpreters. A model could be induced to issue harmful or unintended calls, so these risks would need to be carefully considered before deployment.
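
One common mitigation for this kind of risk is to validate every generated call against an allow-list before anything is executed. The sketch below illustrates the idea under stated assumptions: the ALLOWED table, the get_weather entry, and the validate_call helper are hypothetical, and the paper does not describe such a safeguard.

```python
from typing import Dict, List, Set

# Hypothetical allow-list: permitted functions and their argument names.
ALLOWED: Dict[str, Dict[str, Set[str]]] = {
    "get_weather": {"required": {"city"}, "optional": {"unit"}},
}


def validate_call(call: dict, allowed: Dict[str, Dict[str, Set[str]]] = ALLOWED) -> List[str]:
    """Return a list of problems with a generated call; empty means it passed."""
    name = call.get("name")
    if name not in allowed:
        return [f"function not allowed: {name}"]

    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return ["arguments must be an object"]

    spec = allowed[name]
    provided = set(args)
    missing = spec["required"] - provided
    unknown = provided - spec["required"] - spec["optional"]

    errors = [f"missing argument: {arg}" for arg in sorted(missing)]
    errors += [f"unknown argument: {arg}" for arg in sorted(unknown)]
    return errors


ok = validate_call({"name": "get_weather", "arguments": {"city": "Paris"}})
blocked = validate_call({"name": "delete_files", "arguments": {"path": "/"}})
print(ok, blocked)
```

A check like this only confirms that a call is well formed and permitted. It does not judge whether the call is a good idea, so it complements rather than replaces human review and sandboxing.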

Further research could also investigate ways to make the Granite-Function Calling Model more interpretable and transparent, allowing developers and users to better understand the model's decision-making process and trust its outputs.

Conclusion

The Granite-Function Calling Model shows that an open LLM can go beyond text generation and reliably call external tools and APIs. By using multi-task learning to teach granular function calling skills, the researchers have produced an openly licensed model that can serve as the backbone of agentic systems.

While the model has shown promising results, there are still areas for further research and development, particularly around addressing potential security concerns and improving the model's interpretability. Overall, the Granite-Function Calling Model is a compelling step forward in the ongoing efforts to expand the capabilities of large language models and unlock their full potential for real-world applications.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Sadhana Kumaravel, Matthew Stallone, Rameswar Panda, Yara Rizk, GP Bhargav, Maxwell Crouse, Chulaka Gunasekara, Shajith Ikbal, Sachin Joshi, Hima Karanam, Vineet Kumar, Asim Munawar, Sumit Neelam, Dinesh Raghu, Udit Sharma, Adriana Meza Soria, Dheeraj Sreedhar, Praveen Venkateswaran, Merve Unuvar, David Cox, Salim Roukos, Luis Lastras, Pavan Kapanipathi

Large language models (LLMs) have recently shown tremendous promise in serving as the backbone to agentic systems, as demonstrated by their performance in multi-faceted, challenging benchmarks like SWE-Bench and Agent-Bench. However, to realize the true potential of LLMs as autonomous agents, they must learn to identify, call, and interact with external tools and application program interfaces (APIs) to complete complex tasks. These tasks together are termed function calling. Endowing LLMs with function calling abilities leads to a myriad of advantages, such as access to current and domain-specific information in databases and knowledge sources, and the ability to outsource tasks that can be reliably performed by tools, e.g., a Python interpreter or calculator. While there has been significant progress in function calling with LLMs, there is still a dearth of open models that perform on par with proprietary LLMs like GPT, Claude, and Gemini. Therefore, in this work, we introduce the GRANITE-20B-FUNCTIONCALLING model under an Apache 2.0 license. The model is trained using a multi-task training approach on seven fundamental tasks encompassed in function calling, those being Nested Function Calling, Function Chaining, Parallel Functions, Function Name Detection, Parameter-Value Pair Detection, Next-Best Function, and Response Generation. We present a comprehensive evaluation on multiple out-of-domain datasets comparing GRANITE-20B-FUNCTIONCALLING to more than 15 other best proprietary and open models. GRANITE-20B-FUNCTIONCALLING provides the best performance among all open models on the Berkeley Function Calling Leaderboard and fourth overall. As a result of the diverse tasks and datasets used for training our model, we show that GRANITE-20B-FUNCTIONCALLING has better generalizability on multiple tasks in seven different evaluation datasets.

7/2/2024

TinyAgent: Function Calling at the Edge

Lutfi Eren Erdogan, Nicholas Lee, Siddharth Jha, Sehoon Kim, Ryan Tabrizi, Suhong Moon, Coleman Hooper, Gopala Anumanchipalli, Kurt Keutzer, Amir Gholami

Recent large language models (LLMs) have enabled the development of advanced agentic systems that can integrate various tools and APIs to fulfill user queries through function calling. However, the deployment of these LLMs on the edge has not been explored since they typically require cloud-based infrastructure due to their substantial model size and computational demands. To this end, we present TinyAgent, an end-to-end framework for training and deploying task-specific small language model agents capable of function calling for driving agentic systems at the edge. We first show how to enable accurate function calling for open-source models via the LLMCompiler framework. We then systematically curate a high-quality dataset for function calling, which we use to fine-tune two small language models, TinyAgent-1.1B and 7B. For efficient inference, we introduce a novel tool retrieval method to reduce the input prompt length and utilize quantization to further accelerate the inference speed. As a driving application, we demonstrate a local Siri-like system for Apple's MacBook that can execute user commands through text or voice input. Our results show that our models can achieve, and even surpass, the function-calling capabilities of larger models like GPT-4-Turbo, while being fully deployed at the edge. We open-source our dataset, models, and installable package and provide a demo video for our MacBook assistant agent.

9/4/2024

Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Yi Zhou, Chris Johnson, Aanchal Goyal, Hima Patel, Yousaf Shah, Petros Zerfos, Heiko Ludwig, Asim Munawar, Maxwell Crouse, Pavan Kapanipathi, Shweta Salaria, Bob Calio, Sophia Wen, Seetharami Seelam, Brian Belgodere, Carlos Fonseca, Amith Singhee, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda

Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabilities, including code generation, fixing bugs, explaining and documenting code, maintaining repositories, and more. In this work, we introduce the Granite series of decoder-only code models for code generative tasks, trained with code written in 116 programming languages. The Granite Code models family consists of models ranging in size from 3 to 34 billion parameters, suitable for applications ranging from complex application modernization tasks to on-device memory-constrained use cases. Evaluation on a comprehensive set of tasks demonstrates that Granite Code models consistently reaches state-of-the-art performance among available open-source code LLMs. The Granite Code model family was optimized for enterprise software development workflows and performs well across a range of coding tasks (e.g. code generation, fixing and explanation), making it a versatile all around code model. We release all our Granite Code models under an Apache 2.0 license for both research and commercial use.

5/8/2024

ToolACE: Winning the Points of LLM Function Calling

Weiwen Liu, Xu Huang, Xingshan Zeng, Xinlong Hao, Shuai Yu, Dexun Li, Shuai Wang, Weinan Gan, Zhengying Liu, Yuanqing Yu, Zezhong Wang, Yuxian Wang, Wu Ning, Yutai Hou, Bin Wang, Chuhan Wu, Xinzhi Wang, Yong Liu, Yasheng Wang, Duyu Tang, Dandan Tu, Lifeng Shang, Xin Jiang, Ruiming Tang, Defu Lian, Qun Liu, Enhong Chen

Function calling significantly extends the application boundary of large language models, where high-quality and diverse training data is critical for unlocking this capability. However, real function-calling data is quite challenging to collect and annotate, while synthetic data generated by existing pipelines tends to lack coverage and accuracy. In this paper, we present ToolACE, an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data. ToolACE leverages a novel self-evolution synthesis process to curate a comprehensive API pool of 26,507 diverse APIs. Dialogs are further generated through the interplay among multiple agents, guided by a formalized thinking process. To ensure data accuracy, we implement a dual-layer verification system combining rule-based and model-based checks. We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard, rivaling the latest GPT-4 models. Our model and a subset of the data are publicly available at https://huggingface.co/Team-ACE.

9/4/2024