Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Read original: arXiv:2405.04324 - Published 5/8/2024 by Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh and 36 others


Overview

Large language models (LLMs) trained on code are revolutionizing software development
Code LLMs are being integrated into development environments to boost human programmer productivity
LLM-based agents are showing promise for handling complex coding tasks autonomously
Realizing the full potential of code LLMs requires a wide range of capabilities like code generation, bug fixing, code explanation, and more

Plain English Explanation

Large language models (LLMs) are artificial intelligence systems that have been trained on massive amounts of text data, allowing them to understand and generate human-like language. Increasingly, these LLMs are being trained specifically on code, the instructions that tell computers what to do.

This matters for software development because code LLMs can now be integrated directly into the tools and environments that programmers use every day. There, they can help with tasks like generating new code, finding and fixing bugs, and explaining complex code to teammates.

LLM-based "agents" are also beginning to show promise at handling complex coding tasks autonomously, without direct human supervision. If that promise holds, such agents could take on more of the repetitive and time-consuming parts of programming.

To realize the full potential of code LLMs, researchers are working to expand their capabilities across a wide range of coding-related skills. This includes not just writing new code, but also understanding and modifying existing code, maintaining software repositories, and a host of other critical programming tasks.

Technical Explanation

In this paper, the authors introduce the Granite series of decoder-only code models, designed for a variety of code generation and related tasks. These models were trained on code written in 116 different programming languages, giving them broad applicability.

The Granite Code models range in size from 3 billion to 34 billion parameters, so a model can be chosen to fit the use case, from complex application modernization to memory-constrained on-device applications.
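To make the size range concrete, here is a minimal back-of-the-envelope sketch of how much memory the weights alone would need at different numeric precisions. It multiplies the parameter count by the bytes per parameter and nothing more. The 3B and 34B endpoints come from the text above; the intermediate 8B and 20B sizes, the precision table, and all function names are assumptions made for illustration. Real memory use is higher, since activations, the KV cache, and runtime overhead are ignored.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# This ignores activations, KV cache, and framework overhead.

# Model sizes in billions of parameters. 3 and 34 are the endpoints stated
# in the summary; 8 and 20 are assumed intermediate sizes for illustration.
MODEL_SIZES_B = [3, 8, 20, 34]

# Bytes per parameter for common storage formats (illustrative assumption;
# quantized formats often carry extra metadata not counted here).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billion: float, dtype: str) -> float:
    """Approximate memory for the weights in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 1e9


def memory_table() -> dict:
    """Map each model size to its estimated weight memory per precision."""
    return {
        size: {dtype: weight_memory_gb(size, dtype) for dtype in BYTES_PER_PARAM}
        for size in MODEL_SIZES_B
    }


if __name__ == "__main__":
    header = "size".ljust(6) + "".join(d.rjust(10) for d in BYTES_PER_PARAM)
    print(header)
    for size, row in memory_table().items():
        cells = "".join(f"{gb:10.1f}" for gb in row.values())
        print(f"{size}B".ljust(6) + cells)
```

Under these assumptions, the smallest model at 4-bit precision needs on the order of a couple of gigabytes for its weights, while the largest at 32-bit precision needs well over a hundred. That gap is why the family spans both on-device and server-side use cases.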

The researchers evaluated the Granite models on a comprehensive set of coding tasks and report that they consistently reach state-of-the-art performance among available open-source code LLMs. The models were optimized for enterprise software development workflows and perform well across tasks such as code generation, bug fixing, and code explanation.

The authors are releasing all of the Granite Code models under an open-source Apache 2.0 license, making them available for both research and commercial use. This should help drive further innovation and adoption of these powerful AI-powered coding assistants.

Critical Analysis

The research presented in this paper highlights the impressive capabilities of modern code LLMs, and the authors' work on the Granite model family is a significant contribution to the field. However, as with any emerging technology, there are important caveats and areas for further exploration.

For example, the paper does not examine in depth the safety and ethical challenges of deploying these models in real-world software development workflows. The safety and generalization challenges of large language models remain an active research area that deserves further attention.

Additionally, while the Granite models demonstrate strong performance on a variety of coding tasks, the paper does not address the trade-offs among model size, inference efficiency, and the runtime speed of the code the models generate. As these models become more widely adopted, these engineering considerations will be crucial.

Finally, the authors highlight the Granite models' versatility across coding tasks, but the paper does not explore in depth how these models might integrate with existing software engineering workflows and tools to maximize their impact. Deeper integration with code editing environments and other software development tools could unlock even greater productivity gains.

Conclusion

The research presented in this paper demonstrates the significant progress being made in the field of code-focused large language models. The Granite Code model family represents a powerful new set of AI-powered tools that have the potential to revolutionize software development workflows, boosting programmer productivity and enabling increasingly autonomous coding capabilities.

As these models continue to evolve and become more widely adopted, it will be critical to address the safety, efficiency, and integration challenges so that they are deployed responsibly and effectively. Even so, the promise of code LLMs like Granite is clear: they are positioned to become valuable tools for software engineers and developers.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Yi Zhou, Chris Johnson, Aanchal Goyal, Hima Patel, Yousaf Shah, Petros Zerfos, Heiko Ludwig, Asim Munawar, Maxwell Crouse, Pavan Kapanipathi, Shweta Salaria, Bob Calio, Sophia Wen, Seetharami Seelam, Brian Belgodere, Carlos Fonseca, Amith Singhee, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda

Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabilities, including code generation, fixing bugs, explaining and documenting code, maintaining repositories, and more. In this work, we introduce the Granite series of decoder-only code models for code generative tasks, trained with code written in 116 programming languages. The Granite Code models family consists of models ranging in size from 3 to 34 billion parameters, suitable for applications ranging from complex application modernization tasks to on-device memory-constrained use cases. Evaluation on a comprehensive set of tasks demonstrates that Granite Code models consistently reaches state-of-the-art performance among available open-source code LLMs. The Granite Code model family was optimized for enterprise software development workflows and performs well across a range of coding tasks (e.g. code generation, fixing and explanation), making it a versatile all around code model. We release all our Granite Code models under an Apache 2.0 license for both research and commercial use.

5/8/2024

Scaling Granite Code Models to 128K Context

Matt Stallone, Vaibhav Saxena, Leonid Karlinsky, Bridget McGinn, Tim Bula, Mayank Mishra, Adriana Meza Soria, Gaoyuan Zhang, Aditya Prasad, Yikang Shen, Saptha Surendran, Shanmukha Guttula, Hima Patel, Parameswaran Selvam, Xuan-Hong Dang, Yan Koyfman, Atin Sood, Rogerio Feris, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda

This paper introduces long-context Granite code models that support effective context windows of up to 128K tokens. Our solution for scaling context length of Granite 3B/8B code models from 2K/4K to 128K consists of a light-weight continual pretraining by gradually increasing its RoPE base frequency with repository-level file packing and length-upsampled long-context data. Additionally, we also release instruction-tuned models with long-context support which are derived by further finetuning the long context base models on a mix of permissively licensed short and long-context instruction-response pairs. While comparing to the original short-context Granite code models, our long-context models achieve significant improvements on long-context tasks without any noticeable performance degradation on regular code completion benchmarks (e.g., HumanEval). We release all our long-context Granite code models under an Apache 2.0 license for both research and commercial use.

7/19/2024

Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Sadhana Kumaravel, Matthew Stallone, Rameswar Panda, Yara Rizk, GP Bhargav, Maxwell Crouse, Chulaka Gunasekara, Shajith Ikbal, Sachin Joshi, Hima Karanam, Vineet Kumar, Asim Munawar, Sumit Neelam, Dinesh Raghu, Udit Sharma, Adriana Meza Soria, Dheeraj Sreedhar, Praveen Venkateswaran, Merve Unuvar, David Cox, Salim Roukos, Luis Lastras, Pavan Kapanipathi

Large language models (LLMs) have recently shown tremendous promise in serving as the backbone to agentic systems, as demonstrated by their performance in multi-faceted, challenging benchmarks like SWE-Bench and Agent-Bench. However, to realize the true potential of LLMs as autonomous agents, they must learn to identify, call, and interact with external tools and application program interfaces (APIs) to complete complex tasks. These tasks together are termed function calling. Endowing LLMs with function calling abilities leads to a myriad of advantages, such as access to current and domain-specific information in databases and knowledge sources, and the ability to outsource tasks that can be reliably performed by tools, e.g., a Python interpreter or calculator. While there has been significant progress in function calling with LLMs, there is still a dearth of open models that perform on par with proprietary LLMs like GPT, Claude, and Gemini. Therefore, in this work, we introduce the GRANITE-20B-FUNCTIONCALLING model under an Apache 2.0 license. The model is trained using a multi-task training approach on seven fundamental tasks encompassed in function calling, those being Nested Function Calling, Function Chaining, Parallel Functions, Function Name Detection, Parameter-Value Pair Detection, Next-Best Function, and Response Generation. We present a comprehensive evaluation on multiple out-of-domain datasets comparing GRANITE-20B-FUNCTIONCALLING to more than 15 other best proprietary and open models. GRANITE-20B-FUNCTIONCALLING provides the best performance among all open models on the Berkeley Function Calling Leaderboard and fourth overall. As a result of the diverse tasks and datasets used for training our model, we show that GRANITE-20B-FUNCTIONCALLING has better generalizability on multiple tasks in seven different evaluation datasets.

7/2/2024

Meta Large Language Model Compiler: Foundation Models of Compiler Optimization

Chris Cummins, Volker Seeker, Dejan Grubisic, Baptiste Roziere, Jonas Gehring, Gabriel Synnaeve, Hugh Leather

Large Language Models (LLMs) have demonstrated remarkable capabilities across a variety of software engineering and coding tasks. However, their application in the domain of code and compiler optimization remains underexplored. Training LLMs is resource-intensive, requiring substantial GPU hours and extensive data collection, which can be prohibitive. To address this gap, we introduce Meta Large Language Model Compiler (LLM Compiler), a suite of robust, openly available, pre-trained models specifically designed for code optimization tasks. Built on the foundation of Code Llama, LLM Compiler enhances the understanding of compiler intermediate representations (IRs), assembly language, and optimization techniques. The model has been trained on a vast corpus of 546 billion tokens of LLVM-IR and assembly code and has undergone instruction fine-tuning to interpret compiler behavior. LLM Compiler is released under a bespoke commercial license to allow wide reuse and is available in two sizes: 7 billion and 13 billion parameters. We also present fine-tuned versions of the model, demonstrating its enhanced capabilities in optimizing code size and disassembling from x86_64 and ARM assembly back into LLVM-IR. These achieve 77% of the optimising potential of an autotuning search, and 45% disassembly round trip (14% exact match). This release aims to provide a scalable, cost-effective foundation for further research and development in compiler optimization by both academic researchers and industry practitioners.

7/4/2024