Full Line Code Completion: Bringing AI to Desktop

Read original: arXiv:2405.08704 - Published 5/15/2024 by Anton Semenkin, Vitaliy Bibaev, Yaroslav Sokolov, Kirill Krylov, Alexey Kalina, Anna Khannanova, Danila Savenkov, Darya Rovdo, Igor Davidenko, Kirill Karnaukhov and 7 others

Overview

The paper describes an approach for building a multi-token code completion feature for JetBrains' IntelliJ Platform, called Full Line Code Completion.
The feature suggests only syntactically correct code and works fully locally, with data querying and suggestion generation happening on the end user's machine.
The authors share important time and memory-consumption restrictions, as well as design principles for a code completion engine.
The solution was initially developed with the help of researchers and was bundled into two JetBrains IDEs, PyCharm Pro and DataSpell, at the end of 2023.

Plain English Explanation

The paper discusses a new feature for code editors that can automatically complete entire lines of code, rather than just individual words or phrases. This "Full Line Code Completion" feature is designed to work entirely on the user's own computer, without needing to send data to a remote server. This is different from some other industrial solutions for code completion, which rely on cloud-based computing power.

The key goal of this feature is to provide a fast, efficient, and secure code completion experience for developers. The authors share some important constraints they had to work within, such as limits on the time and memory the feature can use. They also describe the design principles they followed to create a code completion engine that meets these requirements.

One of the main benefits of this approach is that it can suggest only code that is syntactically correct, meaning it will fit properly within the user's program. This helps to streamline the coding process and reduce errors. The authors also describe techniques they used to meet their development goals, as well as ways they evaluated the performance of the system.

Overall, this work represents an effort to bridge the gap between academic research and real-world software development, by taking a complex, research-based solution and integrating it into widely-used commercial products.

Technical Explanation

The key innovation described in the paper is the development of a "Full Line Code Completion" feature for the JetBrains IntelliJ Platform. This feature is designed to suggest complete, syntactically correct code lines to users, rather than just individual words or phrases.
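To make "syntactically correct" concrete, here is a minimal sketch of one way to filter candidate lines: keep only those that still parse when appended to the code written so far. It uses Python's standard `ast` module and is an illustration only, not the mechanism the IntelliJ Platform feature uses; the helper names `is_syntactically_valid` and `filter_candidates` are hypothetical.

```python
import ast
from typing import List


def is_syntactically_valid(context: str, candidate: str) -> bool:
    """Return True if the context followed by the candidate line parses as Python."""
    line = candidate.rstrip()
    source = context + line + "\n"
    if line.endswith(":"):
        # A block header (if/for/def ...) needs a body to parse,
        # so add a placeholder statement one indentation level deeper.
        indent = line[: len(line) - len(line.lstrip())]
        source += indent + "    pass\n"
    try:
        ast.parse(source)
    except SyntaxError:  # also covers IndentationError
        return False
    return True


def filter_candidates(context: str, candidates: List[str]) -> List[str]:
    """Keep only the candidate lines that leave the code parseable."""
    return [c for c in candidates if is_syntactically_valid(context, c)]


# Hypothetical model output for the line after a function header.
context = "def area(w, h):\n"
candidates = [
    "    return w * h",
    "    return w * (h",   # unbalanced parenthesis
    "    if w > 0:",
    "    retrun w h",      # not a valid statement
]
print(filter_candidates(context, candidates))
```

Only the first and third candidates survive the filter. A production engine would need to handle partially typed lines and languages other than Python, which this sketch ignores.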

A major focus of the work was ensuring that this code completion functionality could run entirely on the user's local machine, without requiring any data to be sent to a remote server. This presented some significant technical challenges in terms of time and memory constraints.

To address these challenges, the authors describe a number of techniques they used in the design of their code completion engine. These include principles for maintaining fast response times, minimizing memory usage, and ensuring the security and privacy of user data.
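As a rough illustration of what a response-time constraint means in practice, the sketch below collects suggestions only until a time budget runs out and then returns whatever it has. The function name `collect_within_budget`, the injectable `clock` parameter, and the budget value are assumptions made for this example; the summary does not state the actual limits or how the engine enforces them.

```python
import time
from typing import Callable, Iterable, List


def collect_within_budget(
    candidates: Iterable[str],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Gather suggestions until the time budget is spent, then stop."""
    deadline = clock() + budget_seconds
    collected: List[str] = []
    for candidate in candidates:
        # The check runs between candidates, so one slow generation
        # step can still overrun the budget slightly.
        if clock() >= deadline:
            break
        collected.append(candidate)
    return collected


# Hypothetical usage: any iterable of suggestions works, including a
# generator that produces them lazily from a model.
suggestions = collect_within_budget(iter(["a = 1", "b = 2"]), budget_seconds=0.1)
print(suggestions)
```

Passing the clock in as a parameter keeps the function easy to test with a fake time source.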

The authors also discuss the offline and online evaluation pipelines they developed to test and refine their system. The online evaluation showed that using the tool leads to 1.5 times more code in the IDE being produced by code completion, alongside qualitative feedback from users.
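The summary does not define the online metric precisely. One plausible reading is the share of code in the editor that came from accepted completions, compared between users with and without the feature. The sketch below computes that under this assumption; the function names and the character counts are hypothetical placeholders, not data from the paper.

```python
def completion_share(completed_chars: int, total_chars: int) -> float:
    """Fraction of all inserted characters that came from accepted completions."""
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return completed_chars / total_chars


def relative_change(control_share: float, treatment_share: float) -> float:
    """How many times larger the treatment share is than the control share."""
    if control_share <= 0:
        raise ValueError("control_share must be positive")
    return treatment_share / control_share


# Placeholder counts, chosen only to show the arithmetic.
control = completion_share(completed_chars=200, total_chars=1000)
treatment = completion_share(completed_chars=300, total_chars=1000)
print(f"{relative_change(control, treatment):.2f}x")
```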

Ultimately, the described solution was integrated into two JetBrains IDEs - PyCharm Pro and DataSpell - demonstrating the successful translation of academic research into a commercial product.

Critical Analysis

The paper does a good job of highlighting the key technical constraints and design principles that guided the development of the Full Line Code Completion feature. By focusing on local, on-device processing, the authors were able to create a code completion system that is fast, efficient, and secure for end users.

However, the paper does not delve deeply into potential limitations or areas for further research. For example, it would be interesting to know how the performance of this local approach compares to cloud-based solutions in terms of accuracy or suggestion quality. Additionally, the authors do not discuss how their techniques for improving performance and reducing memory usage might apply to other types of code editing tools or features.

Overall, the work represents an important step in bridging the gap between academic research and real-world software development. By tackling the practical challenges of deploying a complex, research-based solution, the authors have generated insights that could benefit both the research and industry communities.

Conclusion

This paper describes the development of a "Full Line Code Completion" feature for the JetBrains IntelliJ Platform, which provides users with suggestions for complete, syntactically correct code lines. A key focus of the work was ensuring that this functionality could run entirely on the user's local machine, without requiring any data to be sent to a remote server.

The authors share important technical constraints and design principles that guided their work, highlighting their efforts to create a fast, efficient, and secure code completion engine. By integrating this research-based solution into commercial JetBrains IDEs, the authors have demonstrated a successful approach for bridging the gap between academia and industry.

While the paper does not delve deeply into potential limitations or areas for further research, it represents an important step forward in the development of powerful, user-friendly code editing tools. The insights and techniques described in this work could have broader applicability in the field of code editing and programming assistance.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Full Line Code Completion: Bringing AI to Desktop

Anton Semenkin, Vitaliy Bibaev, Yaroslav Sokolov, Kirill Krylov, Alexey Kalina, Anna Khannanova, Danila Savenkov, Darya Rovdo, Igor Davidenko, Kirill Karnaukhov, Maxim Vakhrushev, Mikhail Kostyukov, Mikhail Podvitskii, Petr Surkov, Yaroslav Golubev, Nikita Povarov, Timofey Bryksin

In recent years, several industrial solutions for the problem of multi-token code completion have appeared, each making a great advance in the area but mostly focusing on cloud-based runtime and avoiding working on the end user's device. In this work, we describe our approach for building a multi-token code completion feature for the JetBrains' IntelliJ Platform, which we call Full Line Code Completion. The feature suggests only syntactically correct code and works fully locally, i.e., data querying and the generation of suggestions happens on the end user's machine. We share important time and memory-consumption restrictions, as well as design principles that a code completion engine should satisfy. Working entirely on the end user's device, our code completion engine enriches user experience while being not only fast and compact but also secure. We share a number of useful techniques to meet the stated development constraints and also describe offline and online evaluation pipelines that allowed us to make better decisions. Our online evaluation shows that the usage of the tool leads to 1.5 times more code in the IDE being produced by code completion. The described solution was initially started with the help of researchers and was bundled into two JetBrains' IDEs - PyCharm Pro and DataSpell - at the end of 2023, so we believe that this work is useful for bridging academia and industry, providing researchers with the knowledge of what happens when complex research-based solutions are integrated into real products.

5/15/2024

A Transformer-Based Approach for Smart Invocation of Automatic Code Completion

Aral de Moor, Arie van Deursen, Maliheh Izadi

Transformer-based language models are highly effective for code completion, with much research dedicated to enhancing the content of these completions. Despite their effectiveness, these models come with high operational costs and can be intrusive, especially when they suggest too often and interrupt developers who are concentrating on their work. Current research largely overlooks how these models interact with developers in practice and neglects to address when a developer should receive completion suggestions. To tackle this issue, we developed a machine learning model that can accurately predict when to invoke a code completion tool given the code context and available telemetry data. To do so, we collect a dataset of 200k developer interactions with our cross-IDE code completion plugin and train several invocation filtering models. Our results indicate that our small-scale transformer model significantly outperforms the baseline while maintaining low enough latency. We further explore the search space for integrating additional telemetry data into a pre-trained transformer directly and obtain promising results. To further demonstrate our approach's practical potential, we deployed the model in an online environment with 34 developers and provided real-world insights based on 74k actual invocations.

5/24/2024

Optimizing Large Language Models for OpenAPI Code Completion

Bohdan Petryshyn, Mantas Lukoševičius

Recent advancements in Large Language Models (LLMs) and their utilization in code generation tasks have significantly reshaped the field of software development. Despite the remarkable efficacy of code completion solutions in mainstream programming languages, their performance lags when applied to less ubiquitous formats such as OpenAPI definitions. This study evaluates the OpenAPI completion performance of GitHub Copilot, a prevalent commercial code completion tool, and proposes a set of task-specific optimizations leveraging Meta's open-source model Code Llama. A semantics-aware OpenAPI completion benchmark proposed in this research is used to perform a series of experiments through which the impact of various prompt-engineering and fine-tuning techniques on the Code Llama model's performance is analyzed. The fine-tuned Code Llama model reaches a peak correctness improvement of 55.2% over GitHub Copilot despite utilizing 25 times fewer parameters than the commercial solution's underlying Codex model. Additionally, this research proposes an enhancement to a widely used code infilling training technique, addressing the issue of underperformance when the model is prompted with context sizes smaller than those used during training. The dataset, the benchmark, and the model fine-tuning code are made publicly available.

6/12/2024

Don't Complete It! Preventing Unhelpful Code Completion for Productive and Sustainable Neural Code Completion Systems

Zhensu Sun, Xiaoning Du, Fu Song, Shangwen Wang, Mingze Ni, Li Li, David Lo

Currently, large pre-trained language models are widely applied in neural code completion systems. Though large code models significantly outperform their smaller counterparts, around 70% of displayed code completions from Github Copilot are not accepted by developers. Being reviewed but not accepted, their help to developer productivity is considerably limited and may conversely aggravate the workload of developers, as the code completions are automatically and actively generated in state-of-the-art code completion systems as developers type out once the service is enabled. Even worse, considering the high cost of the large code models, it is a huge waste of computing resources and energy, which severely goes against the sustainable development principle of AI technologies. However, such waste has never been realized, not to mention effectively addressed, in the research community for neural code completion. Hence, preventing such unhelpful code completions from happening in a cost-friendly way is of urgent need. To fill this significant gap, we first investigate the prompts of unhelpful code completions, called low-return prompts. We empirically identify four observable patterns in low-return prompts, each lacking necessary information, making it difficult to address through enhancements to the model's accuracy alone. This demonstrates the feasibility of identifying such low-return prompts based on the prompts themselves. Motivated by this finding, we propose an early-rejection mechanism to turn down low-return prompts by foretelling the code completion qualities. The prompts that are estimated to receive unhelpful code completions will not be sent to the model. Furthermore, we investigated five types of estimators to demonstrate the feasibility of the mechanism. The experimental results show that the estimator can reject 20% of code completion requests with a 97.4% Precision.

8/12/2024