Related papers: API Pack: A Massive Multi-Programming Language Dat…

200 papers

Recent research has demonstrated that Large Language Models (LLMs) can enhance their capabilities by utilizing external tools. However, three pivotal questions remain unanswered: (1) How effective are current LLMs in utilizing tools? (2)…

Computation and Language · Computer Science · 2023-10-26 · Minghao Li, Yingxiu Zhao, Bowen Yu, Feifan Song, Hangyu Li, Haiyang Yu, Zhoujun Li, Fei Huang, Yongbin Li

Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation, significantly enhancing productivity and accelerating software development. However, existing benchmarks primarily focus on…

Software Engineering · Computer Science · 2024-09-27 · Yixi Wu, Pengfei He, Zehao Wang, Shaowei Wang, Yuan Tian, Tse-Hsun Chen

The proliferation of Large Language Models like ChatGPT has significantly advanced language understanding and generation, impacting a broad spectrum of applications. However, these models predominantly excel in text-based tasks, overlooking…

Computation and Language · Computer Science · 2023-11-23 · Xiao Liu, Jianfeng Lin, Jiawei Zhang

There is a growing need for Large Language Models (LLMs) to effectively use tools and external Application Programming Interfaces (APIs) to plan and complete tasks. As such, there is tremendous interest in methods that can acquire…

The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable high-quality datasets for…

Despite the advancements of open-source large language models (LLMs), e.g., LLaMA, they remain significantly limited in tool-use capabilities, i.e., using external tools (APIs) to fulfill human instructions. The reason is that current…

As large language models (LLMs) like OpenAI's GPT series continue to make strides, we witness the emergence of artificial intelligence applications in an ever-expanding range of fields. In medicine, these LLMs hold considerable promise for…

As Large Language Models (LLMs) advance in natural language processing, there is growing interest in leveraging their capabilities to simplify software interactions. In this paper, we propose a novel system that integrates LLMs for both…

Computation and Language · Computer Science · 2024-09-19 · Chunliang Tao, Xiaojing Fan, Yahe Yang

Finetuning large language models (LLMs) on instructions leads to vast performance improvements on natural language tasks. We apply instruction tuning using code, leveraging the natural structure of Git commits, which pair code changes with…

Recent advancements in Large Language Models (LLMs) and their utilization in code generation tasks have significantly reshaped the field of software development. Despite the remarkable efficacy of code completion solutions in mainstream…

Software Engineering · Computer Science · 2024-06-12 · Bohdan Petryshyn, Mantas Lukoševičius

Programmers often search for usage examples for API methods. A tool that could generate realistic, idiomatic, and contextual usage examples for one or more APIs would be immensely beneficial to developers. Such a tool would relieve the need…

Software Engineering · Computer Science · 2023-12-27 · Manish Shetty, Koushik Sen, Ion Stoica

At the beginning era of large language model, it is quite critical to generate a high-quality financial dataset to fine-tune a large language model for financial related tasks. Thus, this paper presents a carefully designed data creation…

Computation and Language · Computer Science · 2023-08-04 · Ziao Wang, Jianning Wang, Junda Wu, Xiaofeng Zhang

Fine-tuning on instruction data has been widely validated as an effective practice for implementing chat language models like ChatGPT. Scaling the diversity and quality of such data, although straightforward, stands a great chance of…

Computation and Language · Computer Science · 2023-05-24 · Ning Ding, Yulin Chen, Bokai Xu, Yujia Qin, Zhi Zheng, Shengding Hu, Zhiyuan Liu, Maosong Sun, Bowen Zhou

In 2022, with the release of ChatGPT, large-scale language models gained widespread attention. ChatGPT not only surpassed previous models in terms of parameters and the scale of its pretraining corpus but also achieved revolutionary…

Artificial Intelligence · Computer Science · 2024-11-13 · Yiming Ju, Huanhuan Ma

The use of generative AI-based coding assistants like ChatGPT and GitHub Copilot is a reality in contemporary software development. Many of these tools are provided as remote APIs. Using third-party APIs raises data privacy and security…

Software Engineering · Computer Science · 2025-01-20 · Negar Alizadeh, Boris Belchev, Nishant Saurabh, Patricia Kelbert, Fernando Castor

Large pre-trained language models such as GPT-3, Codex, and Google's language model are now capable of generating code from natural language specifications of programmer intent. We view these developments with a mixture of optimism and…

Software Engineering · Computer Science · 2021-12-07 · Naman Jain, Skanda Vaidyanath, Arun Iyer, Nagarajan Natarajan, Suresh Parthasarathy, Sriram Rajamani, Rahul Sharma

APIs play a pivotal role in modern software development by enabling seamless communication and integration between various systems, applications, and services. Component-based API synthesis is a form of program synthesis that constructs an…

Software Engineering · Computer Science · 2025-02-24 · Hua Zhong, Shan Jiang, Sarfraz Khurshid

API integration is a cornerstone of our digital infrastructure, enabling software systems to connect and interact. However, as shown by many studies, writing or generating correct code to invoke APIs, particularly web APIs, is challenging.…

Software Engineering · Computer Science · 2025-12-19 · Daniel Maninger, Leon Chemnitz, Amir Molzam Sharifloo, Tushar Lamba, Jannis Brugger, Mira Mezini

We present The Vault, a dataset of high-quality code-text pairs in multiple programming languages for training large language models to understand and generate code. We present methods for thoroughly extracting samples that use both…

Computation and Language · Computer Science · 2023-10-31 · Dung Nguyen Manh, Nam Le Hai, Anh T. V. Dau, Anh Minh Nguyen, Khanh Nghiem, Jin Guo, Nghi D. Q. Bui

Automatically generating source code from natural language descriptions has been a growing field of research in recent years. However, current large-scale code generation models often encounter difficulties when selecting appropriate APIs…

Software Engineering · Computer Science · 2023-09-12 · Kechi Zhang, Huangzhao Zhang, Ge Li, Jia Li, Zhuo Li, Zhi Jin