Related papers: From Code to Play: Benchmarking Program Search for…

200 papers

In recent years, large language models (LLMs) have emerged as powerful tools with potential applications in various fields, including software engineering. Within the scope of this research, we evaluate five different state-of-the-art LLMs…

Computation and Language · Computer Science · 2024-09-09 · Luis Mayer, Christian Heumann, Matthias Aßenmacher

Large language models (LLMs) have taken the scientific world by storm, changing the landscape of natural language processing and human-computer interaction. These powerful tools can answer complex questions and, surprisingly, perform…

Artificial Intelligence · Computer Science · 2023-11-14 · Pier Luca Lanzi, Daniele Loiacono

In this paper, we propose the use of the popular word-based board game Codenames as a suitable benchmark for evaluating the reasoning capabilities of Large Language Models (LLMs). Codenames presents a highly interesting challenge for…

Artificial Intelligence · Computer Science · 2025-04-23 · Matthew Stephenson, Matthew Sidji, Benoît Ronval

The reasoning abilities of Large Language Models (LLMs) are increasingly being applied to classical board and card games, but the dominant approach, which involves prompting for direct move generation, has significant drawbacks. It relies on the…

Large Language Models (LLMs) can generate code, but can they generate fast code for complex, real-world software systems? In this study, we investigate this question using a dataset of 65 tasks mined from performance-critical open-source…

Software Engineering · Computer Science · 2026-04-10 · Lirong Yi, Gregory Gay, Philipp Leitner

We introduce a novel and extensible benchmark for large language models (LLMs) through grid-based games such as Tic-Tac-Toe, Connect Four, and Gomoku. The open-source game simulation code, available on GitHub, allows LLMs to compete and…

Artificial Intelligence · Computer Science · 2024-07-12 · Oguzhan Topsakal, Colby Jacob Edell, Jackson Bailey Harper

Large Language Models (LLMs) are advanced Artificial Intelligence (AI) systems that have undergone extensive training using large datasets in order to understand and produce language that closely resembles that of humans. These models have…

Software Engineering · Computer Science · 2023-08-10 · Alessio Buscemi

Large Language Models (LLMs) are powerful tools, capable of leveraging their training on natural language to write stories, generate code, and answer questions. But can they generate functional video game levels? Game levels, with their…

Artificial Intelligence · Computer Science · 2023-06-02 · Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius

Large Language Models (LLMs) have shown great ability in generating executable code from natural language, opening the possibility of automatically constructing environments for AI agents. Recent work on Code World Models (CWMs)…

Artificial Intelligence · Computer Science · 2026-05-26 · Tyrone Serapio, Arjun Prakash, Haoyang Xu, Kevin Wang, Amy Greenwald

Large Language Models' (LLMs) programming capabilities enable their participation in open-source games: a game-theoretic setting in which players submit computer programs in lieu of actions. These programs offer numerous advantages,…

Computer Science and Game Theory · Computer Science · 2025-12-02 · Swadesh Sistla, Max Kleiman-Weiner

Large Language Models (LLMs) have demonstrated great promise in generating code, especially when used inside an evolutionary computation framework to iteratively optimize the generated algorithms. However, in some cases they fail to…

Neural and Evolutionary Computing · Computer Science · 2025-03-24 · Niki van Stein, Anna V. Kononova, Lars Kotthoff, Thomas Bäck

Recent Language Models (LMs) achieve breakthrough performance in code generation when trained on human-authored problems, even solving some competitive-programming problems. Self-play has proven useful in games such as Go, and thus it is…

Machine Learning · Computer Science · 2023-04-13 · Patrick Haluptzok, Matthew Bowers, Adam Tauman Kalai

Large language models (LLMs) are used in software development to assist in various tasks, e.g., code generation and code completion, but empirical evaluations of the quality of the results produced by these models focus on correctness and…

Software Engineering · Computer Science · 2025-02-05 · Lola Solovyeva, Sophie Weidmann, Fernando Castor

Large Language Models (LLMs) have demonstrated their remarkable capabilities in numerous fields. This survey focuses on how LLMs empower users, regardless of their technical background, to use human languages to automatically generate…

Software Engineering · Computer Science · 2025-04-03 · Nam Huynh, Beiyu Lin

This paper examines the reasoning capabilities of Large Language Models (LLMs) from a novel perspective, focusing on their ability to operate within formally specified, rule-governed environments. We evaluate four LLMs (Gemini 2.5 Pro and…

Artificial Intelligence · Computer Science · 2026-02-24 · Maciej Świechowski, Adam Żychowski, Jacek Mańdziuk

Implementing board games in code can be a time-consuming task. However, Large Language Models (LLMs) have been proven effective at generating code for domain-specific tasks with simple contextual information. We aim to investigate whether…

While artificial intelligence (AI) technology is becoming increasingly popular, its underlying mechanisms tend to remain opaque to most people. To address this gap, the field of AI literacy aims to develop various resources to teach people…

Computers and Society · Computer Science · 2026-03-31 · Allison Chen, Isabella Pu

Creating programs to represent board games can be a time-consuming task. Large Language Models (LLMs) arise as appealing tools to expedite this process, given their capacity to efficiently generate code from simple contextual information.…

Machine Learning · Computer Science · 2025-11-10 · Álvaro Guglielmin Becker, Lana Bertoldo Rossato, Anderson Rocha Tavares

Recently, the emergence of large language models (LLMs) has unlocked new opportunities for procedural content generation. However, recent attempts mainly focus on level generation for specific games with defined game rules such as Super…

Artificial Intelligence · Computer Science · 2024-05-31 · Chengpeng Hu, Yunlong Zhao, Jialin Liu

Large language models (LLMs), such as ChatGPT and Copilot, are transforming software development by automating code generation and, arguably, enable rapid prototyping, support education, and boost productivity. Therefore, correctness and…

Software Engineering · Computer Science · 2024-08-30 · Robin Beer, Alexander Feix, Tim Guttzeit, Tamara Muras, Vincent Müller, Maurice Rauscher, Florian Schäffler, Welf Löwe