Related papers: Benchmarking Language Model Creativity: A Case Stu…

CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation

Large Language Models (LLMs) have proved effective and efficient in generating code, leading to their utilization within the hardware design process. Prior works evaluating LLMs' abilities for register transfer level code generation solely…

Computation and Language · Computer Science 2024-04-16 Matthew DeLorenzo, Vasudev Gohil, Jeyavijayan Rajendran

CreativityPrism: A Holistic Evaluation Framework for Large Language Model Creativity

Creativity is often seen as a hallmark of human intelligence. While large language models (LLMs) are increasingly perceived as generating creative text, there is still no holistic and scalable framework to evaluate their creativity across…

Computation and Language · Computer Science 2026-02-19 Zhaoyi Joey Hou, Bowei Alvin Zhang, Yining Lu, Bhiman Kumar Baghel, Anneliese Brei, Ximing Lu, Meng Jiang, Faeze Brahman, Snigdha Chaturvedi, Haw-Shiuan Chang, Daniel Khashabi, Xiang Lorraine Li

Assessing and Understanding Creativity in Large Language Models

In the field of natural language processing, the rapid development of large language models (LLMs) has attracted more and more attention. LLMs have shown a high level of creativity in various tasks, but the methods for assessing such…

Computation and Language · Computer Science 2025-02-21 Yunpu Zhao, Rui Zhang, Wenyi Li, Di Huang, Jiaming Guo, Shaohui Peng, Yifan Hao, Yuanbo Wen, Xing Hu, Zidong Du, Qi Guo, Ling Li, Yunji Chen

Exploring Creativity in Human-Human-LLM Collaborative Software Design

While the use of Large Language Models (LLMs) in programming has been extensively studied, there is limited understanding of how LLMs support collaborative work where creativity plays a central role. Software design, as a collaborative and…

Software Engineering · Computer Science 2026-04-28 Victoria Jackson, Grischa Liebel, Rafael Prikladnicki, Andre van der Hoek

Rethinking Creativity Evaluation: A Critical Analysis of Existing Creativity Evaluations

We examine, analyze, and compare four representative creativity measures--perplexity, LLM-as-a-Judge, the Creativity Index (CI; measuring n-gram overlap with web corpora), and syntactic templates (detecting repetition of common…

Computation and Language · Computer Science 2026-01-29 Li-Chun Lu, Miri Liu, Pin-Chun Lu, Yufei Tian, Shao-Hua Sun, Nanyun Peng

CREATE: Testing LLMs for Associative Creativity

A key component of creativity is associative reasoning: the ability to draw novel yet meaningful connections between concepts. We introduce CREATE, a benchmark designed to evaluate models' capacity for creative associative reasoning. CREATE…

Computation and Language · Computer Science 2026-05-12 Manya Wadhwa, Tiasa Singha Roy, Harvey Lederman, Junyi Jessy Li, Greg Durrett

Divergent Creativity in Humans and Large Language Models

The recent surge of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has…

Computation and Language · Computer Science 2025-07-03 Antoine Bellemare-Pepin, François Lespinasse, Philipp Thölke, Yann Harel, Kory Mathewson, Jay A. Olson, Yoshua Bengio, Karim Jerbi

How Creative Are Large Language Models in Generating Molecules?

Molecule generation requires satisfying multiple chemical and biological constraints while searching a large and structured chemical space. This makes it a non-binary problem, where effective models must identify non-obvious solutions under…

Computation and Language · Computer Science 2026-04-21 Wen Tao, Yiwei Wang, Peng Zhou, Bryan Hooi, Wanlong Fang, Tianle Zhang, Xiao Luo, Yuansheng Liu, Alvin Chan

Creativity Benchmark: A benchmark for marketing creativity for large language models

We introduce Creativity Benchmark, an evaluation framework for large language models (LLMs) in marketing creativity. The benchmark covers 100 brands (12 categories) and three prompt types (Insights, Ideas, Wild Ideas). Human pairwise…

Computation and Language · Computer Science 2025-10-21 Ninad Bhat, Kieran Browne, Pip Bingemann

Evaluating Creative Short Story Generation in Humans and Large Language Models

Story-writing is a fundamental aspect of human imagination, relying heavily on creativity to produce narratives that are novel, effective, and surprising. While large language models (LLMs) have demonstrated the ability to generate…

Computation and Language · Computer Science 2025-05-13 Mete Ismayilzada, Claire Stevenson, Lonneke van der Plas

A Framework for Collaborating a Large Language Model Tool in Brainstorming for Triggering Creative Thoughts

Creativity involves not only generating new ideas from scratch but also redefining existing concepts and synthesizing previous insights. Among various techniques developed to foster creative thinking, brainstorming is widely used. With…

Human-Computer Interaction · Computer Science 2024-10-17 Hung-Fu Chang, Tong Li

Steering Large Language Models to Evaluate and Amplify Creativity

Although capable of generating creative text, Large Language Models (LLMs) are poor judges of what constitutes "creativity". In this work, we show that we can leverage this knowledge of how to write creatively in order to better judge what…

Computation and Language · Computer Science 2024-12-10 Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Shao-yen Tseng, Vasudev Lal

We're Different, We're the Same: Creative Homogeneity Across LLMs

Numerous powerful large language models (LLMs) are now available for use as writing support tools, idea generators, and beyond. Although these LLMs are marketed as helpful creative assistants, several works have shown that using an LLM as a…

Computers and Society · Computer Science 2025-02-03 Emily Wenger, Yoed Kenett

DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models

To advance the mathematical proficiency of large language models (LLMs), the DeepMath team has launched an open-source initiative aimed at developing an open mathematical LLM and systematically evaluating its mathematical creativity. This…

Artificial Intelligence · Computer Science 2025-05-14 Xiaoyang Chen, Xinan Dai, Yu Du, Qian Feng, Naixu Guo, Tingshuo Gu, Yuting Gao, Yingyi Gao, Xudong Han, Xiang Jiang, Yilin Jin, Hongyi Lin, Shisheng Lin, Xiangnan Li, Yuante Li, Yixing Li, Zhentao Lai, Zilu Ma, Yingrong Peng, Jiacheng Qian, Hao-Yu Sun, Jianbo Sun, Zirui Wang, Siwei Wu, Zian Wang, Bin Xu, Jianghao Xu, Yiyang Yu, Zichuan Yang, Hongji Zha, Ruichong Zhang

Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation

Code generation aims to automatically generate source code from high-level task specifications, which can significantly increase productivity of software engineering. Recently, approaches based on large language models (LLMs) have shown…

Artificial Intelligence · Computer Science 2023-05-19 Xin-Ye Li, Jiang-Tian Xue, Zheng Xie, Ming Li

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Creativity is a fundamental aspect of intelligence, involving the ability to generate novel and appropriate solutions across diverse contexts. While Large Language Models (LLMs) have been extensively evaluated for their creative…

Computer Vision and Pattern Recognition · Computer Science 2025-03-20 Xinyu Fang, Zhijian Chen, Kai Lan, Lixin Ma, Shengyuan Ding, Yingji Liang, Xiangyu Zhao, Farong Wen, Zicheng Zhang, Guofeng Zhang, Haodong Duan, Kai Chen, Dahua Lin

Evaluating LLMs' Divergent Thinking Capabilities for Scientific Idea Generation with Minimal Context

While Large Language Models (LLMs) demonstrate remarkable capabilities in scientific tasks such as literature analysis and experimental design (e.g., accurately extracting key findings from papers or generating coherent experimental…

Computation and Language · Computer Science 2026-02-24 Kai Ruan, Xuan Wang, Jixiang Hong, Peng Wang, Yang Liu, Hao Sun

Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models

Accuracy remains a standard metric for evaluating AI systems, but it offers limited insight into how models arrive at their solutions. In this work, we introduce a benchmark based on brainteasers written in long narrative form to probe more…

Artificial Intelligence · Computer Science 2025-10-30 Simeng Han, Howard Dai, Stephen Xia, Grant Zhang, Chen Liu, Lichang Chen, Hoang Huy Nguyen, Hongyuan Mei, Jiayuan Mao, R. Thomas McCoy

Characterising the Creative Process in Humans and Large Language Models

Large language models appear quite creative, often performing on par with the average human on creative tasks. However, research on LLM creativity has focused solely on products, with little attention on the creative…

Human-Computer Interaction · Computer Science 2024-06-07 Surabhi S. Nath, Peter Dayan, Claire Stevenson

CresOWLve: Benchmarking Creative Problem-Solving Over Real-World Knowledge

Creative problem-solving requires combining multiple cognitive abilities, including logical reasoning, lateral thinking, analogy-making, and commonsense knowledge, to discover insights that connect seemingly unrelated pieces of information.…

Computation and Language · Computer Science 2026-04-07 Mete Ismayilzada, Renqing Cuomao, Daniil Yurshevich, Anna Sotnikova, Lonneke van der Plas, Antoine Bosselut