Related papers: Benchmarking Generative Models on Computational Th…

200 papers

Generative AI and large language models have the potential to drastically improve the landscape of computing education by automatically generating personalized feedback and content. Recent works have studied the capabilities of these models…

Machine Learning · Computer Science 2023-08-08 Adish Singla

Large language and multimodal models have shown remarkable success on various benchmarks focused on specific skills such as general-purpose programming, math word problem-solving, and visual question answering. However, it is unclear how…

Artificial Intelligence · Computer Science 2025-10-07 Chao Wen, Jacqueline Staub, Adish Singla

The rise of Generative AI (GenAI) tools like ChatGPT has created new opportunities and challenges for computing education. Existing research has primarily focused on GenAI's ability to complete educational tasks and its impact on student…

Software Engineering · Computer Science 2025-11-18 Rufeng Chen, Shuaishuai Jiang, Jiyun Shen, AJung Moon, Lili Wei

Generative AI and large language models hold great promise in enhancing computing education by powering next-generation educational technologies for introductory programming. Recent works have studied these models for different scenarios…

Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider…

Computer Vision and Pattern Recognition · Computer Science 2021-06-28 Zhipeng Bao, Martial Hebert, Yu-Xiong Wang

The use of Large Language Models (LLMs) in mathematical reasoning has become a cornerstone of related research, demonstrating the intelligence of these models and enabling potential practical applications through their advanced performance,…

Computation and Language · Computer Science 2024-12-20 Kathrin Seßler, Yao Rong, Emek Gözlüklü, Enkelejda Kasneci

To advance the mathematical proficiency of large language models (LLMs), the DeepMath team has launched an open-source initiative aimed at developing an open mathematical LLM and systematically evaluating its mathematical creativity. This…

We introduce a new challenge to test the STEM skills of neural models. The problems in the real world often require solutions, combining knowledge from STEM (science, technology, engineering, and math). Unlike existing datasets, our dataset…

Computation and Language · Computer Science 2024-05-24 Jianhao Shen, Ye Yuan, Srbuhi Mirzoyan, Ming Zhang, Chenguang Wang

Generative neural models hold great promise in enhancing programming education by synthesizing new content. We seek to design neural models that can automatically generate programming tasks for a given specification in the context of visual…

Machine Learning · Computer Science 2024-01-17 Victor-Alexandru Pădurean, Georgios Tzannetos, Adish Singla

Large language models (LLMs) and prompt engineering hold significant potential for advancing computer programming education through personalized instruction. This paper explores this potential by investigating three critical research…

Artificial Intelligence · Computer Science 2024-07-09 Tianyu Wang, Nianjun Zhou, Zhixiong Chen

Generative AI and large language models hold great promise in enhancing programming education by generating individualized feedback and hints for learners. Recent works have primarily focused on improving the quality of generated feedback…

Machine Learning · Computer Science 2025-03-10 Nachiket Kotalwar, Alkis Gotovos, Adish Singla

CG (Computer Graphics) is a popular field of CS (Computer Science), but many students find this topic difficult due to it requiring a large number of skills, such as mathematics, programming, geometric reasoning, and creativity. Over the…

Artificial Intelligence · Computer Science 2024-10-23 Tony Haoran Feng, Paul Denny, Burkhard C. Wünsche, Andrew Luxton-Reilly, Jacqueline Whalley

This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems.…

Computation and Language · Computer Science 2024-04-09 Jordan Meadows, Marco Valentino, Damien Teney, Andre Freitas

Large Language Models (LLMs) have demonstrated impressive capabilities in natural language and code generation, and are increasingly deployed as automatic judges of model outputs and learning activities. Yet, their behavior on structured…

Computation and Language · Computer Science 2025-11-25 H. M. Shadman Tabib, Jaber Ahmed Deedar

Access to high-quality education at scale is limited by the difficulty of providing student feedback on open-ended assignments in structured domains like computer programming, graphics, and short response questions. This problem has proven…

Machine Learning · Computer Science 2021-03-25 Ali Malik, Mike Wu, Vrinda Vasavada, Jinpeng Song, Madison Coots, John Mitchell, Noah Goodman, Chris Piech

Generative models have received a lot of attention in many areas of academia and the industry. Their capabilities span many areas, from the invention of images given a prompt to the generation of concrete code to solve a certain programming…

Human-Computer Interaction · Computer Science 2024-03-12 Pere-Pau Vázquez

Vision-Language Models (VLMs) have demonstrated impressive world knowledge across a wide range of tasks, making them promising candidates for embodied reasoning applications. However, existing benchmarks primarily evaluate the embodied…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Haotian Xue, Yunhao Ge, Yu Zeng, Zhaoshuo Li, Ming-Yu Liu, Yongxin Chen, Jiaojiao Fan

This paper investigates visual analogical reasoning in large multimodal models (LMMs) compared to human adults and children. A "visual analogy" is an abstract rule inferred from one image and applied to another. While benchmarks exist for…

Computer Vision and Pattern Recognition · Computer Science 2025-12-05 Eunice Yiu, Maan Qraitem, Anisa Noor Majhi, Charlie Wong, Yutong Bai, Shiry Ginosar, Alison Gopnik, Kate Saenko

Developing an educational test can be expensive and time-consuming, as each item must be written by experts and then evaluated by collecting hundreds of student responses. Moreover, many tests require multiple distinct sets of questions…

Computation and Language · Computer Science 2023-10-11 Eric Zelikman, Wanjing Anya Ma, Jasmine E. Tran, Diyi Yang, Jason D. Yeatman, Nick Haber

Large Language Models, such as Generative Pre-trained Transformer 3 (a.k.a. GPT-3), have been developed to understand language through the analysis of extensive text data, allowing them to identify patterns and connections between words.…

Computation and Language · Computer Science 2023-10-03 Baphumelele Masikisiki, Vukosi Marivate, Yvette Hlope