Related papers: Evaluating GPT's Programming Capability through Co…

AI-assisted coding: Experiments with GPT-4

Artificial intelligence (AI) tools based on large language models have achieved human-level performance on some computer programming tasks. We report several experiments using GPT-4 to generate computer code. These experiments demonstrate…

Artificial Intelligence · Computer Science 2023-04-27 Russell A Poldrack, Thomas Lu, Gašper Beguš

Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors

Generative AI and large language models hold great promise in enhancing computing education by powering next-generation educational technologies for introductory programming. Recent works have studied these models for different scenarios…

Computers and Society · Computer Science 2023-08-02 Tung Phung, Victor-Alexandru Pădurean, José Cambronero, Sumit Gulwani, Tobias Kohn, Rupak Majumdar, Adish Singla, Gustavo Soares

Kattis vs. ChatGPT: Assessment and Evaluation of Programming Tasks in the Age of Artificial Intelligence

AI-powered education technologies can support students and teachers in computer science education. However, with the recent developments in generative AI, and especially the increasingly emerging popularity of ChatGPT, the effectiveness of…

Artificial Intelligence · Computer Science 2023-12-05 Nora Dunder, Saga Lundborg, Olga Viberg, Jacqueline Wong

Can Generative Pre-trained Transformers (GPT) Pass Assessments in Higher Education Programming Courses?

We evaluated the capability of generative pre-trained transformers (GPT) to pass assessments in introductory and intermediate Python programming courses at the postsecondary level. Discussions of potential uses (e.g., exercise generation,…

Artificial Intelligence · Computer Science 2023-10-11 Jaromir Savelka, Arav Agarwal, Christopher Bogart, Yifan Song, Majd Sakr

Evaluating ChatGPT and GPT-4 for Visual Programming

Generative AI and large language models have the potential to drastically improve the landscape of computing education by automatically generating personalized feedback and content. Recent works have studied the capabilities of these models…

Machine Learning · Computer Science 2023-08-08 Adish Singla

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Generative Pre-trained Transformer (GPT) models have exhibited exciting progress in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the literature on the trustworthiness of GPT models remains…

Computation and Language · Computer Science 2024-02-28 Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li

From GPT-3 to GPT-4: On the Evolving Efficacy of LLMs to Answer Multiple-choice Questions for Programming Classes in Higher Education

We explore the evolving efficacy of three generative pre-trained transformer (GPT) models in generating answers for multiple-choice questions (MCQ) from introductory and intermediate Python programming courses in higher education. We focus…

Computers and Society · Computer Science 2023-11-17 Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr

On the Reliability and Explainability of Language Models for Program Generation

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and…

Software Engineering · Computer Science 2024-01-09 Yue Liu, Chakkrit Tantithamthavorn, Yonghui Liu, Li Li

Sparks of Artificial General Intelligence: Early experiments with GPT-4

Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The…

Computation and Language · Computer Science 2023-04-17 Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

Large Language Models (GPT) Struggle to Answer Multiple-Choice Questions about Code

We analyzed the effectiveness of three generative pre-trained transformer (GPT) models in answering multiple-choice question (MCQ) assessments, often involving short snippets of code, from introductory and intermediate programming courses at…

Computation and Language · Computer Science 2023-03-15 Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr

GPT-4: A Review on Advancements and Opportunities in Natural Language Processing

Generative Pre-trained Transformer 4 (GPT-4) is the fourth-generation language model in the GPT series, developed by OpenAI, which promises significant advancements in the field of natural language processing (NLP). In this research…

Computation and Language · Computer Science 2023-05-08 Jawid Ahmad Baktash, Mursal Dawodi

Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models

Generative pre-trained transformer (GPT) models have revolutionized the field of natural language processing (NLP) with remarkable performance in various tasks and also extend their power to multimodal domains. Despite their success, large…

Computation and Language · Computer Science 2023-08-29 Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, QiZhi Pei, Jie Shao, Wei Zhang

A case study on the transformative potential of AI in software engineering on LeetCode and ChatGPT

The recent surge in the field of generative artificial intelligence (GenAI) has the potential to bring about transformative changes across a range of sectors, including software engineering and education. As GenAI tools, such as OpenAI's…

Databases · Computer Science 2025-01-08 Manuel Merkel, Jens Dörpinghaus

Beyond Generating Code: Evaluating GPT on a Data Visualization Course

This paper presents an empirical evaluation of the performance of the Generative Pre-trained Transformer (GPT) model in Harvard's CS171 data visualization course. While previous studies have focused on GPT's ability to generate code for…

Human-Computer Interaction · Computer Science 2024-05-14 Chen Zhu-Tian, Chenyang Zhang, Qianwen Wang, Jakob Troidl, Simon Warchol, Johanna Beyer, Nils Gehlenborg, Hanspeter Pfister

A comparison of Human, GPT-3.5, and GPT-4 Performance in a University-Level Coding Course

This study evaluates the performance of ChatGPT variants, GPT-3.5 and GPT-4, both with and without prompt engineering, against solely student work and a mixed category containing both student and GPT-4 contributions in university-level…

Computation and Language · Computer Science 2024-10-08 Will Yeadon, Alex Peach, Craig P. Testrow

Thrilled by Your Progress! Large Language Models (GPT-4) No Longer Struggle to Pass Assessments in Higher Education Programming Courses

This paper studies recent developments in large language models' (LLM) abilities to pass assessments in introductory and intermediate Python programming courses at the postsecondary level. The emergence of ChatGPT resulted in heated debates…

Computers and Society · Computer Science 2023-10-05 Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr

QualiGPT: GPT as an easy-to-use tool for qualitative coding

Qualitative research delves deeply into individual complex perspectives on technology and various phenomena. However, a meticulous analysis of qualitative data often requires a significant amount of time, especially during the crucial…

Human-Computer Interaction · Computer Science 2023-10-12 He Zhang, Chuhao Wu, Jingyi Xie, ChanMin Kim, John M. Carroll

A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers

This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems.…

Computation and Language · Computer Science 2024-04-09 Jordan Meadows, Marco Valentino, Damien Teney, Andre Freitas

Evaluating AI-generated code for C++, Fortran, Go, Java, Julia, Matlab, Python, R, and Rust

This study evaluates the capabilities of ChatGPT versions 3.5 and 4 in generating code across a diverse range of programming languages. Our objective is to assess the effectiveness of these AI models for generating scientific programs. To…

Software Engineering · Computer Science 2025-05-13 Patrick Diehl, Noujoud Nader, Steve Brandt, Hartmut Kaiser

Leveraging Generative AI for Extracting Process Models from Multimodal Documents

This paper presents an investigation of the capabilities of Generative Pre-trained Transformers (GPTs) to auto-generate graphical process models from multi-modal (i.e., text- and image-based) inputs. More precisely, we first introduce a…

Software Engineering · Computer Science 2024-06-10 Marvin Voelter, Raheleh Hadian, Timotheus Kampik, Marius Breitmayer, Manfred Reichert