
Related papers: Evaluating Large Language Models on Spatial Tasks:…

200 papers

Generative AI including large language models (LLMs) has recently gained significant interest in the geo-science community through its versatile task-solving capabilities including programming, arithmetic reasoning, generation of sample…

Computers and Society · Computer Science 2024-08-14 Hartwig H. Hochmair, Levente Juhasz, Takoda Kemp

Artificial intelligence (AI) has made remarkable progress across various domains, with large language models like ChatGPT gaining substantial attention for their human-like text-generation capabilities. Despite these achievements, spatial…

Artificial Intelligence · Computer Science 2024-01-11 Fangjun Li, David C. Hogg, Anthony G. Cohn

As large language models (LLMs) continue to advance, evaluating their comprehensive capabilities becomes significant for their application in various fields. This research study comprehensively evaluates the language, vision, speech, and…

This study quantifies how prompting strategies interact with large language models (LLMs) to automate the screening stage of systematic literature reviews (SLRs). We evaluate six LLMs (GPT-4o, GPT-4o-mini, DeepSeek-Chat-V3,…

Computation and Language · Computer Science 2025-10-21 Binglan Han, Anuradha Mathrani, Teo Susnjak

The rapid advancement of large language models, such as the Generative Pre-trained Transformer (GPT) series, has had significant implications across various disciplines. In this study, we investigate the potential of the state-of-the-art…

Computation and Language · Computer Science 2023-09-06 Yunhao Yang, Anshul Tomar

Large Language Models (LLMs) have shown impressive performance on a range of educational tasks, but are still understudied for their potential to solve mathematical problems. In this study, we compare three prominent LLMs, including GPT-4o,…

Artificial Intelligence · Computer Science 2025-07-01 Ruonan Wang, Runxi Wang, Yunwen Shen, Chengfeng Wu, Qinglin Zhou, Rohitash Chandra

Large language models (LLMs) have achieved remarkable success across a wide spectrum of tasks; however, they still face limitations in scenarios that demand long-term planning and spatial reasoning. To facilitate this line of research, in…

Computation and Language · Computer Science 2025-02-25 Mohamed Aghzal, Erion Plaku, Ziyu Yao

We assessed the performance of commercial Large Language Models (LLMs) GPT-3.5-Turbo and GPT-4 on tasks from the 2023 BioASQ challenge. In Task 11b Phase B, which is focused on answer generation, both models demonstrated competitive…

Computation and Language · Computer Science 2023-07-25 Samy Ateia, Udo Kruschwitz

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires…

Large Language Models (LLMs) like ChatGPT, Copilot, Gemini, and DeepSeek are transforming software engineering by automating key tasks, including code generation, testing, and debugging. As these models become integral to development…

Software Engineering · Computer Science 2025-08-07 Everton Guimaraes, Nathalia Nascimento, Chandan Shivalingaiah, Asish Nelapati

This paper presents an in-depth analysis of the performance of seven different Large Language Models (LLMs) in solving a diverse set of advanced calculus problems. The study aims to evaluate these models' accuracy, reliability, and…

Computation and Language · Computer Science 2025-03-07 In Hak Moon

OpenAI's ChatGPT (GPT-4 and GPT-4o) and other Large Language Models (LLMs) like Microsoft's Copilot, Google's Gemini 1.5 Pro, and Anthropic's Claude 3.5 Sonnet can be effectively used in various phases of scientific research. Their…

Artificial Intelligence · Computer Science 2024-09-24 Goran Bubaš

This study aims to explore the performance improvement method of large language models based on GPT-4 under the multi-task learning framework and conducts experiments on two tasks: text classification and automatic summary generation.…

Computation and Language · Computer Science 2024-12-10 Zhen Qi, Jiajing Chen, Shuo Wang, Bingying Liu, Hongye Zheng, Chihang Wang

Large-scale generative language models such as GPT-3 are competitive few-shot learners. While these models are known to be able to jointly represent many different languages, their training data is dominated by English, potentially limiting…

With increasing scale, large language models demonstrate both quantitative improvement and new qualitative capabilities, especially as zero-shot learners, like GPT-3. However, these results rely heavily on delicate prompt design and large…

Computation and Language · Computer Science 2022-12-21 Jingjing Xu, Qingxiu Dong, Hongyi Liu, Lei Li

Recently, the visual generation ability by GPT-4o(mni) has been unlocked by OpenAI. It demonstrates a very remarkable generation capability with excellent multimodal condition understanding and varied task instructions. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2025-05-12 Pu Cao, Feng Zhou, Junyi Ji, Qingye Kong, Zhixiang Lv, Mingjian Zhang, Xuekun Zhao, Siqi Wu, Yinghui Lin, Qing Song, Lu Yang

This paper establishes a benchmark for evaluating tool-calling capabilities of large language models (LLMs) on multi-step geospatial tasks relevant to commercial GIS practitioners. We assess eight commercial LLMs (Claude Sonnet 3.5 and 4,…

Computation and Language · Computer Science 2025-10-23 Varvara Krechetova, Denis Kochedykov

ChatGPT is a large language model developed by OpenAI. Despite its impressive performance across various tasks, no prior work has investigated its capability in the biomedical domain yet. To this end, this paper aims to evaluate the…

Computation and Language · Computer Science 2023-08-25 Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang

Cybersecurity education is challenging and it is helpful for educators to understand Large Language Models' (LLMs') capabilities for supporting education. This study evaluates the effectiveness of LLMs in conducting a variety of penetration…

Cryptography and Security · Computer Science 2026-03-30 Martin Nizon-Deladoeuille, Brynjólfur Stefánsson, Helmut Neukirchen, Thomas Welsh

The use of large language models (LLMs) is expanding rapidly, and open-source versions are becoming available, offering users safer and more adaptable options. These models enable users to protect data privacy by eliminating the need to…

Machine Learning · Computer Science 2024-08-06 Hui Yin, Amir Aryani, Nakul Nambiar