Related papers: GA-VisAgent: A Multi-Agent application for code ge…

VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM

Generating accurate and consistent visual aids is a critical challenge in mathematics education, where visual representations like geometric shapes and functions play a pivotal role in enhancing student comprehension. This paper introduces…

Computation and Language · Computer Science 2024-11-11 Jeongwoo Lee, Kwangsuk Park, Jihyeon Park

An LLM Agent for Automatic Geospatial Data Analysis

Large language models (LLMs) are being used in data science code generation tasks, but they often struggle with complex sequential tasks, leading to logical errors. Their application to geospatial data processing is particularly challenging…

Computers and Society · Computer Science 2024-10-28 Yuxing Chen, Weijie Wang, Sylvain Lobry, Camille Kurtz

Introducing Geometric Algebra to Geometric Computing Software Developers: A Computational Thinking Approach

Designing software systems for Geometric Computing applications can be a challenging task. Software engineers typically use software abstractions to hide and manage the high complexity of such systems. Without the presence of a unifying…

Mathematical Software · Computer Science 2017-05-19 Ahmad Hosny Eid

Optimized Automatic Code Generation for Geometric Algebra Based Algorithms with Ray Tracing Application

Automatic code generation for low-dimensional geometric algorithms is capable of producing efficient low-level software code through a high-level geometric domain specific language. Geometric Algebra (GA) is one of the most suitable…

Mathematical Software · Computer Science 2016-07-19 Ahmad Hosney Awad Eid

GUICourse: From General Vision Language Models to Versatile GUI Agents

Utilizing Graphic User Interface (GUI) for human-computer interaction is essential for accessing a wide range of digital tools. Recent advancements in Vision Language Models (VLMs) highlight the compelling potential to develop versatile…

Artificial Intelligence · Computer Science 2025-06-02 Wentong Chen, Junbo Cui, Jinyi Hu, Yujia Qin, Junjie Fang, Yue Zhao, Chongyi Wang, Jun Liu, Guirong Chen, Yupeng Huo, Yuan Yao, Yankai Lin, Zhiyuan Liu, Maosong Sun

CogAgent: A Visual Language Model for GUI Agents

People are spending an enormous amount of time on digital devices through graphical user interfaces (GUIs), e.g., computer or smartphone screens. Large language models (LLMs) such as ChatGPT can assist people in tasks like writing emails,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Wenyi Hong, Weihan Wang, Qingsong Lv, Jiazheng Xu, Wenmeng Yu, Junhui Ji, Yan Wang, Zihan Wang, Yuxuan Zhang, Juanzi Li, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang

Symbolic and User-friendly Geometric Algebra Routines (SUGAR) for Computations in Matlab

Geometric algebra (GA) is a mathematical tool for geometric computing, providing a framework that allows a unified and compact approach to geometric relations which in other mathematical systems are typically described using different more…

Mathematical Software · Computer Science 2025-05-09 Manel Velasco, Isiah Zaplana, Arnau Dória-Cerezo, Pau Martí

GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models

Geometry problem-solving demands advanced reasoning abilities to process multimodal inputs and employ mathematical knowledge effectively. Vision-language models (VLMs) have made significant progress in various multimodal tasks. Yet, they…

Computation and Language · Computer Science 2024-10-18 Aditya Sharma, Aman Dalmia, Mehran Kazemi, Amal Zouaq, Christopher J. Pal

CoAct-1: Computer-using Multi-Agent System with Coding Actions

Autonomous agents that operate computers via Graphical User Interfaces (GUIs) often struggle with efficiency and reliability on complex, long-horizon tasks. While augmenting these agents with planners can improve task decomposition, they…

Computation and Language · Computer Science 2026-02-23 Linxin Song, Yutong Dai, Viraj Prabhu, Jieyu Zhang, Taiwei Shi, Li Li, Junnan Li, Silvio Savarese, Zeyuan Chen, Jieyu Zhao, Ran Xu, Caiming Xiong

One algebra for all: Geometric Algebra methods for neurosymbolic XR scene authoring, animation and neural rendering

This position paper delves into the transformative role of Geometric Algebra (GA) in advancing specific areas of Computer Graphics (CG) and Extended Reality (XR), particularly in character animation, rendering, rigging, neural rendering,…

Graphics · Computer Science 2025-11-20 Manos Kamarianakis, Antonis Protopsaltis, George Papagiannakis

Knowledge-Guided Multi-Agent Framework for Application-Level Software Code Generation

Automated code generation driven by Large Language Models (LLMs) has enhanced development efficiency, yet generating complex application-level software code remains challenging. Multi-agent frameworks show potential, but existing methods…

Software Engineering · Computer Science 2025-10-24 Qian Xiong, Bo Yang, Weisong Sun, Yiran Zhang, Tianlin Li, Yang Liu, Zhi Jin

SAG-Agent: Enabling Long-Horizon Reasoning in Strategy Games via Dynamic Knowledge Graphs

Most commodity software lacks accessible Application Programming Interfaces (APIs), requiring autonomous agents to interact solely through pixel-based Graphical User Interfaces (GUIs). In this API-free setting, large language model…

Artificial Intelligence · Computer Science 2026-03-26 Chenwei Tang, Lin Long, Xinyu Liu, Jingyu Xing, Zizhou Wang, Joey Tianyi Zhou, Jiawei Du, Liangli Zhen, Jiancheng Lv

Geo-Code: A Code Framework for Reverse Code Generation from Geometric Images Based on Two-Stage Multi-Agent Evolution

Program code serves as a bridge linking vision and logic, providing a feasible supervisory approach for enhancing the multimodal reasoning capability of large models through geometric operations such as auxiliary line construction and…

Artificial Intelligence · Computer Science 2026-02-10 Zhenyu Wu, Yanxi Long, Jian Li, Hua Huang

PG-Agent: An Agent Powered by Page Graph

Graphical User Interface (GUI) agents possess significant commercial and social value, and GUI agents powered by advanced multimodal large language models (MLLMs) have demonstrated remarkable potential. Currently, existing GUI agents…

Artificial Intelligence · Computer Science 2025-09-05 Weizhi Chen, Ziwei Wang, Leyang Yang, Sheng Zhou, Xiaoxuan Tang, Jiajun Bu, Yong Li, Wei Jiang

Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models

Recent advancements in multimodal large language models have driven breakthroughs in visual question answering. Yet a critical gap persists: 'conceptualization', the ability to recognize and reason about the same concept despite variations…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Zahra Babaiee, Peyman M. Kiasari, Daniela Rus, Radu Grosu

Geometrically-Constrained Agent for Spatial Reasoning

Vision Language Models (VLMs) exhibit a fundamental semantic-to-geometric gap in spatial reasoning: they excel at qualitative semantic inference but their reasoning operates within a lossy semantic space, misaligned with high-fidelity…

Artificial Intelligence · Computer Science 2025-12-01 Zeren Chen, Xiaoya Lu, Zhijie Zheng, Pengrui Li, Lehan He, Yijin Zhou, Jing Shao, Bohan Zhuang, Lu Sheng

AGACCI: Affiliated Grading Agents for Criteria-Centric Interface in Educational Coding Contexts

Recent advances in AI-assisted education have encouraged the integration of vision-language models (VLMs) into academic assessment, particularly for tasks that require both quantitative and qualitative evaluation. However, existing VLM…

Computers and Society · Computer Science 2025-07-09 Kwangsuk Park, Jiwoong Yang

GaGA: Towards Interactive Global Geolocation Assistant

Global geolocation, which seeks to predict the geographical location of images captured anywhere in the world, is one of the most challenging tasks in the field of computer vision. In this paper, we introduce an innovative interactive…

Computer Vision and Pattern Recognition · Computer Science 2025-04-21 Zhiyang Dou, Zipeng Wang, Xumeng Han, Guorong Li, Zhipei Huang, Zhenjun Han

GA-Unity: A Production-Ready Unity Package for Seamless Integration of Geometric Algebra in Networked Collaborative Applications

This paper introduces GA-Unity, the first Unity package specifically designed for seamless integration of Geometric Algebra (GA) into collaborative networked applications. Indeed, in such contexts, it has been demonstrated that using…

Graphics · Computer Science 2024-06-25 Manos Kamarianakis, Nick Lydatakis, George Papagiannakis

SymAgent: A Neural-Symbolic Self-Learning Agent Framework for Complex Reasoning over Knowledge Graphs

Recent advancements have highlighted that Large Language Models (LLMs) are prone to hallucinations when solving complex reasoning problems, leading to erroneous results. To tackle this issue, researchers incorporate Knowledge Graphs (KGs)…

Artificial Intelligence · Computer Science 2025-02-19 Ben Liu, Jihai Zhang, Fangquan Lin, Cheng Yang, Min Peng, Wotao Yin