Related papers: A Visual Query Language for Complex-Value Database…

Chain-of-Procedure: Hierarchical Visual-Language Reasoning for Procedural QA

Recent advances in vision-language models (VLMs) have achieved impressive results on standard image-text tasks, yet their potential for visual procedure question answering (VP-QA) remains largely unexplored. VP-QA presents unique challenges…

Computation and Language · Computer Science 2026-05-15 Guanhua Chen, Yutong Yao, Shenghe Sun, Ci-Jun Gao, Shudong Liu, Lidia S. Chao, Feng Wan, Derek F. Wong

Structured Query Language for Virtual Observatory

Currently two query languages are defined as standards for the Virtual Observatory (VO). Astronomical Data Query Language (ADQL) is used for catalog data query and Simple Image Access Protocol (SIAP) is for image data query. As a result,…

Astrophysics · Physics 2007-05-23 Yuji Shirasaki, Masatoshi Ohishi, Yoshihiko Mizumoto, Masahiro Tanaka, Satoshi Honda, Masafumi Oe, Naoki Yasuda, Yoshifumi Masunaga

Principles of Query Visualization

Query Visualization (QV) is the problem of transforming a given query into a graphical representation that helps humans understand its meaning. This task is notably different from designing a Visual Query Language (VQL) that helps a user…

Databases · Computer Science 2022-08-03 Wolfgang Gatterbauer, Cody Dunne, H. V. Jagadish, Mirek Riedewald

V1: A Visual Query Language for Property Graphs

The property graph is an increasingly popular data model. Pattern construction and pattern matching are important tasks when dealing with property graphs. Given a property graph schema S, a property graph G, and a query pattern P, all…

Databases · Computer Science 2021-01-19 Lior Kogan

MV-CoRe: Multimodal Visual-Conceptual Reasoning for Complex Visual Question Answering

Complex Visual Question Answering (Complex VQA) tasks, which demand sophisticated multi-modal reasoning and external knowledge integration, present significant challenges for existing large vision-language models (LVLMs) often limited by…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Jingwei Peng, Jiehao Chen, Mateo Alejandro Rojas, Meilin Zhang

A Tutorial on Visual Representations of Relational Queries

Query formulation is increasingly performed by systems that need to guess a user's intent (e.g. via spoken word interfaces). But how can a user know that the computational agent is returning answers to the "right" query? More generally,…

Databases · Computer Science 2023-08-22 Wolfgang Gatterbauer

Categorical Calculus and Algebra for Multi-Model Data

Multi-model databases are designed to store, manage, and query data in various models, such as relational, hierarchical, and graph data, simultaneously. In this paper, we provide a theoretical basis for querying categorical databases. We…

Databases · Computer Science 2026-03-12 Jiaheng Lu

DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue

Different from Visual Question Answering task that requires to answer only one question about an image, Visual Dialogue involves multiple questions which cover a broad range of visual content that could be related to any objects,…

Computer Vision and Pattern Recognition · Computer Science 2019-11-19 Xiaoze Jiang, Jing Yu, Zengchang Qin, Yingying Zhuang, Xingxing Zhang, Yue Hu, Qi Wu

VIQI: A New Approach for Visual Interpretation of Deep Web Query Interfaces

Deep Web databases contain more than 90% of pertinent information of the Web. Despite their importance, users don't profit of this treasury. Many deep web services are offering competitive services in term of prices, quality of service, and…

Information Retrieval · Computer Science 2012-05-07 Radhouane Boughamoura, Lobna Hlaoua, Mohamed Nazih Omri

An End-to-end Neural Natural Language Interface for Databases

The ability to extract insights from new data sets is critical for decision making. Visual interactive tools play an important role in data exploration since they provide non-technical users with an effective way to visually compose queries…

Databases · Computer Science 2018-04-03 Prasetya Utama, Nathaniel Weir, Fuat Basik, Carsten Binnig, Ugur Cetintemel, Benjamin Hättasch, Amir Ilkhechi, Shekar Ramaswamy, Arif Usta

A Picture May Be Worth a Hundred Words for Visual Question Answering

How far can we go with textual representations for understanding pictures? In image understanding, it is essential to use concise but detailed image representations. Deep visual features extracted by vision models, such as Faster R-CNN, are…

Computer Vision and Pattern Recognition · Computer Science 2021-06-28 Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye

Graph-Structured Representations for Visual Question Answering

This paper proposes to improve visual question answering (VQA) with structured representations of both scene contents and questions. A key challenge in VQA is to require joint reasoning over the visual and text domains. The predominant…

Computer Vision and Pattern Recognition · Computer Science 2017-03-31 Damien Teney, Lingqiao Liu, Anton van den Hengel

Visual Question Answering as Reading Comprehension

Visual question answering (VQA) demands simultaneous comprehension of both the image visual content and natural language questions. In some cases, the reasoning needs the help of common sense or general knowledge which usually appear in the…

Computer Vision and Pattern Recognition · Computer Science 2018-11-30 Hui Li, Peng Wang, Chunhua Shen, Anton van den Hengel

V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices

One of the primary challenges faced by deep learning is the degree to which current methods exploit superficial statistics and dataset bias, rather than learning to generalise over the specific representations they have experienced. This is…

Computer Vision and Pattern Recognition · Computer Science 2019-07-30 Damien Teney, Peng Wang, Jiewei Cao, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

Multi-modal Large Language Model Enhanced Pseudo 3D Perception Framework for Visual Commonsense Reasoning

The visual commonsense reasoning (VCR) task is to choose an answer and provide a justifying rationale based on the given image and textural question. Representative works first recognize objects in images and then associate them with key…

Computer Vision and Pattern Recognition · Computer Science 2023-12-27 Jian Zhu, Hanli Wang, Miaojing Shi

Beyond Embeddings: The Promise of Visual Table in Visual Reasoning

Visual representation learning has been a cornerstone in computer vision, involving typical forms such as visual embeddings, structural symbols, and text-based representations. Despite the success of CLIP-type visual embeddings, they often…

Computer Vision and Pattern Recognition · Computer Science 2024-06-18 Yiwu Zhong, Zi-Yuan Hu, Michael R. Lyu, Liwei Wang

Recursive Visual Programming

Visual Programming (VP) has emerged as a powerful framework for Visual Question Answering (VQA). By generating and executing bespoke code for each question, these methods demonstrate impressive compositional and reasoning capabilities,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-11 Jiaxin Ge, Sanjay Subramanian, Baifeng Shi, Roei Herzig, Trevor Darrell

AskYourDB: An end-to-end system for querying and visualizing relational databases using natural language

Querying databases for the right information is a time consuming and error-prone task and often requires experienced professionals for the job. Furthermore, the user needs to have some prior knowledge about the database. There have been…

Databases · Computer Science 2022-10-18 Manu Joseph, Harsh Raj, Anubhav Yadav, Aaryamann Sharma

Semantic Parsing for Complex Data Retrieval: Targeting Query Plans vs. SQL for No-Code Access to Relational Databases

Large Language Models (LLMs) have spurred progress in text-to-SQL, the task of generating SQL queries from natural language questions based on a given database schema. Despite the declarative nature of SQL, it continues to be a complex…

Computation and Language · Computer Science 2023-12-25 Ben Eyal, Amir Bachar, Ophir Haroche, Michael Elhadad

Semantic-Clipping: Efficient Vision-Language Modeling with Semantic-Guided Visual Selection

Vision-Language Models (VLMs) leverage aligned visual encoders to transform images into visual tokens, allowing them to be processed similarly to text by the backbone large language model (LLM). This unified input paradigm enables VLMs to…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Bangzheng Li, Fei Wang, Wenxuan Zhou, Nan Xu, Ben Zhou, Sheng Zhang, Hoifung Poon, Muhao Chen