Related papers: MVDLite: a Fast Validation Algorithm for Model Vie…

VLMine: Long-Tail Data Mining with Vision Language Models

Ensuring robust performance on long-tail examples is an important problem for many real-world applications of machine learning, such as autonomous driving. This work focuses on the problem of identifying rare examples within a corpus of…

Computer Vision and Pattern Recognition · Computer Science 2024-09-25 Mao Ye, Gregory P. Meyer, Zaiwei Zhang, Dennis Park, Siva Karthik Mustikovela, Yuning Chai, Eric M Wolff

VDMS: Efficient Big-Visual-Data Access for Machine Learning Workloads

We introduce the Visual Data Management System (VDMS), which enables faster access to big-visual-data and adds support to visual analytics. This is achieved by searching for relevant visual data via metadata stored as a graph, and enabling…

Databases · Computer Science 2018-12-12 Luis Remis, Vishakha Gupta-Cledat, Christina Strong, Ragaad Altarawneh

A Set of Rules for Model Validation

The validation of a data-driven model is the process of assessing the model's ability to generalize to new, unseen data in the population of interest. This paper proposes a set of general rules for model validation. These rules are designed…

Methodology · Statistics 2026-01-30 José Camacho

Visual Language Model as a Judge for Object Detection in Industrial Diagrams

Industrial diagrams such as piping and instrumentation diagrams (P&IDs) are essential for the design, operation, and maintenance of industrial plants. Converting these diagrams into digital form is an important step toward building digital…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Sanjukta Ghosh

Enhanced XML Validation using SRML

Data validation is becoming more and more important with the ever-growing amount of data being consumed and transmitted by systems over the Internet. It is important to ensure that the data being sent is valid as it may contain entry…

Databases · Computer Science 2013-11-19 Miklos Kalman, Ferenc Havasi

Visualizing Regulation in Rule-based Models

Rule-based modeling is a powerful way to model kinetic interactions in biochemical systems. Rules enable a precise encoding of biochemical interactions at the resolution of sites within molecules, but obtaining an integrated global view…

Quantitative Methods · Quantitative Biology 2015-09-04 John A. P. Sekar, Jose-Juan Tapia, James R. Faeder

RMDL: Random Multimodel Deep Learning for Classification

The continually increasing number of complex datasets each year necessitates ever improving machine learning methods for robust and accurate categorization of these data. This paper introduces Random Multimodel Deep Learning (RMDL): a new…

Machine Learning · Computer Science 2018-06-01 Kamran Kowsari, Mojtaba Heidarysafa, Donald E. Brown, Kiana Jafari Meimandi, Laura E. Barnes

Interpreting Deep Learning Model Using Rule-based Method

Deep learning models are favored in many research and industry areas and have reached the accuracy of approximating or even surpassing human level. However, they've long been considered by researchers as black-box models for their…

Machine Learning · Computer Science 2020-10-16 Xiaojian Wang, Jingyuan Wang, Ke Tang

Zelda: Video Analytics using Vision-Language Models

Advances in ML have motivated the design of video analytics systems that allow for structured queries over video datasets. However, existing systems limit query expressivity, require users to specify an ML model per predicate, rely on…

Databases · Computer Science 2023-11-09 Francisco Romero, Caleb Winston, Johann Hauswald, Matei Zaharia, Christos Kozyrakis

Trust the Model: Compact VLMs as In-Context Judges for Image-Text Data Quality

Vision-language models (VLMs) extend the conventional large language models by integrating visual data, enabling richer multimodal reasoning and significantly broadening the practical applications of AI. However, including visual inputs also…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Daulet Toibazar, Kesen Wang, Sherif Mohamed, Abdulaziz Al-Badawi, Abdulrahman Alfulayt, Pedro J. Moreno

Multidimensional Scaling on Multiple Input Distance Matrices

Multidimensional Scaling (MDS) is a classic technique that seeks vectorial representations for data points, given the pairwise distances between them. However, in recent years, data are usually collected from diverse sources or have…

Computer Vision and Pattern Recognition · Computer Science 2017-08-29 Song Bai, Xiang Bai, Longin Jan Latecki, Qi Tian

Probabilistic Databases with MarkoViews

Most of the work on query evaluation in probabilistic databases has focused on the simple tuple-independent data model, where tuples are independent random events. Several efficient query evaluation techniques exist in this setting, such…

Databases · Computer Science 2012-08-02 Abhay Jha, Dan Suciu

Pareto Data Framework: Steps Towards Resource-Efficient Decision Making Using Minimum Viable Data (MVD)

This paper introduces the Pareto Data Framework, an approach for identifying and selecting the Minimum Viable Data (MVD) required for enabling machine learning applications on constrained platforms such as embedded systems, mobile devices,…

Machine Learning · Computer Science 2024-09-19 Tashfain Ahmed, Josh Siegel

STRIVE: Structured Representation Integrating VLM Reasoning for Efficient Object Navigation

Vision-Language Models (VLMs) have been increasingly integrated into object navigation tasks for their rich prior knowledge and strong reasoning abilities. However, applying VLMs to navigation poses two key challenges: effectively…

Robotics · Computer Science 2025-09-17 Haokun Zhu, Zongtai Li, Zhixuan Liu, Wenshan Wang, Ji Zhang, Jonathan Francis, Jean Oh

Model Input Verification of Large Scale Simulations

Reliable simulations are critical for analyzing and understanding complex systems, but their accuracy depends on correct input data. Incorrect inputs such as invalid or out-of-range values, missing data, and format inconsistencies can cause…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-10 Rumyana Neykova, Derek Groen

Frequent Query Matching in Dynamic Data Warehousing

With the need for flexible and on-demand decision support, Dynamic Data Warehouses (DDW) provide benefits over traditional data warehouses due to their dynamic characteristics in structuring and access mechanism. A DDW is a data framework…

Databases · Computer Science 2017-03-07 Charles H. Goonetilleke, J. Wenny Rahayu, Md. Saiful Islam

Boundary Blending: Reconsidering the Design of Multi-View Visualizations

Multiple-view visualizations (MVs) have been widely used for visual analysis. Each view shows some part of the data in a usable way, and together multiple views enable a holistic understanding of the data under investigation. For example,…

Human-Computer Interaction · Computer Science 2023-06-19 Maoyuan Sun, Abdul Rahman Shaikh, Yue Ma, David Koop, Hamed Alhoori

DepthLM: Metric Depth From Vision Language Models

Vision language models (VLMs) can flexibly address various vision tasks through text interactions. Although successful in semantic understanding, state-of-the-art VLMs including GPT-5 still struggle in understanding 3D from 2D inputs. On…

Computer Vision and Pattern Recognition · Computer Science 2025-10-02 Zhipeng Cai, Ching-Feng Yeh, Hu Xu, Zhuang Liu, Gregory Meyer, Xinjie Lei, Changsheng Zhao, Shang-Wen Li, Vikas Chandra, Yangyang Shi

MV-MLM: Bridging Multi-View Mammography and Language for Breast Cancer Diagnosis and Risk Prediction

Large annotated datasets are essential for training robust Computer-Aided Diagnosis (CAD) models for breast cancer detection or risk prediction. However, acquiring such datasets with fine-detailed annotation is both costly and…

Computer Vision and Pattern Recognition · Computer Science 2025-10-31 Shunjie-Fabian Zheng, Hyeonjun Lee, Thijs Kooi, Ali Diba

Visual Model Validation via Inline Replication

Data visualizations typically show retrospective views of an existing dataset with little or no focus on repeatability. However, consumers of these tools often use insights gleaned from retrospective visualizations as the basis for…

Human-Computer Interaction · Computer Science 2019-11-13 David Gotz, Brandon A. Price, Annie T. Chen