Related papers: CLIPPER: A Graph-Theoretic Framework for Robust Da…

CLIPPER: Robust Data Association without an Initial Guess

Identifying correspondences in noisy data is a critically important step in estimation processes. When an informative initial estimation guess is available, the data association challenge is less acute; however, the existence of a…

Robotics · Computer Science 2024-02-13 Parker C. Lusk, Jonathan P. How

CLIPPER+: A Fast Maximal Clique Algorithm for Robust Global Registration

We present CLIPPER+, an algorithm for finding maximal cliques in unweighted graphs for outlier-robust global registration. The registration problem can be formulated as a graph and solved by finding its maximum clique. This formulation…

Robotics · Computer Science 2024-02-26 Kaveh Fathian, Tyler Summers

CLIPPER: Compression enables long-context synthetic data generation

LLM developers are increasingly reliant on synthetic data, but generating high-quality data for complex long-context reasoning tasks remains challenging. We introduce CLIPPER, a compression-based approach for generating synthetic data…

Computation and Language · Computer Science 2025-08-06 Chau Minh Pham, Yapei Chang, Mohit Iyyer

ROBIN: a Graph-Theoretic Approach to Reject Outliers in Robust Estimation using Invariants

Many estimation problems in robotics, computer vision, and learning require estimating unknown quantities in the face of outliers. Outliers are typically the result of incorrect data association or feature matching, and it is common to have…

Computer Vision and Pattern Recognition · Computer Science 2021-03-25 Jingnan Shi, Heng Yang, Luca Carlone

ARM-Explainer -- Explaining and improving graph neural network predictions for the maximum clique problem using node features and association rule mining

Numerous graph neural network (GNN)-based algorithms have been proposed to solve graph-based combinatorial optimization problems (COPs), but methods to explain their predictions remain largely undeveloped. We introduce ARM-Explainer, a…

Machine Learning · Computer Science 2025-12-01 Bharat Sharman, Elkafi Hassini

CAPER: Coarsen, Align, Project, Refine - A General Multilevel Framework for Network Alignment

Network alignment, or the task of finding corresponding nodes in different networks, is an important problem formulation in many application domains. We propose CAPER, a multilevel alignment framework that Coarsens the input graphs, Aligns…

Social and Information Networks · Computer Science 2022-08-24 Jing Zhu, Danai Koutra, Mark Heimann

Coherence Pursuit: Fast, Simple, and Robust Principal Component Analysis

This paper presents a remarkably simple, yet powerful, algorithm termed Coherence Pursuit (CoP) to robust Principal Component Analysis (PCA). As inliers lie in a low dimensional subspace and are mostly correlated, an inlier is likely to…

Machine Learning · Computer Science 2017-11-28 Mostafa Rahmani, George Atia

TEASER: Fast and Certifiable Point Cloud Registration

We propose the first fast and certifiable algorithm for the registration of two sets of 3D points in the presence of large amounts of outlier correspondences. We first reformulate the registration problem using a Truncated Least Squares…

Robotics · Computer Science 2020-10-20 Heng Yang, Jingnan Shi, Luca Carlone

Adaptive Graph Refinement and Label Propagation with LLMs for Cost-Effective Entity Resolution

Dirty entity resolution (ER), which identifies records referring to the same real-world entity from a single, messy dataset, is a fundamental task in data management and mining. However, the dominant blocking-matching-clustering paradigm…

Computation and Language · Computer Science 2026-05-26 Hongtao Wang, Renchi Yang, Haoran Zheng, Xiangyu Ke

GLIMPS: A Greedy Mixed Integer Approach for Super Robust Matched Subspace Detection

Due to diverse nature of data acquisition and modern applications, many contemporary problems involve high dimensional datum $x \in \mathbb{R}^d$ whose entries often lie in a union of subspaces and the goal is to find out which entries of $x$…

Machine Learning · Computer Science 2019-10-30 Md Mahfuzur Rahman, Daniel Pimentel-Alarcon

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Large scale machine learning and deep models are extremely data-hungry. Unfortunately, obtaining large amounts of labeled data is expensive, and training state-of-the-art models (with hyperparameter tuning) requires significant computing…

Machine Learning · Computer Science 2021-06-15 Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer

Outlier Elimination for Robust Ellipse and Ellipsoid Fitting

In this paper, an outlier elimination algorithm for ellipse/ellipsoid fitting is proposed. This two-stage algorithm employs a proximity-based outlier detection algorithm (using the graph Laplacian), followed by a model-based outlier…

Methodology · Statistics 2009-10-27 Jieqi Yu, Haipeng Zheng, Sanjeev R. Kulkarni, H. Vincent Poor

Cutting Through the Noise: On-the-fly Outlier Detection for Robust Training of Machine Learning Interatomic Potentials

The accuracy of machine learning interatomic potentials suffers from reference data that contains numerical noise. Often originating from unconverged or inconsistent electronic-structure calculations, this noise is challenging to identify…

Machine Learning · Statistics 2026-02-10 Terry C. W. Lam, Niamh O'Neill, Christoph Schran, Lars L. Schaaf

A New Outlier Removal Strategy Based on Reliability of Correspondence Graph for Fast Point Cloud Registration

Registration is a basic yet crucial task in point cloud processing. In correspondence-based point cloud registration, matching correspondences by point feature techniques may lead to an extremely high outlier ratio. Current methods still…

Computer Vision and Pattern Recognition · Computer Science 2024-10-28 Li Yan, Pengcheng Wei, Hong Xie, Jicheng Dai, Hao Wu, Ming Huang

CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation

Contrastive Language-Image Pre-training (CLIP) exhibits strong zero-shot classification ability on various image-level tasks, leading to the research to adapt CLIP for pixel-level open-vocabulary semantic segmentation without additional…

Computer Vision and Pattern Recognition · Computer Science 2024-11-22 Lin Sun, Jiale Cao, Jin Xie, Xiaoheng Jiang, Yanwei Pang

Robust Ellipse Fitting Based on Maximum Correntropy Criterion With Variable Center

The presence of outliers can significantly degrade the performance of ellipse fitting methods. We develop an ellipse fitting method that is robust to outliers based on the maximum correntropy criterion with variable center (MCC-VC), where a…

Computer Vision and Pattern Recognition · Computer Science 2023-05-17 Wei Wang, Gang Wang, Chenlong Hu, K. C. Ho

CLEAR: A Consistent Lifting, Embedding, and Alignment Rectification Algorithm for Multi-View Data Association

Many robotics applications require alignment and fusion of observations obtained at multiple views to form a global model of the environment. Multi-way data association methods provide a mechanism to improve alignment accuracy of pairwise…

Robotics · Computer Science 2020-03-06 Kaveh Fathian, Kasra Khosoussi, Yulun Tian, Parker Lusk, Jonathan P. How

Outlier Detection for Improved Data Quality and Diversity in Dialog Systems

In a corpus of data, outliers are either errors: mistakes in the data that are counterproductive, or are unique: informative samples that improve model robustness. Identifying outliers can lead to better datasets by (1) removing noise in…

Computation and Language · Computer Science 2019-04-08 Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K. Kummerfeld, Parker Hill, Michael A. Laurenzano, Johann Hauswald, Lingjia Tang, Jason Mars

Fast Algorithms for the Maximum Clique Problem on Massive Graphs with Applications to Overlapping Community Detection

The maximum clique problem is a well known NP-Hard problem with applications in data mining, network analysis, information retrieval and many other areas related to the World Wide Web. There exist several algorithms for the problem with…

Data Structures and Algorithms · Computer Science 2014-12-01 Bharath Pattabiraman, Md. Mostofa Ali Patwary, Assefaw H. Gebremedhin, Wei-keng Liao, Alok Choudhary

Robust Subspace Clustering via Thresholding

The problem of clustering noisy and incompletely observed high-dimensional data points into a union of low-dimensional subspaces and a set of outliers is considered. The number of subspaces, their dimensions, and their orientations are…

Machine Learning · Statistics 2015-08-24 Reinhard Heckel, Helmut Bölcskei