Classifying with Uncertain Data Envelopment Analysis

Optimization and Control; Computational Engineering, Finance, and Science; Machine Learning. 2022-09-05, v1

Abstract

Classifications organize entities into categories that identify similarities within a category and discern dissimilarities among categories, and they powerfully organize information in support of analysis. We propose a new classification scheme premised on the reality of imperfect data. Our computational model uses uncertain data envelopment analysis to define a classification's proximity to equitable efficiency, which is an aggregate measure of intra-similarity within a classification's categories. Our classification process has two overriding computational challenges: a loss of convexity and a combinatorially explosive search space. We overcome the first by establishing lower and upper bounds on the proximity value and then searching this range with a first-order algorithm. We overcome the second by adapting the p-median problem to initiate our exploration and then employing an iterative neighborhood search to finalize a classification. We conclude by classifying the thirty stocks in the Dow Jones Industrial Average into performant tiers and by classifying prostate treatments into clinically effectual categories.
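The two-stage search described in the abstract (a p-median start followed by an iterative neighborhood search) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain pairwise dissimilarity matrix as a stand-in for the paper's proximity to equitable efficiency, it uses a greedy heuristic in place of an exact p-median solve, and the function names `assignment_cost`, `greedy_p_median`, `swap_search`, and `categories` are hypothetical.

```python
def assignment_cost(dist, medians):
    """Total dissimilarity from each item to its nearest median (stand-in objective)."""
    return sum(min(dist[i][m] for m in medians) for i in range(len(dist)))


def greedy_p_median(dist, p):
    """Greedy start: repeatedly add the item that lowers the total cost the most."""
    medians = []
    n = len(dist)
    while len(medians) < p:
        best = min(
            (j for j in range(n) if j not in medians),
            key=lambda j: assignment_cost(dist, medians + [j]),
        )
        medians.append(best)
    return medians


def swap_search(dist, medians):
    """Neighborhood search: swap one median for one non-median while the cost improves."""
    medians = list(medians)
    best_cost = assignment_cost(dist, medians)
    improved = True
    while improved:
        improved = False
        for pos in range(len(medians)):
            for j in range(len(dist)):
                if j in medians:
                    continue
                trial = medians[:pos] + [j] + medians[pos + 1:]
                cost = assignment_cost(dist, trial)
                if cost < best_cost:  # accept only strict improvements, so the loop terminates
                    medians, best_cost, improved = trial, cost, True
    return medians, best_cost


def categories(dist, medians):
    """Assign every item to the category of its nearest median."""
    groups = {m: [] for m in medians}
    for i in range(len(dist)):
        groups[min(medians, key=lambda m: dist[i][m])].append(i)
    return groups


# Toy data (an assumption for illustration): six items on a line, two natural groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]

start = greedy_p_median(dist, p=2)
final, cost = swap_search(dist, start)
print(start, final, cost, categories(dist, final))
```

On this toy data the greedy start is not optimal and the swap search improves it, which mirrors the division of labor in the abstract: a p-median style start narrows the combinatorial search, and the neighborhood search finalizes the classification.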

Cite

@article{arxiv.2209.01052,
  title  = {Classifying with Uncertain Data Envelopment Analysis},
  author = {Casey Garner and Allen Holder},
  journal= {arXiv preprint arXiv:2209.01052},
  year   = {2022}
}

Comments

21 pages, 6 figures
