Algorithmically Establishing Trust in Evaluators

Adrian de Wynter

Subjects: Data Structures and Algorithms; Artificial Intelligence; Computation and Language
Date: 2026-02-12 (v4)

Abstract

An evaluator, such as an LLM-as-a-judge, is trustworthy when there exists some agreed-upon way to measure its performance as a labeller. Traditional approaches either rely on testing the evaluator against references or assume that it somehow "knows" the correct labelling. Both approaches fail when references are unavailable: the former requires data, and the latter is an assumption, not evidence. To address this, we introduce the "No-Data Algorithm", which provably establishes trust in an evaluator without requiring any labelled data. Our algorithm works by successively posing challenges to the evaluator. We prove that after $r$ challenge rounds, it accepts an evaluator that knows the correct labels with probability $\geq 1 - (1/4)^r$, and reliably flags untrustworthy ones. We present formal proofs of correctness, empirical tests, and applications to assessing trust in LLMs-as-judges for low-resource language labelling. Our work enables scientifically-grounded evaluator trust in low-data domains, addressing a critical bottleneck for scalable, trustworthy LLM deployment.
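
To make the stated bound concrete, the following minimal sketch evaluates $1 - (1/4)^r$ for a given number of rounds and finds the smallest $r$ that reaches a target probability. It uses only the bound quoted in the abstract; the function names `acceptance_lower_bound` and `rounds_needed` are illustrative assumptions and do not come from the paper, and the sketch does not implement the No-Data Algorithm itself.

```python
def acceptance_lower_bound(r: int) -> float:
    """Lower bound 1 - (1/4)^r from the abstract, for r challenge rounds.

    Hypothetical helper: it only evaluates the formula, not the algorithm.
    """
    if r < 0:
        raise ValueError("number of rounds must be non-negative")
    return 1.0 - 0.25 ** r


def rounds_needed(target: float) -> int:
    """Smallest r such that 1 - (1/4)^r >= target (hypothetical helper)."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must lie in [0, 1)")
    r = 0
    # Increase r until the bound reaches the requested probability.
    while acceptance_lower_bound(r) < target:
        r += 1
    return r


if __name__ == "__main__":
    for r in range(1, 6):
        print(r, acceptance_lower_bound(r))
    print("rounds for 0.999:", rounds_needed(0.999))
```

Under this bound the guarantee tightens quickly: each additional round shrinks the remaining gap $(1/4)^r$ by a factor of four.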

Keywords

benchmark evaluation, theorem proving, evaluation metrics

Cite

@article{arxiv.2506.03083,
  title  = {Algorithmically Establishing Trust in Evaluators},
  author = {Adrian de Wynter},
  journal= {arXiv preprint arXiv:2506.03083},
  year   = {2026}
}
