Related papers: Efficient Probabilistic Inference with Partial Ran…
Distributions over rankings are used to model data in a multitude of real world settings such as preference analysis and political elections. Modeling such distributions presents several computational challenges, however, due to the…
Representing distributions over permutations can be a daunting task due to the fact that the number of permutations of $n$ objects scales factorially in $n$. One recent way that has been used to reduce storage complexity has been to exploit…
In this paper we extend the principle of proportional representation to rankings. We consider the setting where alternatives need to be ranked based on approval preferences. In this setting, proportional representation requires that…
Performing effective preference-based data retrieval requires detailed and preferentially meaningful structurized information about the current user as well as the items under consideration. A common problem is that representations of items…
The dramatic growth in the number of application domains that naturally generate probabilistic, uncertain data has resulted in a need for efficiently supporting complex querying and decision-making over such data. In this paper, we present…
The analysis of decision making under uncertainty is closely related to the analysis of probabilistic inference. Indeed, much of the research into efficient methods for probabilistic inference in expert systems has been motivated by the…
Many latent (factorized) models have been proposed for recommendation tasks like collaborative filtering and for ranking tasks like document or image retrieval and annotation. Common to all those methods is that during inference the items…
The analysis of practical probabilistic models on the computer demands a convenient representation for the available knowledge and an efficient algorithm to perform inference. An appealing representation is the influence diagram, a network…
While numerous studies have been conducted in the literature exploring different types of machine learning approaches for search ranking, most of them are focused on specific pre-defined problems but only a few of them have studied the…
Ranking, and inferences based on ranking of a set of entities, are important problems in numerous contexts. This is especially true in small area statistics where there may be only a limited amount of directly observed data from each entity…
Many statistical problems in causal inference involve a probability distribution other than the one from which data are actually observed; as an additional complication, the object of interest is often a marginal quantity of this other…
Matrix factorization is a key component of collaborative filtering-based recommendation systems because it allows us to complete sparse user-by-item ratings matrices under a low-rank assumption that encodes the belief that similar users…
Rankings derived from pairwise comparisons are central to many economic and computational systems. In the context of large language models (LLMs), rankings are typically constructed from human preference data and presented as leaderboards…
Many sorts of structured data are commonly stored in a multi-relational format of interrelated tables. Under this relational model, exploratory data analysis can be done by using relational queries. As an example, in the Internet Movie…
We consider sequential or active ranking of a set of n items based on noisy pairwise comparisons. Items are ranked according to the probability that a given item beats a randomly chosen item, and ranking refers to partitioning the items…
Proportional ranking rules aggregate approval-style preferences of agents into a collective ranking such that groups of agents with similar preferences are adequately represented. Motivated by the application of live Q&A platforms, where…
Estimating the dependences between random variables, and ranking them accordingly, is a prevalent problem in machine learning. Pursuing frequentist and information-theoretic approaches, we first show that the p-value and the mutual…
Fractional scoring has been proposed to avoid inconsistencies in the attribution of publications to percentile rank classes. Uncertainties and ambiguities in the evaluation of percentile ranks can be demonstrated most easily with small…
Structured distributions, i.e. distributions over combinatorial spaces, are commonly used to learn latent probabilistic representations from observed data. However, scaling these models is bottlenecked by the high computational and memory…
Instead of testing for unanimous agreement, I propose learning how broad of a consensus favors one distribution over another (of earnings, productivity, asset returns, test scores, etc.). Specifically, given a sample from each of two…