Related papers: Estimating Consensus Ideal Points Using Multi-Sour…
This paper introduces "Semantic Scaling," a novel method for ideal point estimation from text. I leverage large language models to classify documents based on their expressed stances and extract survey-like data. I then use item response…
This article deals with the analysis of high dimensional data that come from multiple sources (experiments) and thus have different possibly correlated responses, but share the same set of predictors. The measurements of the predictors may…
Usually, opinion formation models assume that individuals have an opinion about a given topic which can change due to interactions with others. However, individuals can have different opinions in different topics and therefore n-dimensional…
Conjoint analysis, an application of factorial experimental design, is a popular tool in social science research for studying multidimensional preferences. In such political analysis experiments, respondents are often asked to choose…
Most Machine Learning (ML) methods, from clustering to classification, rely on a distance function to describe relationships between datapoints. For complex datasets it is hard to avoid making some arbitrary choices when defining a distance…
Multidimensional scaling (MDS) is a dimensionality reduction tool used for information analysis, data visualization and manifold learning. Most MDS procedures embed data points in low-dimensional Euclidean (flat) domains, such that…
Multimodal sentiment analysis has gained significant attention due to the proliferation of multimodal content on social media. However, existing studies in this area rely heavily on large-scale supervised data, which is time-consuming and…
The input data features set for many data driven tasks is high-dimensional while the intrinsic dimension of the data is low. Data analysis methods aim to uncover the underlying low dimensional structure imposed by the low dimensional hidden…
This paper is concerned with the probability of consensus in a multivariate, spatially explicit version of the Hegselmann-Krause model for the dynamics of opinions. Individuals are located on the vertices of a finite connected graph…
Class imbalance and distributional differences in large datasets present significant challenges for classification tasks machine learning, often leading to biased models and poor predictive performance for minority classes. This work…
Multidimensional scaling (MDS) is a popular technique for mapping a finite metric space into a low-dimensional Euclidean space in a way that best preserves pairwise distances. We study a notion of MDS on infinite metric measure spaces,…
A plethora of dimension reduction methods have been developed to visualize high-dimensional data in low dimensions. However, different dimension reduction methods often output different and possibly conflicting visualizations of the same…
People's opinions on a wide range of topics often evolve over time through their interactions with others. Models of opinion dynamics primarily focus on one-dimensional opinions, which represent opinions on one topic. However, opinions on…
This article is motivated by the objective of providing a new analytically tractable and fully frequentist framework to characterize and implement regression trees while also allowing a multivariate (potentially high dimensional) response.…
This paper reports on the state-of-the-art in application of multidimensional scaling (MDS) techniques to create semantic maps in linguistic research. MDS refers to a statistical technique that represents objects (lexical items, linguistic…
Mutual information (MI) is a fundamental measure of statistical dependence between two variables, yet accurate estimation from finite data remains notoriously difficult. No estimator is universally reliable, and common approaches fail in…
A ranking is an ordered sequence of items, in which an item with higher ranking score is more preferred than the items with lower ranking scores. In many information systems, rankings are widely used to represent the preferences over a set…
For distributed estimations in a sensor network, the consistency and accuracy of an estimator are greatly affected by the unknown correlations between individual estimates. An inconsistent or too conservative estimate may degrade the…
The evaluation of Indoor Positioning Systems (IPS) mostly relies on local deployments in the researchers' or partners' facilities. The complexity of preparing comprehensive experiments, collecting data, and considering multiple scenarios…
An energy efficient use of large scale sensor networks necessitates activating a subset of possible sensors for estimation at a fusion center. The problem is inherently combinatorial; to this end, a set of iterative, randomized algorithms…