Michael Mathioudakis

Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke

While digitized corpora have transformed the study of intellectual transmission, current methods rely heavily on lexical text reuse detection, capturing verbatim quotations but fundamentally missing paraphrases and complex implicit…

Computation and Language · Computer Science 2026-05-13 Yu Wu, Ananth Mahadevan, Filip Ginter, Michael Mathioudakis, Mikko Tolonen

WaZI: A Learned and Workload-aware Z-Index

Learned indexes fit machine learning (ML) models to the data and use them to make query operations more time and space-efficient. Recent works propose using learned spatial indexes to improve spatial query performance by optimizing the…

Databases · Computer Science 2024-03-21 Sachith Pai, Michael Mathioudakis, Yanhao Wang

Optimizing a Data Science System for Text Reuse Analysis

Text reuse is a methodological element of fundamental importance in humanities research: pieces of text that re-appear across different documents, verbatim or paraphrased, provide invaluable information about the historical spread and…

Databases · Computer Science 2024-01-17 Ananth Mahadevan, Michael Mathioudakis, Eetu Mäkelä, Mikko Tolonen

Cost-Effective Retraining of Machine Learning Models

It is important to retrain a machine learning (ML) model in order to maintain its performance as the data changes over time. However, this can be costly as it usually requires processing the entire dataset again. This creates a trade-off…

Machine Learning · Computer Science 2023-10-09 Ananth Mahadevan, Michael Mathioudakis

Max-Min Diversification with Fairness Constraints: Exact and Approximation Algorithms

Diversity maximization aims to select a diverse and representative subset of items from a large dataset. It is a fundamental optimization task that finds applications in data summarization, feature selection, web search, recommender…

Data Structures and Algorithms · Computer Science 2023-04-27 Yanhao Wang, Michael Mathioudakis, Jia Li, Francesco Fabbri

Streaming Algorithms for Diversity Maximization with Fairness Constraints

Diversity maximization is a fundamental problem with wide applications in data summarization, web search, and recommender systems. Given a set $X$ of $n$ elements, it asks to select a subset $S$ of $k \ll n$ elements with maximum…

Data Structures and Algorithms · Computer Science 2023-04-27 Yanhao Wang, Francesco Fabbri, Michael Mathioudakis

Rewiring What-to-Watch-Next Recommendations to Reduce Radicalization Pathways

Recommender systems typically suggest to users content similar to what they consumed in the past. If a user happens to be exposed to strongly polarized content, she might subsequently receive recommendations which may steer her towards more…

Computers and Society · Computer Science 2023-04-27 Francesco Fabbri, Yanhao Wang, Francesco Bonchi, Carlos Castillo, Michael Mathioudakis

Graph Summarization via Node Grouping: A Spectral Algorithm

Graph summarization via node grouping is a popular method to build concise graph representations by grouping nodes from the original graph into supernodes and encoding edges into superedges such that the loss of adjacency information is…

Social and Information Networks · Computer Science 2022-11-09 Arpit Merchant, Michael Mathioudakis, Yanhao Wang

Workload-Aware Materialization of Junction Trees

Bayesian networks are popular probabilistic models that capture the conditional dependencies among a set of variables. Inference in Bayesian networks is a fundamental task for answering probabilistic queries over a subset of variables in…

Databases · Computer Science 2021-10-08 Martino Ciaperoni, Cigdem Aslay, Aristides Gionis, Michael Mathioudakis

Joint Use of Node Attributes and Proximity for Semi-Supervised Classification on Graphs

The task of node classification is to infer unknown node labels, given the labels for some of the nodes along with the network structure and other node attributes. Typically, approaches for this task assume homophily, whereby neighboring…

Social and Information Networks · Computer Science 2021-09-15 Arpit Merchant, Michael Mathioudakis

Certifiable Machine Unlearning for Linear Models

Machine unlearning is the task of updating machine learning (ML) models after a subset of the training data they were trained on is deleted. Methods for the task are desired to combine effectiveness and efficiency, i.e., they should…

Machine Learning · Computer Science 2021-08-17 Ananth Mahadevan, Michael Mathioudakis

Affirmative Action Policies for Top-k Candidates Selection, With an Application to the Design of Policies for University Admissions

We consider the problem of designing affirmative action policies for selecting the top-k candidates from a pool of applicants. We assume that for each candidate we have socio-demographic attributes and a series of variables that serve as…

Computers and Society · Computer Science 2021-03-10 Michael Mathioudakis, Carlos Castillo, Giorgio Barnabo, Sergio Celis

Intersectional Affirmative Action Policies for Top-k Candidates Selection

We study the problem of selecting the top-k candidates from a pool of applicants, where each candidate is associated with a score indicating his/her aptitude. Depending on the specific scenario, such as job search or college admissions,…

Computers and Society · Computer Science 2021-03-08 Giorgio Barnabo', Carlos Castillo, Michael Mathioudakis, Sergio Celis

Fair and Representative Subset Selection from Data Streams

We study the problem of extracting a small subset of representative items from a large data stream. In many data mining and machine learning applications such as social network analysis and recommender systems, this problem can be…

Data Structures and Algorithms · Computer Science 2021-02-15 Yanhao Wang, Francesco Fabbri, Michael Mathioudakis

Query the model: precomputations for efficient inference with Bayesian Networks

Variable Elimination is a fundamental algorithm for probabilistic inference over Bayesian networks. In this paper, we propose a novel materialization method for Variable Elimination, which can lead to significant efficiency gains when…

Databases · Computer Science 2021-01-29 Cigdem Aslay, Martino Ciaperoni, Aristides Gionis, Michael Mathioudakis

GRMR: Generalized Regret-Minimizing Representatives

Extracting a small subset of representative tuples from a large database is an important task in multi-criteria decision making. The regret-minimizing set (RMS) problem was recently proposed for representative discovery from databases.…

Data Structures and Algorithms · Computer Science 2020-07-21 Yanhao Wang, Michael Mathioudakis, Yuchen Li, Kian-Lee Tan

Towards Data-Driven Affirmative Action Policies under Uncertainty

In this paper, we study university admissions under a centralized system that uses grades and standardized test scores to match applicants to university programs. We consider affirmative action policies that seek to increase the number of…

Computers and Society · Computer Science 2020-07-03 Corinna Hertweck, Carlos Castillo, Michael Mathioudakis

Markov Chain Monitoring

In networking applications, one often wishes to obtain estimates about the number of objects at different parts of the network (e.g., the number of cars at an intersection of a road network or the number of packets expected to reach a node…

Social and Information Networks · Computer Science 2020-06-22 Harshal A. Chaudhari, Michael Mathioudakis, Evimaria Terzi

Reducing Controversy by Connecting Opposing Views

Society is often polarized by controversial issues that split the population into groups of opposing views. When such issues emerge on social media, we often observe the creation of 'echo chambers', i.e., situations where like-minded…

Social and Information Networks · Computer Science 2018-05-25 Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, Michael Mathioudakis

Political Discourse on Social Media: Echo Chambers, Gatekeepers, and the Price of Bipartisanship

Echo chambers, i.e., situations where one is exposed only to opinions that agree with their own, are an increasing concern for the political discourse in many democratic countries. This paper studies the phenomenon of political echo…

Social and Information Networks · Computer Science 2018-02-20 Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, Michael Mathioudakis