Related papers: PerfXplain: Debugging MapReduce Job Performance

200 papers

Explainability of a classification model is crucial when deployed in real-world decision support systems. Explanations make predictions actionable to the user and should inform about the capabilities and limitations of the system. Existing…

Machine Learning · Computer Science 2022-12-13 Erwin Walraven, Ajaya Adhikari, Cor J. Veenman

We present GenEx, a generative model to explain search results to users beyond just showing matches between query and document words. Adding GenEx explanations to search results greatly impacts user satisfaction and search performance.…

Information Retrieval · Computer Science 2021-11-03 Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan

Hadoop MapReduce is now a popular choice for performing large-scale data analytics. This technical report describes a detailed set of mathematical performance models for describing the execution of a MapReduce job on Hadoop. The models…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-06-07 Herodotos Herodotou

Understanding and predicting the performance of big data applications running in the cloud or on-premises could help minimise the overall cost of operations and provide opportunities in efforts to identify performance bottlenecks. The…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-26 Sheriffo Ceesay, Adam Barker, Yuhui Lin

Retrieval-Augmented Generation (RAG) systems couple large language models with external knowledge, yet most evaluation methods report aggregate scores that reveal whether a pipeline underperforms but not where or why. We introduce…

Information Retrieval · Computer Science 2026-03-19 Dvir Cohen, Tamir Houri, Lin Burg, Gilad Barkan

In hybrid transactional and analytical processing (HTAP) systems, users often struggle to understand why query plans from one engine (OLAP or OLTP) perform significantly slower than those from another. Although optimizers provide plan…

Databases · Computer Science 2024-12-03 Haibo Xiu, Li Zhang, Tieying Zhang, Jun Yang, Jianjun Chen

The advancement of Large Language Model (LLM)-powered agents has enabled automated task processing through reasoning and tool invocation capabilities. However, existing frameworks often operate under the idealized assumption that tool…

Artificial Intelligence · Computer Science 2026-03-06 Zhipeng Chen, Zhongrui Zhang, Chao Zhang, Yifan Xu, Lan Yang, Jun Liu, Ke Li, Yi-Zhe Song

Explanations of an AI's function can assist human decision-makers, but the most useful explanation depends on the decision's context, referred to as the downstream task. User studies are necessary to determine the best explanations for each…

Human-Computer Interaction · Computer Science 2024-09-20 Eura Nofshin, Esther Brown, Brian Lim, Weiwei Pan, Finale Doshi-Velez

Data-driven optimization uses contextual information and machine learning algorithms to find solutions to decision problems with uncertain parameters. While a vast body of work is dedicated to interpreting machine learning models in the…

Machine Learning · Computer Science 2023-07-21 Alexandre Forel, Axel Parmentier, Thibaut Vidal

Distributed computing frameworks such as MapReduce are often used to process large computational jobs. They operate by partitioning each job into smaller tasks executed on different servers. The servers also need to exchange intermediate…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-20 Konstantinos Konstantinidis, Aditya Ramamoorthy

A class of explainable NLP models for reasoning tasks support their decisions by generating free-form or structured explanations, but what happens when these supporting structures contain errors? Our goal is to allow users to interactively…

Computation and Language · Computer Science 2021-04-20 Aman Madaan, Niket Tandon, Dheeraj Rajagopal, Yiming Yang, Peter Clark, Keisuke Sakaguchi, Ed Hovy

Large language models (LLMs) can often generate functionally correct code, but their ability to produce efficient implementations for performance-critical systems tasks remains limited. Existing code benchmarks mainly emphasize correctness…

Software Engineering · Computer Science 2026-05-18 Huihao Jing, Wenbin Hu, Haochen Shi, Hanyu Yang, Sirui Zhang, Shaojin Chen, Haoran Li, Yangqiu Song

Large language models (LLMs) have achieved remarkable progress in automatic code generation, yet their ability to produce high-performance code remains limited--a critical requirement in real-world software systems. We argue that current…

Software Engineering · Computer Science 2026-05-11 Jiuding Yang, Shengyao Lu, Hongxuan Liu, Shayan Shirahmad Gale Bagi, Zahra Fazel, Tomasz Czajkowski, Di Niu

In this work, we investigate whether small language models can determine high-quality subsets of large-scale text datasets that improve the performance of larger language models. While existing work has shown that pruning based on the…

Machine Learning · Computer Science 2024-06-03 Zachary Ankner, Cody Blakeney, Kartik Sreenivasan, Max Marion, Matthew L. Leavitt, Mansheej Paul

Responsible use of machine learning requires models to be audited for undesirable properties. While a body of work has proposed using explanations for auditing, how to do so and why has remained relatively ill-understood. This work…

Machine Learning · Computer Science 2023-06-06 Chhavi Yadav, Michal Moshkovitz, Kamalika Chaudhuri

Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and…

Machine Learning · Computer Science 2014-08-12 Yucheng Low, Joseph E. Gonzalez, Aapo Kyrola, Danny Bickson, Carlos E. Guestrin, Joseph Hellerstein

Inferring causal relations in timeseries data with delayed effects is a fundamental challenge, especially when the underlying system exhibits complex dynamics that cannot be captured by simple functional mappings. Traditional approaches…

Machine Learning · Computer Science 2026-02-23 Preetom Biswas, Giulia Pedrielli, K. Selçuk Candan

Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and…

Machine Learning · Computer Science 2010-06-28 Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein

Aggregated time series are generated effortlessly everywhere, e.g., "total confirmed covid-19 cases since 2019" and "total liquor sales over time." Understanding "how" and "why" these key performance indicators (KPI) evolve over time is…

Databases · Computer Science 2022-11-22 Yiru Chen, Silu Huang

Despite the increasing use of large language models (LLMs) for context-grounded tasks like summarization and question-answering, understanding what makes an LLM produce a certain response is challenging. We propose Multi-Level Explanations…