Related papers: DEEP: Docker-based Execution and Evaluation Platfo…

200 papers

With the rapid development of Large Language Models (LLMs), a large number of benchmarks have been proposed. However, most benchmarks lack a unified evaluation standard and require the manual implementation of custom scripts, making results…

With the success of deep learning techniques in a broad range of application domains, many deep learning software frameworks have been developed and are being updated frequently to adapt to new hardware features and software libraries,…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-10 Pengfei Xu, Shaohuai Shi, Xiaowen Chu

The DEEP projects have developed a variety of hardware and software technologies aiming at improving the efficiency and usability of next generation high-performance computers. They evolve around an innovative concept for heterogeneous…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-04-11 Anke Kreuzer, Jorge Amaya, Norbert Eicker, Estela Suarez

Measuring the confidence of AI models is critical for safely deploying AI in real-world industrial systems. One important application of confidence measurement is information extraction from scanned documents. However, there exists no…

Information Retrieval · Computer Science 2022-10-11 Bao-Sinh Nguyen, Quang-Bach Tran, Tuan-Anh Nguyen Dang, Duc Nguyen, Hung Le

Efficiently merging several models fine-tuned for different tasks, but stemming from the same pretrained base model, is of great practical interest. Despite extensive prior work, most evaluations of model merging in computer vision are…

Computer Vision and Pattern Recognition · Computer Science 2026-04-15 Pau de Jorge, César Roberto de Souza, Björn Michele, Mert Bülent Sarıyıldız, Philippe Weinzaepfel, Florent Perronnin, Diane Larlus, Yannis Kalantidis

Deep learning based recommendation systems form the backbone of most personalized cloud services. Though the computer architecture community has recently started to take notice of deep recommendation inference, the resulting solutions have…

Hardware Architecture · Computer Science 2020-10-13 Samuel Hsia, Udit Gupta, Mark Wilkening, Carole-Jean Wu, Gu-Yeon Wei, David Brooks

Embedders play a central role in machine learning, projecting any object into numerical representations that can, in turn, be leveraged to perform various downstream tasks. The evaluation of embedding models typically depends on…

Machine Learning · Computer Science 2024-11-19 Maxime Darrin, Philippe Formont, Ismail Ben Ayed, Jackie CK Cheung, Pablo Piantanida

Recent advances in large language models have enabled deep research systems that generate expert-level reports through multi-step reasoning and evidence-based synthesis. However, evaluating such reports remains challenging: report quality…

Computation and Language · Computer Science 2026-03-11 Janghoon Han, Heegyu Kim, Changho Lee, Dahm Lee, Min Hyung Park, Hosung Song, Stanley Jungkyu Choi, Moontae Lee, Honglak Lee

This paper shows that additional evaluation metrics are needed during model training to decide whether the model is applicable at inference. As an example, a LayoutLM-based model is trained for token classification in documents. The documents are…

Computer Vision and Pattern Recognition · Computer Science 2025-04-03 Anket Mehra, Malte Prieß, Marian Himstedt

The field of deep clustering combines deep learning and clustering to learn representations that improve both the learned representation and the performance of the considered clustering method. Most existing deep clustering methods are…

Machine Learning · Computer Science 2023-02-22 Lukas Miklautz, Martin Teuffenbach, Pascal Weber, Rona Perjuci, Walid Durani, Christian Böhm, Claudia Plant

Deep learning (DL) models have become core modules for many applications. However, deploying these models without careful performance benchmarking that considers both hardware and software's impact often leads to poor service and costly…

Machine Learning · Computer Science 2021-01-06 Huaizheng Zhang, Yizheng Huang, Yonggang Wen, Jianxiong Yin, Kyle Guan

Recent progress in deep research systems has been impressive, but evaluation still lags behind real user needs. Existing benchmarks predominantly assess final reports using fixed rubrics, failing to evaluate the underlying research process.…

We describe a system that simplifies the process of debugging programs produced by computer-aided parallelization tools. The system uses relative debugging techniques to compare serial and parallel executions in order to show where the…

Software Engineering · Computer Science 2007-05-23 Robert Hood, Gabriele Jost

The objective of many real-world tasks is complex and difficult to procedurally specify. This makes it necessary to use reward or imitation learning algorithms to infer a reward or policy directly from human data. Existing benchmarks for…

Machine Learning · Computer Science 2020-12-03 Pedro Freire, Adam Gleave, Sam Toyer, Stuart Russell

Clustering is a fundamental learning task widely used as a first step in data analysis. For example, biologists use cluster assignments to analyze genome sequences, medical records, or images. Since downstream analysis is typically…

Machine Learning · Computer Science 2024-06-11 Jonathan Svirsky, Ofir Lindenbaum

The evaluation of clustering algorithms can involve running them on a variety of benchmark problems, and comparing their outputs to the reference, ground-truth groupings provided by experts. Unfortunately, many research papers and graduate…

Machine Learning · Computer Science 2023-10-27 Marek Gagolewski

Rigorous and reproducible evaluation is critical for assessing the state of the art and for guiding scientific advances in Artificial Intelligence. Evaluation is challenging in practice due to several reasons, including benchmark…

Comparing and aligning large datasets is a pervasive problem occurring across many different knowledge domains. We introduce and study MREC, a recursive decomposition algorithm for computing matchings between data sets. The basic idea is to…

Machine Learning · Statistics 2020-02-24 Andrew J. Blumberg, Mathieu Carriere, Michael A. Mandell, Raul Rabadan, Soledad Villar

Image clustering is one of the most important computer vision applications and has been extensively studied in the literature. However, current clustering methods mostly suffer from a lack of efficiency and scalability when dealing with…

Machine Learning · Computer Science 2017-08-10 Kamran Ghasedi Dizaji, Amirhossein Herandi, Cheng Deng, Weidong Cai, Heng Huang

Process mining offers techniques to exploit event data by providing insights and recommendations to improve business processes. The growing number of algorithms for process discovery has raised the question of which algorithms perform best…

Software Engineering · Computer Science 2018-06-20 Toon Jouck, Alfredo Bolt, Benoît Depaire, Massimiliano de Leoni, Wil M. P. van der Aalst