
Related papers: Correct after Answer: Enhancing Multi-Span Questio…

200 papers

Multiple-choice question answering (MCQA) is a key competence of performant transformer language models that is tested by mainstream benchmarks. However, recent evidence shows that models can have quite a range of performance, particularly…

Computation and Language · Computer Science 2025-03-11 Sarah Wiegreffe, Oyvind Tafjord, Yonatan Belinkov, Hannaneh Hajishirzi, Ashish Sabharwal

One of the most widely used tasks for evaluating Large Language Models (LLMs) is Multiple-Choice Question Answering (MCQA). While open-ended question answering tasks are more challenging to evaluate, MCQA tasks are, in principle, easier to…

Computation and Language · Computer Science 2025-06-10 Francesco Maria Molfese, Luca Moroni, Luca Gioffré, Alessandro Scirè, Simone Conia, Roberto Navigli

Prediction systems are successfully deployed in applications ranging from disease diagnosis, to predicting credit worthiness, to image recognition. Even when the overall accuracy is high, these systems may exhibit systematic biases that…

Machine Learning · Computer Science 2018-08-30 Michael P. Kim, Amirata Ghorbani, James Zou

Complex knowledge base question answering can be achieved by converting questions into sequences of predefined actions. However, there is a significant semantic and structural gap between natural language and action sequences, which makes…

Computation and Language · Computer Science 2022-12-27 Yechun Tang, Xiaoxia Cheng, Weiming Lu

Sequential recommender models typically generate predictions in a single step during testing, without considering additional prediction correction to enhance performance as humans would. To improve the accuracy of these models, some…

Information Retrieval · Computer Science 2023-04-28 Yulong Huang, Yang Zhang, Qifan Wang, Chenxu Wang, Fuli Feng

Deep neural networks (DNNs) have made great strides in pushing the state-of-the-art in several challenging domains. Recent studies reveal that they are prone to making overconfident predictions. This greatly reduces the overall trust in…

Computer Vision and Pattern Recognition · Computer Science 2023-09-07 Vinith Kugathasan, Muhammad Haris Khan

Models for reading comprehension (RC) commonly restrict their output space to the set of all single contiguous spans from the input, in order to alleviate the learning problem and avoid the need for a model that generates text explicitly.…

Computation and Language · Computer Science 2020-10-06 Elad Segal, Avia Efrat, Mor Shoham, Amir Globerson, Jonathan Berant

A trending paradigm for multiple-choice question answering (MCQA) is using a text-to-text framework. By unifying data in different tasks into a single text-to-text format, it trains a generative encoder-decoder model which is both powerful…

Computation and Language · Computer Science 2022-05-03 Zixian Huang, Ao Wu, Jiaying Zhou, Yu Gu, Yue Zhao, Gong Cheng

Uncertainty estimates must be calibrated (i.e., accurate) and sharp (i.e., informative) in order to be useful. This has motivated a variety of methods for recalibration, which use held-out data to turn an uncalibrated model into a…

Machine Learning · Computer Science 2022-07-06 Charles Marx, Shengjia Zhao, Willie Neiswanger, Stefano Ermon

Conversational Question Answering (ConvQA) models aim at answering a question with its relevant paragraph and previous question-answer pairs that occurred during conversation multiple times. To apply such models to a real-world scenario,…

Computation and Language · Computer Science 2023-02-13 Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, Jong C. Park

Answer validation in machine reading comprehension (MRC) consists of verifying an extracted answer against an input context and question pair. Previous work has looked at re-assessing the "answerability" of the question given the extracted…

Computation and Language · Computer Science 2020-11-09 Revanth Gangi Reddy, Md Arafat Sultan, Efsun Sarioglu Kayi, Rong Zhang, Vittorio Castelli, Avirup Sil

Multiple-choice question answering (MCQA) is easy to evaluate but adds a meta-task: models must both solve the problem and output the symbol that *represents* the answer, conflating reasoning errors with symbol-binding failures. We study…

Computation and Language · Computer Science 2026-01-08 Hugh Mee Wong, Rick Nouwen, Albert Gatt

Attention is typically used to select informative sub-phrases that are used for prediction. This paper investigates the novel use of attention as a form of feature augmentation, i.e., casted attention. We propose Multi-Cast Attention…

Computation and Language · Computer Science 2018-06-05 Yi Tay, Luu Anh Tuan, Siu Cheung Hui

Modern systems for multi-hop question answering (QA) typically break questions into a sequence of reasoning steps, termed chain-of-thought (CoT), before arriving at a final answer. Often, multiple chains are sampled and aggregated through a…

Computation and Language · Computer Science 2024-08-05 Ori Yoran, Tomer Wolfson, Ben Bogin, Uri Katz, Daniel Deutch, Jonathan Berant

Reasoning quality in large language models depends not only on producing correct answers but also on generating valid intermediate steps. We study this through multiple-choice question answering (MCQA), which provides a controlled setting…

Artificial Intelligence · Computer Science 2025-10-01 Raphael Schumann, Stefan Riezler

Textbook Question Answering (TQA) is a complex multimodal task to infer answers given large context descriptions and abundant diagrams. Compared with Visual Question Answering (VQA), TQA contains a large number of uncommon terminologies and…

Multimedia · Computer Science 2021-12-07 Fangzhi Xu, Qika Lin, Jun Liu, Lingling Zhang, Tianzhe Zhao, Qi Chai, Yudai Pan

Large Language Model based multi-agent systems (MAS) excel at collaborative problem solving but remain brittle to cascading errors: a single faulty step can propagate across agents and disrupt the trajectory. In this paper, we present MASC,…

The recent explosion of question answering (QA) datasets and models has increased the interest in the generalization of models across multiple domains and formats by either training on multiple datasets or by combining multiple models.…

Computation and Language · Computer Science 2023-02-08 Haritz Puerto, Gözde Gül Şahin, Iryna Gurevych

The task of multiple choice question answering (MCQA) refers to identifying a suitable answer from multiple candidates, by estimating the matching score among the triple of the passage, question and answer. Despite the general research…

Computation and Language · Computer Science 2021-09-28 Xun Yao, Junlong Ma, Xinrong Hu, Junping Liu, Jie Yang, Wanqing Li

When evaluating large language models (LLMs) with multiple-choice question answering (MCQA), it is common to end the prompt with the string "Answer:" to facilitate automated answer extraction via next-token probabilities. However, there is…

Computation and Language · Computer Science 2025-09-19 Mario Sanz-Guerrero, Minh Duc Bui, Katharina von der Wense