Ting-Rui Chiang — Scifaro

Pelican Soup Framework: A Theoretical Framework for Language Model Capabilities

In this work, we propose a simple theoretical framework, Pelican Soup, aiming to better understand how pretraining allows LLMs to (1) generalize to unseen instructions and (2) perform in-context learning, even when the verbalizers are…

Computation and Language · Computer Science 2026-01-09 Ting-Rui Chiang, Dani Yogatama

The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval

The Rotary Position Embedding (RoPE) is widely used in the attention heads of many large language models (LLM). It rotates dimensions in the query and the key vectors by different angles according to their positions in the input sequence.…

Computation and Language · Computer Science 2025-02-18 Ting-Rui Chiang, Dani Yogatama

LocateBench: Evaluating the Locating Ability of Vision Language Models

The ability to locate an object in an image according to natural language instructions is crucial for many real-world applications. In this work we propose LocateBench, a high-quality benchmark dedicated to evaluating this ability. We…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Ting-Rui Chiang, Joshua Robinson, Xinyan Velocity Yu, Dani Yogatama

On Retrieval Augmentation and the Limitations of Language Model Training

Augmenting a language model (LM) with $k$-nearest neighbors ($k$NN) retrieval on its training data alone can decrease its perplexity, though the underlying reasons for this remain elusive. In this work, we rule out one previously posited…

Computation and Language · Computer Science 2024-04-03 Ting-Rui Chiang, Xinyan Velocity Yu, Joshua Robinson, Ollie Liu, Isabelle Lee, Dani Yogatama

The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining

We analyze the masked language modeling pretraining objective function from the perspective of the distributional hypothesis. We investigate whether better sample efficiency and the better generalization capability of models pretrained with…

Computation and Language · Computer Science 2023-10-26 Ting-Rui Chiang, Dani Yogatama

DialCrowd 2.0: A Quality-Focused Dialog System Crowdsourcing Toolkit

Dialog system developers need high-quality data to train, fine-tune and assess their systems. They often use crowdsourcing for this since it provides large quantities of data from many workers. However, the data may not be of sufficiently…

Computation and Language · Computer Science 2022-07-27 Jessica Huynh, Ting-Rui Chiang, Jeffrey Bigham, Maxine Eskenazi

Breaking Down Multilingual Machine Translation

While multilingual training is now an essential ingredient in machine translation (MT) systems, recent work has demonstrated that it has different effects in different multilingual settings, such as many-to-one, one-to-many, and…

Computation and Language · Computer Science 2022-04-06 Ting-Rui Chiang, Yi-Pei Chen, Yi-Ting Yeh, Graham Neubig

Improving Dialogue State Tracking by Joint Slot Modeling

Dialogue state tracking models play an important role in a task-oriented dialogue system. However, most of them model the slot types conditionally independently given the input. We discover that it may cause the model to be confused by slot…

Computation and Language · Computer Science 2021-11-16 Ting-Rui Chiang, Yi-Ting Yeh

Are you doing what I say? On modalities alignment in ALFRED

ALFRED is a recently proposed benchmark that requires a model to complete tasks in simulated house environments specified by instructions in natural language. We hypothesize that key to success is accurately aligning the text modality with…

Computation and Language · Computer Science 2021-10-13 Ting-Rui Chiang, Yi-Ting Yeh, Ta-Chung Chi, Yau-Shian Wang

On a Benefit of Mask Language Modeling: Robustness to Simplicity Bias

Despite the success of pretrained masked language models (MLM), why MLM pretraining is useful is still a question not fully answered. In this work we theoretically and empirically show that MLM pretraining makes models robust to…

Computation and Language · Computer Science 2021-10-12 Ting-Rui Chiang

Relating Neural Text Degeneration to Exposure Bias

This work focuses on relating two mysteries in neural-based text generation: exposure bias, and text degeneration. Despite the long time since exposure bias was mentioned and the numerous studies for its remedy, to our knowledge, its impact…

Computation and Language · Computer Science 2021-09-21 Ting-Rui Chiang, Yun-Nung Chen

Why Can You Lay Off Heads? Investigating How BERT Heads Transfer

The huge size of the widely used BERT family models has led to recent efforts about model distillation. The main goal of distillation is to create a task-agnostic pre-trained model that can be fine-tuned on downstream tasks without…

Computation and Language · Computer Science 2021-06-15 Ting-Rui Chiang, Yun-Nung Chen

An Empirical Study of Content Understanding in Conversational Question Answering

With a lot of work about context-free question answering systems, there is an emerging trend of conversational question answering models in the natural language processing field. Thanks to the recently collected datasets, including QuAC and…

Computation and Language · Computer Science 2019-11-28 Ting-Rui Chiang, Hao-Tong Ye, Yun-Nung Chen

Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems

Solving math word problems is a challenging task that requires accurate natural language understanding to bridge natural language texts and math expressions. Motivated by the intuition about how humans generate the equations given the…

Computation and Language · Computer Science 2019-06-11 Ting-Rui Chiang, Yun-Nung Chen

Learning Multi-Level Information for Dialogue Response Selection by Highway Recurrent Transformer

With the increasing research interest in dialogue response generation, there is an emerging branch formulating this task as selecting next sentences, where given the partial dialogue contexts, the goal is to determine the most probable next…

Computation and Language · Computer Science 2019-03-22 Ting-Rui Chiang, Chao-Wei Huang, Shang-Yu Su, Yun-Nung Chen

RAP-Net: Recurrent Attention Pooling Networks for Dialogue Response Selection

The response selection has been an emerging research topic due to the growing interest in dialogue modeling, where the goal of the task is to select an appropriate response for continuing dialogues. To further push the end-to-end dialogue…

Computation and Language · Computer Science 2019-03-22 Chao-Wei Huang, Ting-Rui Chiang, Shang-Yu Su, Yun-Nung Chen