Related papers: Hierarchical Transformer for Task Oriented Dialog …

Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models

Tokenization is a fundamental step in natural language processing, breaking text into units that computational models can process. While learned subword tokenizers have become the de-facto standard, they present challenges such as large…

Computation and Language · Computer Science 2025-01-22 Pit Neitemeier, Björn Deiseroth, Constantin Eichenberg, Lukas Balles

Hierarchical Sequence to Sequence Voice Conversion with Limited Data

We present a voice conversion solution using recurrent sequence to sequence modeling for DNNs. Our solution takes advantage of recent advances in attention based modeling in the fields of Neural Machine Translation (NMT), Text-to-Speech…

Audio and Speech Processing · Electrical Eng. & Systems 2019-07-19 Praveen Narayanan, Punarjay Chakravarty, Francois Charette, Gint Puskorius

Hierarchical Transformer Network for Utterance-level Emotion Recognition

While there have been significant advances in detecting emotions in text, in the field of utterance-level emotion recognition (ULER), there are still many problems to be solved. In this paper, we address some challenges in ULER in dialog…

Computation and Language · Computer Science 2020-02-19 QingBiao Li, ChunHua Wu, KangFeng Zheng, Zhe Wang

Augmenting Neural Response Generation with Context-Aware Topical Attention

Sequence-to-Sequence (Seq2Seq) models have witnessed a notable success in generating natural conversational exchanges. Notwithstanding the syntactically well-formed responses generated by these neural network models, they are prone to be…

Computation and Language · Computer Science 2019-06-05 Nouha Dziri, Ehsan Kamalloo, Kory W. Mathewson, Osmar Zaiane

Dialogue Transformers

We introduce a dialogue policy based on a transformer architecture, where the self-attention mechanism operates over the sequence of dialogue turns. Recent work has used hierarchical recurrent neural networks to encode multiple utterances…

Computation and Language · Computer Science 2020-05-04 Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol

Transformer with Tree-order Encoding for Neural Program Generation

While a considerable amount of semantic parsing approaches have employed RNN architectures for code generation tasks, there have been only few attempts to investigate the applicability of Transformers for this task. Including hierarchical…

Computation and Language · Computer Science 2022-06-28 Klaudia-Doris Thellmann, Bernhard Stadler, Ricardo Usbeck, Jens Lehmann

Hierarchical Transformers Are More Efficient Language Models

Transformer models yield impressive results on many NLP and sequence modeling tasks. Remarkably, Transformers can handle long sequences which allows them to produce long coherent outputs: full paragraphs produced by GPT-3 or well-structured…

Machine Learning · Computer Science 2022-04-19 Piotr Nawrot, Szymon Tworkowski, Michał Tyrolski, Łukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski

On Task-Level Dialogue Composition of Generative Transformer Model

Task-oriented dialogue systems help users accomplish tasks such as booking a movie ticket and ordering food via conversation. Generative models parameterized by a deep neural network are widely used for next turn response generation in such…

Computation and Language · Computer Science 2020-10-13 Prasanna Parthasarathi, Arvind Neelakantan, Sharan Narang

User Modeling for Task Oriented Dialogues

We introduce end-to-end neural network based models for simulating users of task-oriented dialogue systems. User simulation in dialogue systems is crucial from two different perspectives: (i) automatic evaluation of different dialogue…

Computation and Language · Computer Science 2018-11-13 Izzeddin Gur, Dilek Hakkani-Tur, Gokhan Tur, Pararth Shah

A Hierarchical Transformer for Unsupervised Parsing

The underlying structure of natural language is hierarchical; words combine into phrases, which in turn form clauses. An awareness of this hierarchical structure can aid machine learning models in performing many linguistic tasks. However,…

Machine Learning · Computer Science 2020-04-01 Ashok Thillaisundaram

Transformer-based Models of Text Normalization for Speech Applications

Text normalization, or the process of transforming text into a consistent, canonical form, is crucial for speech applications such as text-to-speech synthesis (TTS). In TTS, the system must decide whether to verbalize "1995" as "nineteen…

Machine Learning · Computer Science 2022-02-02 Jae Hun Ro, Felix Stahlberg, Ke Wu, Shankar Kumar

Injecting Hierarchy with U-Net Transformers

The Transformer architecture has become increasingly popular over the past two years, owing to its impressive performance on a number of natural language processing (NLP) tasks. However, all Transformer computations occur at the level of…

Machine Learning · Computer Science 2021-04-05 David Donahue, Vladislav Lialin, Anna Rumshisky

Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering

We introduce a novel approach to transformers that learns hierarchical representations in multiparty dialogue. First, three language modeling tasks are used to pre-train the transformers, token- and utterance-level language modeling and…

Computation and Language · Computer Science 2020-06-01 Changmao Li, Jinho D. Choi

Hierarchical Reasoning Models: Perspectives and Misconceptions

Transformers have demonstrated remarkable performance in natural language processing and related domains, as they largely focus on sequential, autoregressive next-token prediction tasks. Yet, they struggle in logical reasoning, not…

Artificial Intelligence · Computer Science 2025-10-08 Renee Ge, Qianli Liao, Tomaso Poggio

Hierarchical Resolution Transformers: A Wavelet-Inspired Architecture for Multi-Scale Language Understanding

Transformer architectures have achieved state-of-the-art performance across natural language tasks, yet they fundamentally misrepresent the hierarchical nature of human language by processing text as flat token sequences. This results in…

Computation and Language · Computer Science 2025-09-26 Ayan Sar, Sampurna Roy, Kanav Gupta, Anurag Kaushish, Tanupriya Choudhury, Abhijit Kumar

Hierarchical Learning for Generation with Long Source Sequences

One of the challenges for current sequence to sequence (seq2seq) models is processing long sequences, such as those in summarization and document level machine translation tasks. These tasks require the model to reason at the token level as…

Computation and Language · Computer Science 2021-09-20 Tobias Rohde, Xiaoxia Wu, Yinhan Liu

Do Encoder Representations of Generative Dialogue Models Encode Sufficient Information about the Task ?

Predicting the next utterance in dialogue is contingent on encoding of users' input text to generate appropriate and relevant response in data-driven approaches. Although the semantic and syntactic quality of the language generated is…

Computation and Language · Computer Science 2021-06-22 Prasanna Parthasarathi, Joelle Pineau, Sarath Chandar

Hierarchical Pre-training for Sequence Labelling in Spoken Dialog

Sequence labelling tasks like Dialog Act and Emotion/Sentiment identification are a key component of spoken dialog systems. In this work, we propose a new approach to learn generic representations adapted to spoken dialog, which we evaluate…

Computation and Language · Computer Science 2021-02-09 Emile Chapuis, Pierre Colombo, Matteo Manica, Matthieu Labeau, Chloe Clavel

Hierarchical Attention Transformer Architecture For Syntactic Spell Correction

The attention mechanisms are playing a boosting role in advancements in sequence-to-sequence problems. Transformer architecture achieved new state of the art results in machine translation, and its variants are since being introduced in…

Machine Learning · Computer Science 2020-05-12 Abhishek Niranjan, M Ali Basha Shaik, Kushal Verma

Context-Aware Sequence-to-Sequence Models for Conversational Systems

This work proposes a novel approach based on sequence-to-sequence (seq2seq) models for context-aware conversational systems. Existing seq2seq models have been shown to be good for generating natural responses in a data-driven…

Computation and Language · Computer Science 2018-05-23 Silje Christensen, Simen Johnsrud, Massimiliano Ruocco, Heri Ramampiaro