Related papers: Higher Order Context Transformations
This paper proposes a novel entropy encoding technique for lossless data compression. Representing a message string by its lexicographic index in the permutations of its symbols results in a compressed version matching Shannon entropy of…
Combining the representations of the words that make up a sentence into a cohesive whole is difficult, since it needs to account for the order of words, and to establish how the words present relate to each other. The solution we propose…
We propose a method to improve traditional character-based PPM text compression algorithms. Consider a text file as a sequence of alternating words and non-words, the basic idea of our algorithm is to encode non-words and prefixes of words…
We propose a context-adaptive entropy model for use in end-to-end optimized image compression. Our model exploits two types of contexts, bit-consuming contexts and bit-free contexts, distinguished based upon whether additional bit…
Generalizing to longer sentences is important for recent Transformer-based language models. Besides algorithms manipulating explicit position features, the success of Transformers without position encodings (NoPE) provides a new way to…
While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited. In this paper, we present a novel data augmentation method for…
Most approaches to long-context processing increase the complexity of the transformer's internal architecture by integrating mechanisms such as recurrence or auxiliary memory modules. In this work, we introduce an alternative approach that…
**Context** Context-free grammars are widely used for language prototyping and implementation. They allow formalizing the syntax of domain-specific or general-purpose programming languages concisely and declaratively. However, the natural…
Interest in larger-context neural machine translation, including document-level and multi-modal translation, has been growing. Multiple works have proposed new network architectures or evaluation schemes, but potentially helpful context is…
Large language models have shown remarkable performance across a wide range of language tasks, owing to their exceptional capabilities in context modeling. The most commonly used method of context modeling is full self-attention, as seen in…
Semantic maps are increasingly utilized in areas such as robotics, autonomous systems, and extended reality, motivating the investigation of efficient compression methods that preserve structured semantic information. This paper studies…
Large language models (LLMs) have shown remarkable performance in various natural language processing tasks. However, a primary constraint they face is the context limit, i.e., the maximum number of tokens they can process. Previous works…
Context-aware neural machine translation aims to use the document-level context to improve translation quality. However, not all words in the context are helpful. The irrelevant or trivial words may bring some noise and distract the model…
Long-context modeling presents a significant challenge for transformer-based large language models (LLMs) due to the quadratic complexity of the self-attention mechanism and issues with length extrapolation caused by pretraining exclusively…
This thesis concerns sequential-access data compression, i.e., by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by…
Normalization is a pre-processing step that converts the data into a more usable representation. As part of the deep neural networks (DNNs), the batch normalization (BN) technique uses normalization to address the problem of internal…
Large Language Models (LLMs) based on Transformers excel at text processing, but their reliance on prompts for specialized behavior introduces computational overhead. We propose a modification to a Transformer architecture that eliminates…
Conditional entropy models effectively leverage spatio-temporal contexts to reduce video redundancy. However, incorporating temporal context often introduces additional model complexity and increases computational cost. In parallel, many…
Transformer-based Large Language Models (LLMs) often impose limitations on the length of the text input to ensure the generation of fluent and relevant responses. This constraint restricts their applicability in scenarios involving long…
This paper studies context bisimulation for higher-order processes, in the presence of parameterization (viz. abstraction). We show that the extension of higher-order processes with process parameterization retains the characterization of…