Related papers: Word-Based Text Compression

Making compression algorithms for Unicode text

The majority of online content is written in languages other than English, and is most commonly encoded in UTF-8, the world's dominant Unicode character encoding. Traditional compression algorithms typically operate on individual bytes.…

Information Theory · Computer Science 2017-01-17 Adam Gleave, Christian Steinruecken

A study for Image compression using Re-Pair algorithm

The compression is an important topic in computer science which allows we to storage more amount of data on our data storage. There are several techniques to compress any file. In this manuscript will be described the most important…

Multimedia · Computer Science 2019-02-14 Pasquale De Luca, Vincenzo Maria Russiello, Raffaele Ciro Sannino, Lorenzo Valente

Improving PPM Algorithm Using Dictionaries

We propose a method to improve traditional character-based PPM text compression algorithms. Consider a text file as a sequence of alternating words and non-words, the basic idea of our algorithm is to encode non-words and prefixes of words…

Information Theory · Computer Science 2015-03-17 Yichuan Hu, Jianzhong Zhang, Farooq Khan, Ying Li

AlphaZip: Neural Network-Enhanced Lossless Text Compression

Data compression continues to evolve, with traditional information theory methods being widely used for compressing text, images, and videos. Recently, there has been growing interest in leveraging Generative AI for predictive compression…

Information Theory · Computer Science 2024-09-24 Swathi Shree Narashiman, Nitin Chandrachoodan

Compression Algorithm Based on Irregular Sequence

The paper introduces a new lossless, highly robust compression algorithm that similar with LZW algorithm, yet the algorithm discards dictionary processing and uses irregular sequences with massive, random information instead. Then the paper…

Signal Processing · Electrical Eng. & Systems 2020-06-24 Rui Zhu

IDBE - An Intelligent Dictionary Based Encoding Algorithm for Text Data Compression for High Speed Data Transmission Over Internet

Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Data compression offers an attractive approach to reducing communication costs by using available bandwidth effectively.…

Information Theory · Computer Science 2007-07-13 B. S. Shajee Mohan, V. K. Govindan

A Novel Approach to Compress Centralized Text Data using Indexed Dictionary

Data compression is very important feature in terms of saving the memory space. In this proposal, an indexed dictionary based compression is used for text data, where the word's reference in dictionary is used for compression. This approach…

Other Computer Science · Computer Science 2015-12-23 Vivek Dimri, Prof. Ranjit Biswas

Bit-Optimal Lempel-Ziv compression

One of the most famous and investigated lossless data-compression scheme is the one introduced by Lempel and Ziv about 40 years ago. This compression scheme is known as "dictionary-based compression" and consists of squeezing an input…

Data Structures and Algorithms · Computer Science 2008-02-07 Paolo Ferragina, Igor Nitto, Rossano Venturini

Note on the Greedy Parsing Optimality for Dictionary-Based Text Compression

Dynamic dictionary-based compression schemes are the most daily used data compression schemes since they appeared in the foundational papers of Ziv and Lempel in 1977, commonly referred to as LZ77. Their work is the base of Deflate, gZip,…

Data Structures and Algorithms · Computer Science 2012-11-26 Maxime Crochemore, Alessio Langiu, Filippo Mignosi

Crossword: A Semantic Approach to Data Compression via Masking

The traditional methods for data compression are typically based on the symbol-level statistics, with the information source modeled as a long sequence of i.i.d. random variables or a stochastic process, thus establishing the fundamental…

Computation and Language · Computer Science 2023-04-04 Mingxiao Li, Rui Jin, Liyao Xiang, Kaiming Shen, Shuguang Cui

Compressing the Data Densely by New Geflochtener to Accelerate Web

At the present scenario of the internet, there exist many optimization techniques to improve the Web speed but almost expensive in terms of bandwidth. So after a long investigation on different techniques to compress the data without any…

Information Theory · Computer Science 2014-05-20 Hemant Kumar Saini, Satpal Singh Kushwaha, C. Rama Krishna

Enhancing Dictionary Based Preprocessing For Better Text Compression

With the rapid growing of data and number of applications, there is a crucial need of dictionary based reversible transformation techniques to increase the efficiency of the compression algorithms and hence contribute towards the…

Information Theory · Computer Science 2014-03-20 R. R. Baruah, V. Deka, M. P. Bhuyan

Bidirectional Text Compression in External Memory

Bidirectional compression algorithms work by substituting repeated substrings by references that, unlike in the famous LZ77-scheme, can point to either direction. We present such an algorithm that is particularly suited for an external…

Data Structures and Algorithms · Computer Science 2019-12-04 Patrick Dinklage, Jonas Ellert, Johannes Fischer, Dominik Köppl, Manuel Penschuck

Practical Random Access to SLP-Compressed Texts

Grammar-based compression is a popular and powerful approach to compressing repetitive texts but until recently its relatively poor time-space trade-offs during real-life construction made it impractical for truly massive datasets such as…

Data Structures and Algorithms · Computer Science 2020-07-21 Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Louisa Seelbach Benkner, Yoshimasa Takabatake

Restructuring Compressed Texts without Explicit Decompression

We consider the problem of restructuring compressed texts without explicit decompression. We present algorithms which allow conversions from compressed representations of a string $T$ produced by any grammar-based compression…

Data Structures and Algorithms · Computer Science 2011-07-15 Keisuke Goto, Shirou Maruyama, Shunsuke Inenaga, Hideo Bannai, Hiroshi Sakamoto, Masayuki Takeda

An Optimized Huffmans Coding by the method of Grouping

Data compression has become a necessity not only the in the field of communication but also in various scientific experiments. The data that is being received is more and the processing time required has also become more. A significant…

Information Theory · Computer Science 2016-07-29 Gautam R, S Murali

LZ-Compressed String Dictionaries

We show how to compress string dictionaries using the Lempel-Ziv (LZ78) data compression algorithm. Our approach is validated experimentally on dictionaries of up to 1.5 GB of uncompressed text. We achieve compression ratios often…

Data Structures and Algorithms · Computer Science 2013-05-06 Julian Arz, Johannes Fischer

An Efficient Technique for Text Compression

For storing a word or the whole text segment, we need a huge storage space. Typically a character requires 1 Byte for storing it in memory. Compression of the memory is very important for data management. In case of memory requirement…

Information Theory · Computer Science 2010-09-28 Md. Abul Kalam Azad, Rezwana Sharmeen, Shabbir Ahmad, S. M. Kamruzzaman

Applying Compression to a Game's Network Protocol

This report presents the results of applying different compression algorithms to the network protocol of an online game. The algorithm implementations compared are zlib, liblzma and my own implementation based on LZ77 and a variation of…

Information Theory · Computer Science 2012-06-13 Mikael Hirki

Relative Lempel-Ziv Factorization for Efficient Storage and Retrieval of Web Collections

Compression techniques that support fast random access are a core component of any information system. Current state-of-the-art methods group documents into fixed-sized blocks and compress each block with a general-purpose adaptive…

Data Structures and Algorithms · Computer Science 2015-03-19 Christopher Hoobin, Simon J. Puglisi, Justin Zobel