Related papers: Stream VByte: Faster Byte-Oriented Integer Compres…

200 papers

We consider the ubiquitous technique of VByte compression, which represents each integer as a variable length sequence of bytes. The low 7 bits of each byte encode a portion of the integer, and the high bit of each byte is reserved as a…

Information Retrieval · Computer Science 2017-01-17 Jeff Plaisance, Nathan Kurz, Daniel Lemire
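
The byte layout described in the abstract above (7 payload bits per byte, with the high bit reserved) can be illustrated with a minimal Python sketch of classic VByte. This is not the Stream VByte format from the paper. The abstract is truncated before it says how the high bit is used, so the sketch assumes the common LEB128-style convention: a set high bit means more bytes follow, and the least significant 7-bit group comes first. The function names `vbyte_encode` and `vbyte_decode` are hypothetical.

```python
from typing import Tuple


def vbyte_encode(value: int) -> bytes:
    """Encode a non-negative integer as a variable-length byte sequence.

    Assumed convention: low 7 bits carry payload, a set high bit means
    "another byte follows", least significant group first.
    """
    if value < 0:
        raise ValueError("VByte encodes non-negative integers only")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # 7 payload bits + continuation flag
        value >>= 7
    out.append(value)  # final byte, high bit clear
    return bytes(out)


def vbyte_decode(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one integer starting at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:  # high bit clear: this was the last byte
            return value, pos
        shift += 7


encoded = vbyte_encode(300)
decoded, next_pos = vbyte_decode(encoded)
```

Under these assumptions, values below 128 take one byte and 300 takes two. The byte-at-a-time branch in the decoder is the kind of data-dependent control flow that SIMD-oriented variants aim to avoid.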

In many important applications -- such as search engines and relational database systems -- data is stored in the form of arrays of integers. Encoding and, most importantly, decoding of these arrays consumes considerable CPU time.…

Information Retrieval · Computer Science 2021-02-02 Daniel Lemire, Leonid Boytsov

The ubiquitous Variable-Byte encoding is one of the fastest compressed representations for integer sequences. However, its compression ratio is usually not competitive with other more sophisticated encoders, especially when the integers to…

Information Retrieval · Computer Science 2022-02-08 Giulio Ermanno Pibiri, Rossano Venturini

Compression algorithms are important for data-oriented tasks, especially in the era of Big Data. Modern processors equipped with powerful SIMD instruction sets provide an opportunity to achieve better compression performance.…

Information Retrieval · Computer Science 2015-04-15 Wayne Xin Zhao, Xudong Zhang, Daniel Lemire, Dongdong Shan, Jian-Yun Nie, Hongfei Yan, Ji-Rong Wen

Text retrieval using learned sparse representations of queries and documents has, over the years, evolved into a highly effective approach to search. It is thanks to recent advances in approximate nearest neighbor search, with the emergence…

Information Retrieval · Computer Science 2026-02-06 Sebastian Bruch, Martino Fontana, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini

Compression can sometimes improve performance by making more of the data available to the processors faster. We consider the compression of integer keys in a B+-tree index. For this purpose, systems such as IBM DB2 use variable-byte…

Databases · Computer Science 2017-01-18 Daniel Lemire, Christoph Rupp

Sorted lists of integers are commonly used in inverted indexes and database systems. They are often compressed in memory. We can use the SIMD instructions available in common processors to boost the speed of integer compression schemes. Our…

Information Retrieval · Computer Science 2020-04-22 Daniel Lemire, Leonid Boytsov, Nathan Kurz

Visual sensors serve as a critical component of the Internet of Things (IoT). There is an ever-increasing demand for broad applications and higher resolutions of videos and cameras in smart homes and smart cities, such as in security…

Image and Video Processing · Electrical Eng. & Systems 2021-03-30 Amir Fotovvat, Khan A. Wahid

Modern processors have instructions to process 16 bytes or more at once. These instructions are called SIMD, for single instruction, multiple data. Recent advances have leveraged SIMD instructions to accelerate parsing of common Internet…

Data Structures and Algorithms · Computer Science 2025-06-05 Daniel Lemire

The number of IoT devices is expected to continue its dramatic growth in the coming years and, with it, a growth in the amount of data to be transmitted, processed and stored. Compression techniques that support analytics directly on the…

Data Structures and Algorithms · Computer Science 2023-08-29 Francesco Taurone, Daniel E. Lucani, Marcell Fehér, Qi Zhang

The ubiquity of variable-length integers in data storage and communication necessitates efficient decoding techniques. In this paper, we present SFVInt, a simple and fast approach to decode the prevalent Little Endian Base-128 (LEB128)…

Databases · Computer Science 2024-06-10 Gang Liao, Ye Liu, Yonghua Ding, Le Cai, Jianjun Chen

Data is compressed by reducing its redundancy, but this also makes the data less reliable and more prone to errors. In this paper, a novel approach to image compression based on a new method that has been created for image compression which is…

Computer Vision and Pattern Recognition · Computer Science 2012-11-21 Firas A. Jassim, Hind E. Qassim

The majority of online content is written in languages other than English, and is most commonly encoded in UTF-8, the world's dominant Unicode character encoding. Traditional compression algorithms typically operate on individual bytes.…

Information Theory · Computer Science 2017-01-17 Adam Gleave, Christian Steinruecken

In the burgeoning realm of Internet of Things (IoT) applications on edge devices, data stream compression has become increasingly pertinent. The integration of added compression overhead and limited hardware resources on these devices calls…

Databases · Computer Science 2024-06-18 Xianzhi Zeng, Shuhao Zhang

Counting the number of ones in a binary stream is a common operation in database, information-retrieval, cryptographic and machine-learning applications. Most processors have dedicated instructions to count the number of ones in a word…

Data Structures and Algorithms · Computer Science 2018-09-07 Wojciech Muła, Nathan Kurz, Daniel Lemire

It was estimated that the world produced 59 ZB ($5.9 \times 10^{13}$ GB) of data in 2020, resulting in the enormous costs of both data storage and transmission. Fortunately, recent advances in deep generative models have spearheaded a new…

Machine Learning · Computer Science 2021-11-02 Shifeng Zhang, Ning Kang, Tom Ryder, Zhenguo Li

Learning-based 3D visual geometry models have benefited substantially from large-scale transformers. Among these, StreamVGGT leverages frame-wise causal attention for strong streaming reconstruction, but suffers from unbounded KV cache…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Zunhai Su, Weihao Ye, Hansen Feng, Keyu Fan, Jing Zhang, Dahai Yu, Zhengwu Liu, Ngai Wong

Compression algorithms reduce the redundancy in data representation to decrease the storage required for that data. Data compression offers an attractive approach to reducing communication costs by using available bandwidth effectively.…

Performance · Computer Science 2007-05-23 B. S. Shajeemohan, Dr. V. K. Govindan

By applying entropy codecs with learned data distributions, neural compressors have significantly outperformed traditional codecs in terms of compression ratio. However, the high inference latency of neural networks hinders the deployment…

Machine Learning · Computer Science 2022-06-20 Siyu Wang, Jianfei Chen, Chongxuan Li, Jun Zhu, Bo Zhang

In this paper, we will propose a new synchronous stream cipher named DICING, which can be viewed as a clock-controlled one but with a new mechanism of altering steps. It has satisfactory performance and there have not been found weakness…

Cryptography and Security · Computer Science 2009-03-21 An-Ping Li