Related papers: Histogram-Aware Sorting for Enhanced Word-Aligned …
Bitmap indexes must be compressed to reduce input/output costs and minimize CPU usage. To accelerate logical operations (AND, OR, XOR) over bitmaps, we use techniques based on run-length encoding (RLE), such as Word-Aligned Hybrid (WAH)…
Bitmap indexes are frequently used to index multidimensional data. They rely mostly on sequential input/output. Bitmaps can be compressed to reduce input/output costs and minimize CPU usage. The most efficient compression techniques are…
Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes.…
Column-oriented indexes-such as projection or bitmap indexes-are compressed by run-length encoding to reduce storage and increase speed. Sorting the tables improves compression. On realistic data sets, permuting the columns in the right…
Bit arrays, or bitmaps, are used to significantly speed up set operations in several areas, such as data warehousing, information retrieval, and data mining, to cite a few. However, bitmaps usually use a large storage space, thus requiring…
Compressed bitmap indexes are used in databases and search engines. Many bitmap compression techniques have been proposed, almost all relying primarily on run-length encoding (RLE). However, on unsorted data, we can get superior performance…
Sorting database tables before compressing them improves the compression rate. Can we do better than the lexicographical order? For minimizing the number of runs in a run-length encoding compression scheme, the best approaches to…
In this paper, we will present p roposed enhance process of image compression by using RLE algorithm. This proposed yield to decrease the size of compressing image, but the original method used primarily for compressing a binary images…
To minimize data movement, state-of-the-art parallel sorting algorithms use techniques based on sampling and histogramming to partition keys prior to redistribution. Sampling enables partitioning to be done using a representative subset of…
This paper presents a histogram based reversible data hiding (RDH) scheme, which divides image pixels into different cell frequency bands to sort them for data embedding. Data hiding is more efficient in lower cell frequency bands because…
We propose a novel ray reordering technique to accelerate the ray tracing process by encoding and sorting rays prior to traversal. Instead of spatial coordinates, our method encodes rays according to the cuts of the hierarchical…
The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it…
The Run Length Encoding (RLE) compression method is a long standing simple lossless compression scheme which is easy to implement and achieves a good compression on input data which contains repeating consecutive symbols. In its pure form…
Large-alphabet strings are common in scenarios such as information retrieval and natural-language processing. The efficient storage and processing of such strings usually introduces several challenges that are not witnessed in…
The Burrows-Wheeler Transform (BWT) is a string transformation technique widely used in areas such as bioinformatics and file compression. Many applications combine a run-length encoding (RLE) with the BWT in a way which preserves the…
Compressed bitmap indexes are used to speed up simple aggregate queries in databases. Indeed, set operations like intersections, unions and complements can be represented as logical operations (AND,OR,NOT) that are ideally suited for…
In this paper we propose an index key compression scheme based on the notion of distinction bits by proving that the distinction bits of index keys are sufficient information to determine the sorted order of the index keys correctly. While…
Finding desired information from large data set is a difficult problem. Information retrieval is concerned with the structure, analysis, organization, storage, searching, and retrieval of information. Index is the main constituent of an IR…
We present memory-efficient and scalable algorithms for kernel methods used in machine learning. Using hierarchical matrix approximations for the kernel matrix the memory requirements, the number of floating point operations, and the…
In the recent years, heterogeneous machine learning accelerators have become of significant interest in science, engineering and industry. The major processing speed bottlenecks in these platforms come from (a) an electronic data…