Related papers: Difference-Huffman Coding of Multidimensional Data…

Difference Sequence Compression of Multidimensional Databases

The multidimensional databases often use compression techniques in order to decrease the size of the database. This paper introduces a new method called difference sequence compression. Under some conditions, this new technique is able to…

Databases · Computer Science 2011-04-28 István Szépkúti

Caching in Multidimensional Databases

One utilisation of multidimensional databases is the field of On-line Analytical Processing (OLAP). The applications in this area are designed to make the analysis of shared multidimensional information fast [9]. On one hand, speed can be…

Databases · Computer Science 2011-05-04 István Szépkúti

Deep Collaborative Discrete Hashing with Semantic-Invariant Structure

Existing deep hashing approaches fail to fully explore semantic correlations and neglect the effect of linguistic context on visual attention learning, leading to inferior performance. This paper proposes a dual-stream learning framework,…

Information Retrieval · Computer Science 2019-11-06 Zijian Wang, Zheng Zhang, Yadan Luo, Zi Huang

Improved Search in Hamming Space using Deep Multi-Index Hashing

Similarity-preserving hashing is a widely-used method for nearest neighbour search in large-scale image retrieval tasks. There has been considerable research on generating efficient image representation via the deep-network-based hashing…

Computer Vision and Pattern Recognition · Computer Science 2017-10-20 Hanjiang Lai, Yan Pan

Dual Purpose Hashing

Recent years have seen more and more demand for a unified framework to address multiple realistic image retrieval tasks concerning both category and attributes. Considering the scale of modern datasets, hashing is favorable for its low…

Computer Vision and Pattern Recognition · Computer Science 2016-07-20 Haomiao Liu, Ruiping Wang, Shiguang Shan, Xilin Chen

Multiple Code Hashing for Efficient Image Retrieval

Due to its low storage cost and fast query speed, hashing has been widely used in large-scale image retrieval tasks. Hash bucket search returns data points within a given Hamming radius to each query, which can enable search at a constant…

Machine Learning · Computer Science 2024-05-07 Ming-Wei Li, Qing-Yuan Jiang, Wu-Jun Li

Bilateral Distribution Compression: Reducing Both Data Size and Dimensionality

Existing distribution compression methods reduce the number of observations in a dataset by minimising the Maximum Mean Discrepancy (MMD) between original and compressed sets, but modern datasets are often large in both sample size and…

Machine Learning · Statistics 2026-01-28 Dominic Broadbent, Nick Whiteley, Robert Allison, Tom Lovett

Benefiting from Duplicates of Compressed Data: Shift-Based Holographic Compression of Images

Storage systems often rely on multiple copies of the same compressed data, enabling recovery in case of binary data errors, of course, at the expense of a higher storage cost. In this paper we show that a wiser method of duplication entails…

Multimedia · Computer Science 2019-02-08 Yehuda Dar, Alfred M. Bruckstein

Huff-DP: Huffman Coding based Differential Privacy Mechanism for Real-Time Data

With the advancements in connected devices, a huge amount of real-time data is being generated. Efficient storage, transmission, and analysation of this real-time big data is important, as it serves a number of purposes ranging from…

Cryptography and Security · Computer Science 2023-01-26 Muneeb Ul Hassan, Mubashir Husain Rehmani, Jinjun Chen

Accelerating Lossless Data Compression with GPUs

Huffman compression is a statistical, lossless, data compression algorithm that compresses data by assigning variable length codes to symbols, with the more frequently appearing symbols given shorter codes than the less. This work is a…

Information Theory · Computer Science 2011-07-11 R. L. Cloud, M. L. Curry, H. L. Ward, A. Skjellum, P. Bangalore

Separate Chaining Meets Compact Hashing

While separate chaining is a common strategy for resolving collisions in a hash table taught in most textbooks, compact hashing is a less common technique for saving space when hashing integers whose domain is relatively small with respect…

Data Structures and Algorithms · Computer Science 2019-05-02 Dominik Köppl

Homomorphic Parameter Compression for Distributed Deep Learning Training

Distributed training of deep neural networks has received significant research interest, and its major approaches include implementations on multiple GPUs and clusters. Parallelization can dramatically improve the efficiency of training…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-29 Jaehee Jang, Byungook Na, Sungroh Yoon

Deep Cross-Modal Hashing

Due to its low storage cost and fast query speed, cross-modal hashing (CMH) has been widely used for similarity search in multimedia retrieval applications. However, almost all existing CMH methods are based on hand-crafted features which…

Information Retrieval · Computer Science 2016-02-16 Qing-Yuan Jiang, Wu-Jun Li

Implicit Neural Multiple Description for DNA-based data storage

DNA exhibits remarkable potential as a data storage solution due to its impressive storage density and long-term stability, stemming from its inherent biomolecular structure. However, developing this novel medium comes with its own set of…

Image and Video Processing · Electrical Eng. & Systems 2023-09-14 Trung Hieu Le, Xavier Pic, Jeremy Mateos, Marc Antonini

Multimodal diff-hash

Many applications require comparing multimodal data with different structure and dimensionality that cannot be compared directly. Recently, there has been increasing interest in methods for learning and efficiently representing such…

Computer Vision and Pattern Recognition · Computer Science 2011-11-08 Michael M. Bronstein

Embedding Feature Selection for Large-scale Hierarchical Classification

Large-scale Hierarchical Classification (HC) involves datasets consisting of thousands of classes and millions of training instances with high-dimensional features posing several big data challenges. Feature selection that aims to select…

Machine Learning · Computer Science 2017-06-07 Azad Naik , Huzefa Rangwala

MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval

Hashing has recently sparked a great revolution in cross-modal retrieval because of its low storage cost and high query speed. Recent cross-modal hashing methods often learn unified or equal-length hash codes to represent the multi-modal…

Computer Vision and Pattern Recognition · Computer Science 2019-09-13 Xin Liu, Zhikai Hu, Haibin Ling, Yiu-ming Cheung

Learning to Collide: Recommendation System Model Compression with Learned Hash Functions

A key characteristic of deep recommendation models is the immense memory requirements of their embedding tables. These embedding tables can often reach hundreds of gigabytes which increases hardware requirements and training cost. A common…

Information Retrieval · Computer Science 2022-03-31 Benjamin Ghaemmaghami, Mustafa Ozdal, Rakesh Komuravelli, Dmitriy Korchev, Dheevatsa Mudigere, Krishnakumar Nair, Maxim Naumov

Compression Aware Physical Database Design

Modern RDBMSs support the ability to compress data using methods such as null suppression and dictionary encoding. Data compression offers the promise of significantly reducing storage requirements and improving I/O performance for decision…

Databases · Computer Science 2011-09-06 Hideaki Kimura, Vivek Narasayya, Manoj Syamala

Application-Level Differential Checkpointing for HPC Applications with Dynamic Datasets

High-performance computing (HPC) requires resilience techniques such as checkpointing in order to tolerate failures in supercomputers. As the number of nodes and memory in supercomputers keeps on increasing, the size of checkpoint data also…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-13 Kai Keller, Leonardo Bautista Gomez