Related papers: An efficient coding algorithm for general Framed P…
Following the trends of index modulated (IM) techniques for optical communications, in the last few years several new waveform proposals have been made, aiming at conveying a higher density of information by driving different signal…
Due to the speed limitation of the conventional bit-chosen strategy in the existing weighted bit flipping algorithms, a high-speed LDPC decoder cannot be realized. To solve this problem, we propose a fast weighted bit flipping (FWBF)…
Frame permutation quantization (FPQ) is a new vector quantization technique using finite frames. In FPQ, a vector is encoded using a permutation source code to quantize its frame expansion. This means that the encoding is a partial ordering…
This work proposes a mixed learning-based and optimization-based approach to the weighted-sum-rates beamforming problem in a multiple-input multiple-output (MIMO) wireless network. The conventional methods, i.e., the fractional programming…
State-of-the-art generic low-precision training algorithms use a mix of 16-bit and 32-bit precision, creating the folklore that 16-bit hardware compute units alone are not enough to maximize model accuracy. As a result, deep learning…
We propose a novel scheme that allows a MIMO system to modulate a set of permutation matrices to send more information bits, extending our initial work on the topic. This system is called Permutation Matrix Modulation (PMM). The basic idea is…
Neural network quantization is widely used to reduce model inference complexity in real-world deployments. However, traditional integer quantization suffers from accuracy degradation when adapting to various dynamic ranges. Recent research…
In spite of the great potential of large language models (LLMs) across various tasks, their deployment on resource-constrained devices remains challenging due to their excessive computational and memory demands. Quantization has emerged as…
In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs). Our key insight is that most variables, such as gradients and optimizer states, in LLM training can employ low-precision data formats…
In this paper, a novel fine timing algorithm has been developed and tested to synchronize Ultra-Wideband (UWB) signals with pulse position modulation (PPM). By applying this algorithm, we evaluate timing algorithms in both data-aided (DA)…
Selective Harmonic Elimination Pulse Width Modulation (SHEPWM) is an important technique for solving PWM problems; it controls the output voltage of an inverter by selecting appropriate switching angles. Based on the Rational Univariate…
Weight-only quantization has emerged as a promising solution to the deployment challenges of large language models (LLMs). However, it necessitates FP-INT operations, which make implementation on general-purpose hardware like GPUs…
A new carrier-based pulse-width modulation (PWM) technique to control power inverters is presented in this paper. To generate the output waveform, this technique compares a harmonic-injection modulating wave and a frequency-modulated…
In the wideband regime, the performance of many of the popular modulation schemes such as code division multiple access and orthogonal frequency division multiplexing falls quickly without channel state information. Obtaining the amount of…
Efficient and realistic error decoding is crucial for fault-tolerant quantum computation (FTQC) on near-term devices. While decoding is a classical post-processing task, its effectiveness depends on accurately modeling quantum noise, which…
In this paper, probabilistic shaping is numerically and experimentally investigated for increasing the transmission reach of a wavelength division multiplexed (WDM) optical communication system employing quadrature amplitude modulation (QAM).…
We demonstrate, for the first time, fully quantized training (FQT) of large language models (LLMs) using predominantly 4-bit floating-point (FP4) precision for weights, activations, and gradients on datasets up to 200 billion tokens. We…
The deployment of large language models (LLMs) is often constrained by memory bandwidth, where the primary bottleneck is the cost of transferring model parameters from the GPU's global memory to its registers. When coupled with custom…
We propose LLM-FP4 for quantizing both weights and activations in large language models (LLMs) down to 4-bit floating-point values, in a post-training manner. Existing post-training quantization (PTQ) solutions are primarily integer-based…
We propose a quantized decoding algorithm for low-density parity-check codes where the variable node update rule of the standard min-sum algorithm is replaced with a look-up table (LUT) that is designed using an information-theoretic…