Jiaming Yang — Scifaro

Rethinking KV Cache Eviction via a Unified Information-Theoretic Objective

Key-value (KV) caching is essential for large language model inference, yet its memory overhead poses a critical bottleneck for long-context generation. Existing eviction policies predominantly rely on empirical heuristics, lacking a…

Machine Learning · Computer Science 2026-04-30 Jiaming Yang, Chenwei Tang, Liangli Zhen, Jiancheng Lv

Have ASkotch: A Neat Solution for Large-scale Kernel Ridge Regression

Kernel ridge regression (KRR) is a fundamental computational tool, appearing in problems that range from computational chemistry to health analytics, with a particular interest due to its starring role in Gaussian process regression.…

Machine Learning · Computer Science 2026-01-28 Pratik Rathore, Zachary Frangella, Jiaming Yang, Michał Dereziński, Madeleine Udell

Task-Oriented Communications for 3D Scene Representation: Balancing Timeliness and Fidelity

Real-time three-dimensional (3D) scene representation is a foundational element that supports a broad spectrum of cutting-edge applications, including digital manufacturing, Virtual, Augmented, and Mixed Reality (VR/AR/MR), and the emerging…

Computer Vision and Pattern Recognition · Computer Science 2025-09-23 Xiangmin Xu, Zhen Meng, Kan Chen, Jiaming Yang, Emma Li, Philip G. Zhao, David Flynn

Task-Oriented Edge-Assisted Cross-System Design for Real-Time Human-Robot Interaction in Industrial Metaverse

Real-time human-device interaction in industrial Metaverse faces challenges such as high computational load, limited bandwidth, and strict latency. This paper proposes a task-oriented edge-assisted cross-system framework using digital twins…

Robotics · Computer Science 2025-08-29 Kan Chen, Zhen Meng, Xiangmin Xu, Jiaming Yang, Emma Li, Philip G. Zhao

Randomized Kaczmarz Methods with Beyond-Krylov Convergence

Randomized Kaczmarz methods form a family of linear system solvers which converge by repeatedly projecting their iterates onto randomly sampled equations. While effective in some contexts, such as highly over-determined least squares,…

Numerical Analysis · Mathematics 2025-07-30 Michał Dereziński, Deanna Needell, Elizaveta Rebrova, Jiaming Yang

Precision Neural Network Quantization via Learnable Adaptive Modules

Quantization Aware Training (QAT) is a neural network quantization technique that compresses model size and improves operational efficiency while effectively maintaining model performance. The paradigm of QAT is to introduce fake…

Computer Vision and Pattern Recognition · Computer Science 2025-04-25 Wenqiang Zhou, Zhendong Yu, Xinyu Liu, Jiaming Yang, Rong Xiao, Tao Wang, Chenwei Tang, Jiancheng Lv

Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning

We present a new class of preconditioned iterative methods for solving linear systems of the form $Ax = b$. Our methods are based on constructing a low-rank Nyström approximation to $A$ using sparse random matrix sketching. This…

Data Structures and Algorithms · Computer Science 2025-04-14 Michał Dereziński, Christopher Musco, Jiaming Yang

Task-Oriented Edge-Assisted Cooperative Data Compression, Communications and Computing for UGV-Enhanced Warehouse Logistics

This paper explores the growing need for task-oriented communications in warehouse logistics, where traditional communication Key Performance Indicators (KPIs), such as latency, reliability, and throughput, often do not fully meet task…

Networking and Internet Architecture · Computer Science 2024-10-11 Jiaming Yang, Zhen Meng, Xiangmin Xu, Kan Chen, Emma Liying Li, Philip Guodong G. Zhao

Solving Dense Linear Systems Faster Than via Preconditioning

We give a stochastic optimization algorithm that solves a dense $n\times n$ real-valued linear system $Ax=b$, returning $\tilde x$ such that $\|A\tilde x-b\|\leq \epsilon\|b\|$ in time: $$\tilde O\big((n^2+nk^{\omega-1})\log(1/\epsilon)\big),$$ where…

Data Structures and Algorithms · Computer Science 2024-06-10 Michał Dereziński, Jiaming Yang

HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks

As a variant of Graph Neural Networks (GNNs), Unfolded GNNs offer enhanced interpretability and flexibility over traditional designs. Nevertheless, they still suffer from scalability challenges when it comes to the training cost. Although…

Machine Learning · Computer Science 2024-03-28 Yongyi Yang, Jiaming Yang, Wei Hu, Michał Dereziński

CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models

In this paper, we present CharacterGLM, a series of models built upon ChatGLM, with model sizes ranging from 6B to 66B parameters. Our CharacterGLM is designed for generating Character-based Dialogues (CharacterDial), which aims to equip a…

Computation and Language · Computer Science 2023-11-29 Jinfeng Zhou, Zhuang Chen, Dazhen Wan, Bosi Wen, Yi Song, Jifan Yu, Yongkang Huang, Libiao Peng, Jiaming Yang, Xiyao Xiao, Sahand Sabour, Xiaohan Zhang, Wenjing Hou, Yijia Zhang, Yuxiao Dong, Jie Tang, Minlie Huang

Federated Adversarial Learning: A Framework with Convergence Analysis

Federated learning (FL) is a trending training paradigm to utilize decentralized training data. FL allows clients to update model parameters locally for several epochs, then share them to a global model for aggregation. This training…

Machine Learning · Computer Science 2022-08-09 Xiaoxiao Li, Zhao Song, Jiaming Yang

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models

Overparameterized neural networks generalize well but are expensive to train. Ideally, one would like to reduce their computational cost while retaining their generalization benefits. Sparse model training is a simple and promising approach…

Machine Learning · Computer Science 2022-05-12 Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré