Related papers: Structural Anchor Pruning: Training-Free Multi-Vec…

Sculpting the Vector Space: Towards Efficient Multi-Vector Visual Document Retrieval via Prune-then-Merge Framework

Visual Document Retrieval (VDR), which aims to retrieve relevant pages within vast corpora of visually-rich documents, is of significance in current multimodal retrieval applications. The state-of-the-art multi-vector paradigm excels in…

Computation and Language · Computer Science 2026-04-21 Yibo Yan, Mingdong Ou, Yi Cao, Xin Zou, Jiahao Huo, Shuliang Liu, James Kwok, Xuming Hu

CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models

Driven by significant improvements in architectural design and training pipelines, computer vision has recently experienced dramatic progress in terms of accuracy on classic benchmarks such as ImageNet. These highly-accurate models are…

Computer Vision and Pattern Recognition · Computer Science 2023-06-01 Denis Kuznedelev, Eldar Kurtic, Elias Frantar, Dan Alistarh

DocPruner: A Storage-Efficient Framework for Multi-Vector Visual Document Retrieval via Adaptive Patch-Level Embedding Pruning

Visual Document Retrieval (VDR), the task of retrieving visually-rich document pages using queries that combine visual and textual cues, is crucial for numerous real-world applications. Recent state-of-the-art methods leverage Large…

Computation and Language · Computer Science 2025-09-30 Yibo Yan, Guangwei Xu, Xin Zou, Shuliang Liu, James Kwok, Xuming Hu

AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates

Structured weight pruning is a representative model compression technique of DNNs to reduce the storage and computation requirements and accelerate inference. An automatic hyperparameter determination process is necessary due to the large…

Machine Learning · Computer Science 2019-09-12 Ning Liu, Xiaolong Ma, Zhiyuan Xu, Yanzhi Wang, Jian Tang, Jieping Ye

Structure-Aware Automatic Channel Pruning by Searching with Graph Embedding

Channel pruning is a powerful technique to reduce the computational overhead of deep neural networks, enabling efficient deployment on resource-constrained devices. However, existing pruning methods often rely on local heuristics or…

Artificial Intelligence · Computer Science 2025-06-16 Zifan Liu, Yuan Cao, Yanwei Yu, Heng Qi, Jie Gui

Structural Pruning of Large Vision Language Models: A Comprehensive Study on Pruning Dynamics, Recovery, and Data Efficiency

While Large Vision Language Models (LVLMs) demonstrate impressive capabilities, their substantial computational and memory requirements pose deployment challenges on resource-constrained edge devices. Current parameter reduction techniques…

Computation and Language · Computer Science 2026-04-28 Yiran Huang, Lukas Thede, Massimiliano Mancini, Wenjia Xu, Zeynep Akata

StepVAR: Structure-Texture Guided Pruning for Visual Autoregressive Models

Visual AutoRegressive (VAR) models based on next-scale prediction enable efficient hierarchical generation, yet the inference cost grows quadratically at high resolutions. We observe that the computationally intensive later scales…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Keli Liu, Zhendong Wang, Wengang Zhou, Houqiang Li

Structured Pruning for Efficient Visual Place Recognition

Visual Place Recognition (VPR) is fundamental for the global re-localization of robots and devices, enabling them to recognize previously visited locations based on visual inputs. This capability is crucial for maintaining accurate mapping…

Computer Vision and Pattern Recognition · Computer Science 2024-09-13 Oliver Grainge, Michael Milford, Indu Bodala, Sarvapali D. Ramchurn, Shoaib Ehsan

Spectral Complex Autoencoder Pruning: A Fidelity-Guided Criterion for Extreme Structured Channel Compression

We propose Spectral Complex Autoencoder Pruning (SCAP), a reconstruction-based criterion that measures functional redundancy at the level of individual output channels. For each convolutional layer, we construct a complex interaction field…

Computer Vision and Pattern Recognition · Computer Science 2026-01-15 Wei Liu, Xing Deng, Haijian Shao, Yingtao Jiang

ASAP: Attention Sink Anchored Pruning

Vision Transformers (ViTs) face severe computational bottlenecks due to the quadratic complexity of self-attention at high resolutions. Existing token reduction methods rely on local metrics - such as single-layer attention scores - that…

Machine Learning · Computer Science 2026-05-22 Jaehyuk Lee, Hanyoung Kim, Yanggee Kim, Donghun Lee

AutoDFP: Automatic Data-Free Pruning via Channel Similarity Reconstruction

Structured pruning methods are developed to bridge the gap between the massive scale of neural networks and the limited hardware resources. Most current structured pruning methods rely on training datasets to fine-tune the compressed model,…

Machine Learning · Computer Science 2024-03-14 Siqi Li, Jun Chen, Jingyang Xiang, Chengrui Zhu, Yong Liu

C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression

Neural network compression has gained increasing attention in recent years, particularly in computer vision applications, where the need for model reduction is crucial for overcoming deployment constraints. Pruning is a widely used…

Computer Vision and Pattern Recognition · Computer Science 2025-10-22 Baptiste Bauvin, Loïc Baret, Ola Ahmad

AutoPrune: Each Complexity Deserves a Pruning Policy

The established redundancy in visual tokens within large vision-language models allows pruning to effectively reduce their substantial computational demands. Previous methods typically employ heuristic layer-specific pruning strategies…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Hanshi Wang, Yuhao Xu, Zekun Xu, Jin Gao, Yufan Liu, Weiming Hu, Ke Wang, Zhipeng Zhang

SERVAL: Surprisingly Effective Zero-Shot Visual Document Retrieval Powered by Large Vision and Language Models

Visual Document Retrieval (VDR) typically operates as text-to-image retrieval using specialized bi-encoders trained to directly embed document images. We revisit a zero-shot generate-and-encode pipeline: a vision-language model first…

Information Retrieval · Computer Science 2025-09-22 Thong Nguyen, Yibin Lei, Jia-Huei Ju, Andrew Yates

Towards Efficient VLMs: Information-Theoretic Driven Compression via Adaptive Structural Pruning

Recent advances in vision-language models (VLMs) have shown remarkable performance across multimodal tasks, yet their ever-growing scale poses severe challenges for deployment and efficiency. Existing compression methods often rely on…

Computer Vision and Pattern Recognition · Computer Science 2025-11-26 Zhaoqi Xu, Yingying Zhang, Jian Li, Jianwei Guo, Qiannan Zhu, Hua Huang

Protective Self-Adaptive Pruning to Better Compress DNNs

Adaptive network pruning approach has recently drawn significant attention due to its excellent capability to identify the importance and redundancy of layers and filters and customize a suitable pruning solution. However, it remains…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Liang Li, Pengfei Zhao

(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork

Large-scale neural networks have demonstrated remarkable performance in different domains like vision and language processing, although at the cost of massive computation resources. As illustrated by compression literature, structural model…

Computer Vision and Pattern Recognition · Computer Science 2026-02-24 Tianjin Huang, Fang Meng, Li Shen, Fan Liu, Yulong Pei, Mykola Pechenizkiy, Shiwei Liu, Tianlong Chen

Spectral-Aligned Pruning for Universal Error-Correcting Code Transformers

Universal channel decoders based on transformers, such as the Foundation Error Correction Code Transformer (FECCT), achieve competitive decoding performance across diverse code families with a single shared backbone, optionally followed by…

Information Theory · Computer Science 2026-05-11 Sanghyeon Cho, Taewoo Park, Seong-Joon Park, Dae-Young Yun, Hee-Youl Kwak, Sang-Hyo Kim, Yongjune Kim

OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization

Structured pruning is a promising approach for reducing the inference costs of large vision and language models. By removing carefully chosen structures, e.g., neurons or attention heads, the improvements from this approach can be realized…

Computer Vision and Pattern Recognition · Computer Science 2024-03-21 Xiang Meng, Shibal Ibrahim, Kayhan Behdin, Hussein Hazimeh, Natalia Ponomareva, Rahul Mazumder

Anchor Pruning for Object Detection

This paper proposes anchor pruning for object detection in one-stage anchor-based detectors. While pruning techniques are widely used to reduce the computational cost of convolutional neural networks, they tend to focus on optimizing the…

Computer Vision and Pattern Recognition · Computer Science 2022-06-02 Maxim Bonnaerens, Matthias Freiberger, Joni Dambre