Computer Vision and Pattern Recognition · Computer Science
A Survey of Token Compression for Efficient Multimodal Large Language Models
Kele Shao, Keda Tao, Kejia Zhang, Sicheng Feng +6
2026-02-03
Computer Vision and Pattern Recognition · Computer Science
Towards Lossless Ultimate Vision Token Compression for VLMs
Dehua Zheng, Mouxiao Huang, Borui Jiang, Hailin Hu +1
2025-12-11
Computer Vision and Pattern Recognition · Computer Science
UniCompress: Token Compression for Unified Vision-Language Understanding and Generation
Ziyao Wang, Chen Chen, Jingtao Li, Weiming Zhuang +3
2026-03-13
Computer Vision and Pattern Recognition · Computer Science
VisionSelector: End-to-End Learnable Visual Token Compression for Efficient Multimodal LLMs
Jiaying Zhu, Yurui Zhu, Xin Lu, Wenrui Yan +4
2025-10-21
Computer Vision and Pattern Recognition · Computer Science
EvoComp: Learning Visual Token Compression for Multimodal Large Language Models via Semantic-Guided Evolutionary Labeling
Jiafei Song, Fengwei Zhou, Jin Qu, Wenjin Jason Li +6
2026-04-21
Computer Vision and Pattern Recognition · Computer Science
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
Yuzhang Shang, Mu Cai, Bingxin Xu, Yong Jae Lee +1
2026-02-03
Computer Vision and Pattern Recognition · Computer Science
TrimTokenator: Towards Adaptive Visual Token Pruning for Large Multimodal Models
Hao Zhang, Mengsi Lyu, Chenrui He, Yulong Ao +1
2025-10-03
Computer Vision and Pattern Recognition · Computer Science
FocusLLaVA: A Coarse-to-Fine Approach for Efficient and Effective Visual Token Compression
Yuke Zhu, Chi Xie, Shuang Liang, Bo Zheng +1
2024-11-22
Computer Vision and Pattern Recognition · Computer Science
Treat Visual Tokens as Text? But Your MLLM Only Needs Fewer Efforts to See
Zeliang Zhang, Phu Pham, Wentian Zhao, Kun Wan +5
2024-12-03
Computer Vision and Pattern Recognition · Computer Science
Can Visual Input Be Compressed? A Visual Token Compression Benchmark for Large Multimodal Models
Tianfan Peng, Yuntao Du, Pengzhou Ji, Shijie Dong +8
2025-11-18
Computer Vision and Pattern Recognition · Computer Science
FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models
Kaitong Cai, Jusheng Zhang, Jing Yang, Yijia Fan +3
2025-12-24
Computer Vision and Pattern Recognition · Computer Science
Efficient Large Multi-modal Models via Visual Context Compression
Jieneng Chen, Luoxin Ye, Ju He, Zhao-Yang Wang +2
2024-11-19
Computer Vision and Pattern Recognition · Computer Science
Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
Junjie Chen, Xuyang Liu, Zichen Wen, Yiyu Wang +2
2026-02-26
Computer Vision and Pattern Recognition · Computer Science
Magic-MM-Embedding: Towards Visual-Token-Efficient Universal Multimodal Embedding with MLLMs
Qi Li, Yanzhe Zhao, Yongxin Zhou, Yameng Wang +5
2026-02-06
Computer Vision and Pattern Recognition · Computer Science
Revisiting MLLM Token Technology through the Lens of Classical Visual Coding
Jinming Liu, Junyan Lin, Yuntao Wei, Kele Shao +6
2025-08-20
Computer Vision and Pattern Recognition · Computer Science
METEOR: Multi-Encoder Collaborative Token Pruning for Efficient Vision Language Models
Yuchen Liu, Yaoming Wang, Bowen Shi, Xiaopeng Zhang +4
2025-07-29
Computer Vision and Pattern Recognition · Computer Science
Compression Tells Intelligence: Visual Coding, Visual Token Technology, and the Unification
Xin Jin, Jinming Liu, Yuntao Wei, Junyan Lin +5
2026-01-29
Computer Vision and Pattern Recognition · Computer Science
AdaTok: Adaptive Token Compression with Object-Aware Representations for Efficient Multimodal LLMs
Xinliang Zhang, Lei Zhu, Hangzhou He, Shuang Zeng +4
2025-11-25
Computer Vision and Pattern Recognition · Computer Science
ToFu: Visual Tokens Reduction via Fusion for Multi-modal, Multi-patch, Multi-image Task
Vittorio Pippi, Matthieu Guillaumin, Silvia Cascianelli, Rita Cucchiara +2
2025-03-07
Computer Vision and Pattern Recognition · Computer Science
LaCo: Efficient Layer-wise Compression of Visual Tokens for Multimodal Large Language Models
Juntao Liu, Liqiang Niu, Wenchao Chen, Jie Zhou +1
2025-07-04
Computer Vision and Pattern Recognition · Computer Science
FCoT-VL: Advancing Text-oriented Large Vision-Language Models with Efficient Visual Token Compression
Jianjian Li, Junquan Fan, Feng Tang, Gang Huang +5
2025-02-27
Machine Learning · Computer Science
Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality
Zhenglun Kong, Yize Li, Fanhu Zeng, Lei Xin +6
2026-01-14
Computer Vision and Pattern Recognition · Computer Science
TrimTokenator-LC: Towards Adaptive Visual Token Pruning for Large Multimodal Models with Long Contexts
Hao Zhang, Mengsi Lyu, Bo Huang, Yulong Ao +1
2026-01-01