Tiansheng Wen — Scifaro

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

Multi-vector retrieval (MVR) models, exemplified by ColBERT, have established new benchmarks in retrieval accuracy by preserving fine-grained token-level interactions. However, this granularity imposes prohibitive storage and retrieval…

Information Retrieval · Computer Science 2026-05-29 Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng, Stefanie Jegelka, Chenyu You

Scaling Attention via Feature Sparsity

Scaling Transformers to ultra-long contexts is bottlenecked by the $O(n^2 d)$ cost of self-attention. Existing methods reduce this cost along the sequence axis through local windows, kernel approximations, or token-level sparsity, but these…

Machine Learning · Computer Science 2026-03-31 Yan Xie, Tiansheng Wen, Tangda Huang, Bo Chen, Chenyu You, Stefanie Jegelka, Yifei Wang

Route Experts by Sequence, not by Token

Mixture-of-Experts (MoE) architectures scale large language models (LLMs) by activating only a subset of experts per token, but the standard TopK routing assigns the same fixed number of experts to all tokens, ignoring their varying…

Machine Learning · Computer Science 2026-03-30 Tiansheng Wen, Yifei Wang, Aosong Feng, Long Ma, Xinyang Liu, Yifan Wang, Lixuan Guo, Bo Chen, Stefanie Jegelka, Chenyu You

CSRv2: Unlocking Ultra-Sparse Embeddings

In the era of large foundation models, the quality of embeddings has become a central determinant of downstream task performance and overall system capability. Yet widely used dense embeddings are often extremely high-dimensional, incurring…

Machine Learning · Computer Science 2026-03-03 Lixuan Guo, Yifei Wang, Tiansheng Wen, Yifan Wang, Aosong Feng, Bo Chen, Stefanie Jegelka, Chenyu You

Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation

Many large-scale systems rely on high-quality deep representations (embeddings) to facilitate tasks like retrieval, search, and generative modeling. Matryoshka Representation Learning (MRL) recently emerged as a solution for adaptive…

Machine Learning · Computer Science 2025-05-21 Tiansheng Wen, Yifei Wang, Zequn Zeng, Zhong Peng, Yudi Su, Xinyang Liu, Bo Chen, Hongwei Liu, Stefanie Jegelka, Chenyu You

Explaining Domain Shifts in Language: Concept erasing for Interpretable Image Classification

Concept-based models can map black-box representations to human-understandable concepts, which makes the decision-making process more transparent and then allows users to understand the reason behind predictions. However, domain-specific…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Zequn Zeng, Yudi Su, Jianqiao Sun, Tiansheng Wen, Hao Zhang, Zhengjue Wang, Bo Chen, Hongwei Liu, Jiawei Ma

A Non-negative VAE: the Generalized Gamma Belief Network

The gamma belief network (GBN), often regarded as a deep topic model, has demonstrated its potential for uncovering multi-layer interpretable latent representations in text data. Its notable capability to acquire interpretable latent…

Machine Learning · Computer Science 2024-08-16 Zhibin Duan, Tiansheng Wen, Muyao Wang, Bo Chen, Mingyuan Zhou

Contrastive Factor Analysis

Factor analysis, often regarded as a Bayesian variant of matrix factorization, offers superior capabilities in capturing uncertainty, modeling complex dependencies, and ensuring robustness. As the deep learning era arrives, factor analysis…

Machine Learning · Computer Science 2024-08-02 Zhibin Duan, Tiansheng Wen, Yifei Wang, Chen Zhu, Bo Chen, Mingyuan Zhou

HICEScore: A Hierarchical Metric for Image Captioning Evaluation

Image captioning evaluation metrics can be divided into two categories, reference-based metrics and reference-free metrics. However, reference-based approaches may struggle to evaluate descriptive captions with abundant visual details…

Computer Vision and Pattern Recognition · Computer Science 2024-07-29 Zequn Zeng, Jianqiao Sun, Hao Zhang, Tiansheng Wen, Yudi Su, Yan Xie, Zhengjue Wang, Bo Chen