Related papers: ESMC: MLLM-Based Embedding Selection for Explainab…

200 papers

Text clustering is an important method for organising the increasing volume of digital content, aiding in the structuring and discovery of hidden patterns in uncategorised data. The effectiveness of text clustering largely depends on the…

Computation and Language · Computer Science 2024-12-06 Alina Petukhova, João P. Matos-Carvalho, Nuno Fachada

Clustering short text is a difficult problem, due to the low word co-occurrence between short text documents. This work shows that large language models (LLMs) can overcome the limitations of traditional clustering approaches by generating…

Computation and Language · Computer Science 2025-04-08 Justin K. Miller, Tristram J. Alexander

Text clustering serves as a fundamental technique for organizing and interpreting unstructured textual data, particularly in contexts where manual annotation is prohibitively costly. With the rapid advancement of Large Language Models…

Computation and Language · Computer Science 2025-10-08 Chen Huang, Guoxiu He

Unlike traditional unsupervised clustering, semi-supervised clustering allows users to provide meaningful structure to the data, which helps the clustering algorithm to match the user's intent. Existing approaches to semi-supervised…

Computation and Language · Computer Science 2023-07-04 Vijay Viswanathan, Kiril Gashteovski, Carolin Lawrence, Tongshuang Wu, Graham Neubig

Text clustering is a fundamental task in natural language processing, yet traditional clustering algorithms with pre-trained embeddings often struggle in domain-specific contexts without costly fine-tuning. Large language models (LLMs)…

Computation and Language · Computer Science 2025-12-05 Yiming Xu, Yuan Yuan, Vijay Viswanathan, Graham Neubig

Large Language Models (LLMs) have become a cornerstone in Natural Language Processing (NLP), achieving impressive performance in text generation. Their token-level representations capture rich, human-aligned semantics. However, pooling…

Computation and Language · Computer Science 2025-09-25 Benedikt Roth, Stephan Rappensperger, Tianming Qiu, Hamza Imamović, Julian Wörmann, Hao Shen

Large Language Models (LLMs) are reshaping unsupervised learning by offering an unprecedented ability to perform text clustering based on their deep semantic understanding. However, their direct application is fundamentally limited by a…

Computation and Language · Computer Science 2026-04-08 Yuanjie Zhu, Liangwei Yang, Ke Xu, Weizhi Zhang, Zihe Song, Jindong Wang, Philip S. Yu

Large language models (LLMs) have achieved remarkable success across various domains, but effectively incorporating complex and potentially noisy user timeline data into LLMs remains a challenge. Current approaches often involve translating…

Computation and Language · Computer Science 2024-09-11 Lin Ning, Luyang Liu, Jiaxing Wu, Neo Wu, Devora Berlowitz, Sushant Prakash, Bradley Green, Shawn O'Banion, Jun Xie

Large language models (LLMs) often rely on user-specific memories distilled from past interactions to enable personalized generation. A common practice is to concatenate these memories with the input prompt, but this approach quickly…

Computation and Language · Computer Science 2026-01-27 Ondrej Bohdal, Pramit Saha, Umberto Michieli, Mete Ozay, Taha Ceritli

Many cultural institutions have made large digitized visual collections available online, often under permissible re-use licences. Creating interfaces for exploring and searching these collections is difficult, particularly in the absence…

Computer Vision and Pattern Recognition · Computer Science 2024-11-08 Taylor Arnold, Lauren Tilton

Large Language Models (LLMs) have emerged as promising recommendation systems, offering novel ways to model user preferences through generative approaches. However, many existing methods often rely solely on text semantics or incorporate…

Machine Learning · Computer Science 2026-01-09 Mir Rayat Imtiaz Hossain, Leo Feng, Leonid Sigal, Mohamed Osama Ahmed

Multi-modal Large Language Models (MLLMs) have demonstrated remarkable capabilities in understanding and generating content across various modalities, such as images and text. However, their interpretability remains a challenge, hindering…

Computer Vision and Pattern Recognition · Computer Science 2024-05-29 Loris Giulivi, Giacomo Boracchi

Transformer-based large language models (LLMs) rely on contextual embeddings which generate different (continuous) representations for the same token depending on its surrounding context. Nonetheless, words and tokens typically have a…

Computation and Language · Computer Science 2025-07-10 Qitong Wang, Mohammed J. Zaki, Georgios Kollias, Vasileios Kalantzis

Embeddings have become a pivotal means to represent complex, multi-faceted information about entities, concepts, and relationships in a condensed and useful format. Nevertheless, they often preclude direct interpretation. While downstream…

Recent large language models (LLMs) have demonstrated exceptional performance on general-purpose text embedding tasks. While dense embeddings have dominated related research, we introduce the first lexicon-based embeddings (LENS) leveraging…

Computation and Language · Computer Science 2026-03-20 Yibin Lei, Tao Shen, Yu Cao, Andrew Yates

We present a clustering-based language model using word embeddings for text readability prediction. Presumably, a Euclidean semantic space hypothesis holds true for word embeddings whose training is done by observing word co-occurrences.…

Computation and Language · Computer Science 2017-09-07 Miriam Cha, Youngjune Gwon, H. T. Kung

Subspace clustering has been extensively studied from the hypothesis-and-test, algebraic, and spectral clustering based perspectives. Most assume that only a single type/class of subspace is present. Generalizations to multiple types are…

Computer Vision and Pattern Recognition · Computer Science 2019-04-04 Xun Xu, Loong-Fah Cheong, Zhuwen Li

Multimodal Large Language Models (MLLMs) have become increasingly important due to their state-of-the-art performance and ability to integrate multiple data modalities, such as text, images, and audio, to perform complex tasks with high…

Large Language Models (LLMs) are becoming increasingly popular in pervasive computing due to their versatility and strong performance. However, despite their ubiquitous use, the exact mechanisms underlying their outstanding performance…

Computation and Language · Computer Science 2026-02-02 Alhassan Abdelhalim, Janick Edinger, Sören Laue, Michaela Regneri

We introduce ClusterLLM, a novel text clustering framework that leverages feedback from an instruction-tuned large language model, such as ChatGPT. Compared with traditional unsupervised methods that build upon "small" embedders,…

Computation and Language · Computer Science 2023-11-07 Yuwei Zhang, Zihan Wang, Jingbo Shang