Related papers: Cluster Workload Allocation: Semantic Soft Affinit…

Semantic Scheduling for LLM Inference

Conventional operating system scheduling algorithms are largely content-ignorant, making decisions based on factors such as latency or fairness without considering the actual intents or semantics of processes. Consequently, these algorithms…

Machine Learning · Computer Science 2025-06-17 Wenyue Hua, Dujian Ding, Yile Gu, Yujie Ren, Kai Mei, Minghua Ma, William Yang Wang

ClusterFusion: Hybrid Clustering with Embedding Guidance and LLM Adaptation

Text clustering is a fundamental task in natural language processing, yet traditional clustering algorithms with pre-trained embeddings often struggle in domain-specific contexts without costly fine-tuning. Large language models (LLMs)…

Computation and Language · Computer Science 2025-12-05 Yiming Xu, Yuan Yuan, Vijay Viswanathan, Graham Neubig

ClusterLLM: Large Language Models as a Guide for Text Clustering

We introduce ClusterLLM, a novel text clustering framework that leverages feedback from an instruction-tuned large language model, such as ChatGPT. Compared with traditional unsupervised methods that build upon "small" embedders,…

Computation and Language · Computer Science 2023-11-07 Yuwei Zhang, Zihan Wang, Jingbo Shang

Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling

High-Performance Computing (HPC) job scheduling involves balancing conflicting objectives such as minimizing makespan, reducing wait times, optimizing resource use, and ensuring fairness. Traditional methods, including heuristic-based,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-05 Prachi Jadhav, Hongwei Jin, Ewa Deelman, Prasanna Balaprakash

Leveraging Large Language Models for Exploiting ASR Uncertainty

While large language models excel in a variety of natural language processing (NLP) tasks, to perform well on spoken language understanding (SLU) tasks, they must either rely on off-the-shelf automatic speech recognition (ASR) systems for…

Computation and Language · Computer Science 2023-09-13 Pranay Dighe, Yi Su, Shangshang Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed Tewfik

A Sparsity Predicting Approach for Large Language Models via Activation Pattern Clustering

Large Language Models (LLMs) exhibit significant activation sparsity, where only a subset of neurons are active for a given input. Although this sparsity presents opportunities to reduce computational cost, efficiently utilizing it requires…

Machine Learning · Computer Science 2025-07-22 Nobel Dhar, Bobin Deng, Md Romyull Islam, Xinyue Zhang, Kazi Fahim Ahmad Nasif, Kun Suo

Text Clustering as Classification with LLMs

Text clustering serves as a fundamental technique for organizing and interpreting unstructured textual data, particularly in contexts where manual annotation is prohibitively costly. With the rapid advancement of Large Language Models…

Computation and Language · Computer Science 2025-10-08 Chen Huang, Guoxiu He

Cluster Workload Allocation: A Predictive Approach Leveraging Machine Learning Efficiency

This research investigates how Machine Learning (ML) algorithms can assist in workload allocation strategies by detecting tasks with node affinity operators (referred to as constraint operators), which constrain their execution to a limited…

Machine Learning · Computer Science 2025-09-25 Leszek Sliwko

High-Throughput LLM inference on Heterogeneous Clusters

Nowadays, many companies possess various types of AI accelerators, forming heterogeneous clusters. Efficiently leveraging these clusters for high-throughput large language model (LLM) inference services can significantly reduce costs and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-23 Yi Xiong, Jinqi Huang, Wenjie Huang, Xuebing Yu, Entong Li, Zhixiong Ning, Jinhua Zhou, Li Zeng, Xin Chen

Resource-Efficient Adaptation of Large Language Models for Text Embeddings via Prompt Engineering and Contrastive Fine-tuning

Large Language Models (LLMs) have become a cornerstone in Natural Language Processing (NLP), achieving impressive performance in text generation. Their token-level representations capture rich, human-aligned semantics. However, pooling…

Computation and Language · Computer Science 2025-09-25 Benedikt Roth, Stephan Rappensperger, Tianming Qiu, Hamza Imamović, Julian Wörmann, Hao Shen

Solving Data-centric Tasks using Large Language Models

Large language models (LLMs) are rapidly replacing help forums like StackOverflow, and are especially helpful for non-professional programmers and end users. These users are often interested in data-centric tasks, such as spreadsheet…

Programming Languages · Computer Science 2024-03-26 Shraddha Barke, Christian Poelitz, Carina Suzana Negreanu, Benjamin Zorn, José Cambronero, Andrew D. Gordon, Vu Le, Elnaz Nouri, Nadia Polikarpova, Advait Sarkar, Brian Slininger, Neil Toronto, Jack Williams

Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration

Large Language Models (LLMs) struggle with complex reasoning due to limited diversity and inefficient search. We propose Soft Reasoning, an embedding-based search framework that optimises the embedding of the first token to guide…

Computation and Language · Computer Science 2025-09-16 Qinglin Zhu, Runcong Zhao, Hanqi Yan, Yulan He, Yudong Chen, Lin Gui

Evaluating Large Language Models for Workload Mapping and Scheduling in Heterogeneous HPC Systems

Large language models (LLMs) are increasingly explored for their reasoning capabilities, yet their ability to perform structured, constraint-based optimization from natural language remains insufficiently understood. This study evaluates…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-18 Aasish Kumar Sharma, Julian Kunkel

AccLLM: Accelerating Long-Context LLM Inference Via Algorithm-Hardware Co-Design

Recently, large language models (LLMs) have achieved huge success in the natural language processing (NLP) field, driving a growing demand to extend their deployment from the cloud to edge devices. However, deploying LLMs on…

Hardware Architecture · Computer Science 2025-05-08 Yanbiao Liang, Huihong Shi, Haikuo Shao, Zhongfeng Wang

Small Language Models for Application Interactions: A Case Study

We study the efficacy of Small Language Models (SLMs) in facilitating application usage through natural language interactions. Our focus here is on a particular internal application used in Microsoft for cloud supply chain fulfilment. Our…

Computation and Language · Computer Science 2024-06-03 Beibin Li, Yi Zhang, Sébastien Bubeck, Jeevan Pathuri, Ishai Menache

Clustering-driven Memory Compression for On-device Large Language Models

Large language models (LLMs) often rely on user-specific memories distilled from past interactions to enable personalized generation. A common practice is to concatenate these memories with the input prompt, but this approach quickly…

Computation and Language · Computer Science 2026-01-27 Ondrej Bohdal, Pramit Saha, Umberto Michieli, Mete Ozay, Taha Ceritli

Semantic Caching and Intent-Driven Context Optimization for Multi-Agent Natural Language to Code Systems

We present a production-optimized multi-agent system designed to translate natural language queries into executable Python code for structured data analytics. Unlike systems that rely on expensive frontier models, our approach achieves high…

Software Engineering · Computer Science 2026-01-21 Harmohit Singh

Decision-Making with Lightweight Confidence-Aware Language Model for Autonomous Driving

Large Language Models (LLMs) and Multimodal LLMs (MLLMs) have demonstrated immense potential in autonomous driving (AD) by offering human-like reasoning and open-world generalization. However, the excessive computational overhead and high…

Robotics · Computer Science 2026-05-26 Ruoyu Yao, Ruiguo Zhong, Pei Liu, Mingxing Peng, Rui Yang, Jun Ma

SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models

In this paper, we propose Selection and Pooling with Large Language Models (SPILL), an intuitive and domain-adaptive method for intent clustering without fine-tuning. Existing embeddings-based clustering methods rely on a few labeled…

Computation and Language · Computer Science 2025-06-03 I-Fan Lin, Faegheh Hasibi, Suzan Verberne

Analysis of Utterance Embeddings and Clustering Methods Related to Intent Induction for Task-Oriented Dialogue

The focus of this work is to investigate unsupervised approaches to overcome quintessential challenges in designing task-oriented dialog schema: assigning intent labels to each dialog turn (intent clustering) and generating a set of intents…

Computation and Language · Computer Science 2024-06-06 Jeiyoon Park, Yoonna Jang, Chanhee Lee, Heuiseok Lim