Related papers: MOSEL: Inference Serving Using Dynamic Modality Se…

Inference-Time Dynamic Modality Selection for Incomplete Multimodal Classification

Multimodal deep learning (MDL) has achieved remarkable success across various domains, yet its practical deployment is often hindered by incomplete multimodal data. Existing incomplete MDL methods either discard missing modalities, risking…

Computer Vision and Pattern Recognition · Computer Science 2026-05-14 Siyi Du, Xinzhe Luo, Declan P. O'Regan, Chen Qin

Sample Efficient Robot Learning in Supervised Effect Prediction Tasks

In self-supervised robotic learning, agents acquire data through active interaction with their environment, incurring costs such as energy use, human oversight, and experimental time. To mitigate these, sample-efficient exploration is…

Robotics · Computer Science 2025-05-29 Mehmet Arda Eren, Erhan Oztop

DeepSuM: Deep Sufficient Modality Learning Framework

Multimodal learning has become a pivotal approach in developing robust learning models with applications spanning multimedia, robotics, large language models, and healthcare. The efficiency of multimodal systems is a critical concern, given…

Machine Learning · Computer Science 2025-03-04 Zhe Gao, Jian Huang, Ting Li, Xueqin Wang

SneakPeek: Data-Aware Model Selection and Scheduling for Inference Serving on the Edge

Modern applications increasingly rely on inference serving systems to provide low-latency insights with a diverse set of machine learning models. Existing systems often utilize resource elasticity to scale with demand. However, many…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-13 Joel Wolfrath, Daniel Frink, Abhishek Chandra

A Survey on Inference Optimization Techniques for Mixture of Experts Models

The emergence of large-scale Mixture of Experts (MoE) models represents a significant advancement in artificial intelligence, offering enhanced model capacity and computational efficiency through conditional computation. However, deploying…

Machine Learning · Computer Science 2025-01-23 Jiacheng Liu, Peng Tang, Wenfeng Wang, Yuhang Ren, Xiaofeng Hou, Pheng-Ann Heng, Minyi Guo, Chao Li

Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems

The use of machine learning (ML) inference for various applications is growing drastically. ML inference services engage with users directly, requiring fast and accurate responses. Moreover, these services face dynamic workloads of…

Machine Learning · Computer Science 2023-05-22 Mehran Salmani, Saeid Ghafouri, Alireza Sanaee, Kamran Razavi, Max Mühlhäuser, Joseph Doyle, Pooyan Jamshidi, Mohsen Sharifi

Multi-Modality Spatio-Temporal Forecasting via Self-Supervised Learning

Multi-modality spatio-temporal (MoST) data extends spatio-temporal (ST) data by incorporating multiple modalities, which is prevalent in monitoring systems, encompassing diverse traffic demands and air quality assessments. Despite…

Machine Learning · Computer Science 2024-05-07 Jiewen Deng, Renhe Jiang, Jiaqi Zhang, Xuan Song

Training and Serving Machine Learning Models at Scale

In recent years, Web services are becoming more and more intelligent (e.g., in understanding user preferences) thanks to the integration of components that rely on Machine Learning (ML). Before users can interact (inference phase) with an…

Software Engineering · Computer Science 2022-11-11 Luciano Baresi, Giovanni Quattrocchi

Multimodal Learning for MIMO Beam Prediction Based on Variational Inference

Accurate beam prediction is essential for mitigating signalling overhead and latency in integrated sensing and communication-enabled massive multi-input multi-output systems. With the aid of multimodal learning, the prediction accuracy can…

Signal Processing · Electrical Eng. & Systems 2026-05-15 Zijian Zheng, Wenqiang Yi, Hyundong Shin, Arumugam Nallanathan

Which is Making the Contribution: Modulating Unimodal and Cross-modal Dynamics for Multimodal Sentiment Analysis

Multimodal sentiment analysis (MSA) draws increasing attention with the availability of multimodal data. The boost in performance of MSA models is mainly hindered by two problems. On the one hand, recent MSA works mostly focus on learning…

Machine Learning · Computer Science 2021-11-17 Ying Zeng, Sijie Mai, Haifeng Hu

Towards Designing a Self-Managed Machine Learning Inference Serving System in Public Cloud

We are witnessing an increasing trend towards using Machine Learning (ML) based prediction systems, spanning across different application domains, including product recommendation systems, personal assistant devices, facial recognition, etc.…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-24 Jashwant Raj Gunasekaran, Prashanth Thinakaran, Cyan Subhra Mishra, Mahmut Taylan Kandemir, Chita R. Das

AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition

Multi-modal learning, which focuses on utilizing various modalities to improve the performance of a model, is widely used in video recognition. While traditional multi-modal learning offers excellent recognition results, its computational…

Computer Vision and Pattern Recognition · Computer Science 2021-05-13 Rameswar Panda, Chun-Fu Chen, Quanfu Fan, Ximeng Sun, Kate Saenko, Aude Oliva, Rogerio Feris

Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection

Selective parameter activation provided by Mixture-of-Expert (MoE) models have made them a popular choice in modern foundational models. However, MoEs face a fundamental tension when employed for serving. Batching, critical for performance…

Machine Learning · Computer Science 2026-05-20 Vima Gupta, Jae Hyung Ju, Kartik Sinha, Ada Gavrilovska, Anand Padmanabha Iyer

Meta-Sel: Efficient Demonstration Selection for In-Context Learning via Supervised Meta-Learning

Demonstration selection is a practical bottleneck in in-context learning (ICL): under a tight prompt budget, accuracy can change substantially depending on which few-shot examples are included, yet selection must remain cheap enough to run…

Machine Learning · Computer Science 2026-02-13 Xubin Wang, Weijia Jia

Multimodal Remote Inference

We consider a remote inference system with multiple modalities, where a multimodal machine learning (ML) model performs real-time inference using features collected from remote sensors. When sensor observations evolve dynamically over time,…

Machine Learning · Computer Science 2026-04-28 Keyuan Zhang, Yin Sun, Bo Ji

Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning

Modern deep models are trained on large real-world datasets, where data quality varies and redundancy is common. Data-centric approaches such as dataset pruning have shown promise in improving training efficiency and model performance.…

Machine Learning · Computer Science 2025-07-18 Suorong Yang, Peijia Li, Yujie Liu, Zhiming Xu, Peng Ye, Wanli Ouyang, Furao Shen, Dongzhan Zhou

INFaaS: A Model-less and Managed Inference Serving System

Despite existing work in machine learning inference serving, ease-of-use and cost efficiency remain challenges at large scales. Developers must manually search through thousands of model-variants -- versions of already-trained models that…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-07 Francisco Romero, Qian Li, Neeraja J. Yadwadkar, Christos Kozyrakis

QuickSel: Quick Selectivity Learning with Mixture Models

Estimating the selectivity of a query is a key step in almost any cost-based query optimizer. Most of today's databases rely on histograms or samples that are periodically refreshed by re-scanning the data as the underlying data changes.…

Databases · Computer Science 2020-04-14 Yongjoo Park, Shucheng Zhong, Barzan Mozafari

Dynamic Modality Scheduling for Multimodal Large Models via Confidence, Uncertainty, and Semantic Consistency

Multimodal Large Models (MLLMs) have achieved remarkable progress in vision-language understanding and generation tasks. However, existing MLLMs typically rely on static modality fusion strategies, which treat all modalities equally…

Computer Vision and Pattern Recognition · Computer Science 2025-06-17 Hiroshi Tanaka, Anika Rao, Hana Satou, Michael Johnson, Sofia García

Mixture of Sequence: Theme-Aware Mixture-of-Experts for Long-Sequence Recommendation

Sequential recommendation has rapidly advanced in click-through rate prediction due to its ability to model dynamic user interests. A key challenge, however, lies in modeling long sequences: users often exhibit significant interest shifts,…

Information Retrieval · Computer Science 2026-04-24 Xiao Lin, Zhicheng Tang, Weilin Cong, Mengyue Hang, Kai Wang, Yajuan Wang, Zhichen Zeng, Ting-Wei Li, Hyunsik Yoo, Zhining Liu, Xuying Ning, Ruizhong Qiu, Wen-yen Chen, Shuo Chang, Rong Jin, Huayu Li, Hanghang Tong