Ness B. Shroff — Scifaro

Regret Bounds for Reinforcement Learning from Multi-Source Imperfect Preferences

Reinforcement learning from human feedback (RLHF) replaces hard-to-specify rewards with pairwise trajectory preferences, yet regret-oriented theory often assumes that preference labels are generated consistently from a single ground-truth…

Machine Learning · Computer Science 2026-04-03 Ming Shi, Yingbin Liang, Ness B. Shroff, Ananthram Swami

Beyond Freshness and Semantics: A Coupon-Collector Framework for Effective Status Updates

For status update systems operating over unreliable energy-constrained wireless channels, we address Weaver's long-standing Level-C question: do my packets actually improve the plant's behavior? Each fresh sample carries a stochastic…

Systems and Control · Electrical Eng. & Systems 2026-03-31 Youssef Ahmed, Arnob Ghosh, Chih-Chun Wang, Ness B. Shroff

An LP-based Sampling Policy for Multi-Armed Bandits with Side-Observations and Stochastic Availability

We study the stochastic multi-armed bandit (MAB) problem where an underlying network structure enables side-observations across related actions. We use a bipartite graph to link actions to a set of unknowns, such that selecting an action…

Machine Learning · Computer Science 2026-03-30 Ashutosh Soni, Peizhong Ju, Atilla Eryilmaz, Ness B. Shroff

Escaping Offline Pessimism: Vector-Field Reward Shaping for Safe Frontier Exploration

While offline reinforcement learning provides reliable policies for real-world deployment, its inherent pessimism severely restricts an agent's ability to explore and collect novel data online. Drawing inspiration from safe reinforcement…

Machine Learning · Computer Science 2026-03-20 Amirhossein Roknilamouki, Arnob Ghosh, Eylem Ekici, Ness B. Shroff

Constraint-Rectified Training for Efficient Chain-of-Thought

Chain-of-Thought (CoT) has significantly enhanced the reasoning capabilities of Large Language Models (LLMs), especially when combined with reinforcement learning (RL) based post-training methods. While longer reasoning traces can improve…

Machine Learning · Computer Science 2026-02-16 Qinhang Wu, Sen Lin, Ming Zhang, Yingbin Liang, Ness B. Shroff

Near-Optimal Partially Observable Reinforcement Learning with Partial Online State Information

Partially observable Markov decision processes (POMDPs) are a general framework for sequential decision-making under latent state uncertainty, yet learning in POMDPs is intractable in the worst case. Motivated by sensing and probing…

Machine Learning · Computer Science 2026-01-27 Ming Shi, Yingbin Liang, Ness B. Shroff

Delay-Optimal Transmission Scheduling Policies for Time-Correlated Fading Channels

Millimeter-wave (mmWave) networks have the potential to support high throughput and low-latency requirements of 5G-and-beyond communication standards. But transmissions in this band are highly vulnerable to attenuation and blockages from…

Optimization and Control · Mathematics 2025-11-25 Manali Dutta, Gourav Saha, Rahul Singh, Ness B. Shroff

Provably Efficient Multi-Objective Bandit Algorithms under Preference-Centric Customization

Multi-objective multi-armed bandit (MO-MAB) problems traditionally aim to achieve Pareto optimality. However, real-world scenarios often involve users with varying preferences across objectives, resulting in a Pareto-optimal arm that may…

Machine Learning · Computer Science 2025-11-18 Linfeng Cao, Ming Shi, Ness B. Shroff

Mixture-of-Transformers Learn Faster: A Theoretical Study on Classification Problems

Mixture-of-Experts (MoE) models improve transformer efficiency but lack a unified theoretical explanation, especially when both feed-forward and attention layers are allowed to specialize. To this end, we study the Mixture-of-Transformers…

Machine Learning · Computer Science 2025-11-03 Hongbo Li, Qinhang Wu, Sen Lin, Yingbin Liang, Ness B. Shroff

Monitoring State Transitions in Markovian Systems with Sampling Cost

We consider a node-monitor pair, where the node's state varies with time. The monitor needs to track the node's state at all times; however, there is a fixed cost for each state query. So the monitor may instead predict the state using…

Machine Learning · Computer Science 2025-10-28 Kumar Saurav, Ness B. Shroff, Yingbin Liang

Online Learning for Optimizing AoI-Energy Tradeoff under Unknown Channel Statistics

We consider a real-time monitoring system where a source node (with energy limitations) aims to keep the information status at a destination node as fresh as possible by scheduling status update transmissions over a set of channels. The…

Networking and Internet Architecture · Computer Science 2025-09-24 Mohamed A. Abd-Elmagid, Ming Shi, Eylem Ekici, Ness B. Shroff

Provably Efficient RL for Linear MDPs under Instantaneous Safety Constraints in Non-Convex Feature Spaces

In Reinforcement Learning (RL), tasks with instantaneous hard constraints present significant challenges, particularly when the decision space is non-convex or non-star-convex. This issue is especially relevant in domains like autonomous…

Machine Learning · Computer Science 2025-02-27 Amirhossein Roknilamouki, Arnob Ghosh, Ming Shi, Fatemeh Nourzad, Eylem Ekici, Ness B. Shroff

Theory on Mixture-of-Experts in Continual Learning

Continual learning (CL) has garnered significant attention because of its ability to adapt to new tasks that arrive over time. Catastrophic forgetting (of old tasks) has been identified as a major issue in CL, as the model adapts to new…

Machine Learning · Computer Science 2025-02-20 Hongbo Li, Sen Lin, Lingjie Duan, Yingbin Liang, Ness B. Shroff

How to Find the Exact Pareto Front for Multi-Objective MDPs?

Multi-Objective Markov Decision Processes (MO-MDPs) are receiving increasing attention, as real-world decision-making problems often involve conflicting objectives that cannot be addressed by a single-objective MDP. The Pareto front…

Machine Learning · Computer Science 2025-02-11 Yining Li, Peizhong Ju, Ness B. Shroff

BeST -- A Novel Source Selection Metric for Transfer Learning

One of the most fundamental, and yet relatively less explored, goals in transfer learning is the efficient means of selecting top candidates from a large number of previously trained models (optimized for various "source" tasks) that would…

Machine Learning · Computer Science 2025-01-22 Ashutosh Soni, Peizhong Ju, Atilla Eryilmaz, Ness B. Shroff

Prediction-Assisted Online Distributed Deep Learning Workload Scheduling in GPU Clusters

The recent explosive growth of deep learning (DL) models has necessitated a compelling need for efficient job scheduling for distributed deep learning training with mixed parallelisms (DDLwMP) in GPU clusters. This paper proposes an…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-13 Ziyue Luo, Jia Liu, Myungjin Lee, Ness B. Shroff

Balancing Queueing and Retransmission: Latency-Optimal Massive MIMO Design

One fundamental challenge in 5G URLLC is how to optimize massive MIMO systems for achieving low latency and high reliability. A natural design choice to maximize reliability and minimize retransmission is to select the lowest allowed target…

Information Theory · Computer Science 2024-10-30 Xu Du, Yin Sun, Ness B. Shroff, Ashutosh Sabharwal

Sampling for Remote Estimation of the Wiener Process over an Unreliable Channel

In this paper, we study a sampling problem where a source takes samples from a Wiener process and transmits them through a wireless channel to a remote estimator. Due to channel fading, interference, and potential collisions, the packet…

Information Theory · Computer Science 2023-10-19 Jiayu Pan, Yin Sun, Ness B. Shroff

Near Delay-Optimal Scheduling of Batch Jobs in Multi-Server Systems

We study a class of scheduling problems, where each job is divided into a batch of unit-size tasks and these tasks can be executed in parallel on multiple servers with New-Better-than-Used (NBU) service time distributions. While many delay…

Networking and Internet Architecture · Computer Science 2023-10-02 Yin Sun, C. Emre Koksal, Ness B. Shroff

Generalization Performance of Transfer Learning: Overparameterized and Underparameterized Regimes

Transfer learning is a useful technique for achieving improved performance and reducing training costs by leveraging the knowledge gained from source tasks and applying it to target tasks. Assessing the effectiveness of transfer learning…

Machine Learning · Computer Science 2023-06-12 Peizhong Ju, Sen Lin, Mark S. Squillante, Yingbin Liang, Ness B. Shroff