Pascal Poupart — Scifaro

Fill the GAP: A Granular Alignment Paradigm for Visual Reasoning in Multimodal Large Language Models

Visual latent reasoning lets a multimodal large language model (MLLM) create intermediate visual evidence as continuous tokens, avoiding external tools or image generators. However, existing methods usually follow an output-as-input latent…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-26 · Yanting Miao, Yutao Sun, Dexin Wang, Mengyu Zhou, Pascal Poupart, Lei Lv, Qi Zhao, Li Wang, Hao Li, Xiaoxi Jiang, Guanjun Jiang

Talk, Judge, Cooperate: Gossip-Driven Indirect Reciprocity in Self-Interested LLM Agents

Indirect reciprocity, which means helping those who have helped others, is difficult to sustain among decentralized, self-interested LLM agents without reliable reputation systems. We address this challenge with the Agentic Linguistic…

Multiagent Systems · Computer Science · 2026-05-20 · Shuhui Zhu, Yue Lin, Shriya Kaistha, Wenhao Li, Baoxiang Wang, Hongyuan Zha, Gillian K. Hadfield, Pascal Poupart

The Reciprocity Gradient

Communication is fundamental to sustaining reciprocity and cooperation in strategic interactions. We identify and formulate the influence attribution problem as the central optimization difficulty inherent in such dynamics for a learning…

Machine Learning · Computer Science · 2026-05-12 · Yue Lin, Pascal Poupart, Shuhui Zhu, Dan Qiao, Wenhao Li, Yuan Liu, Hongyuan Zha, Baoxiang Wang

A Practical Algorithm for Feature-Rich, Non-Stationary Bandit Problems

Contextual bandits are incredibly useful in many practical problems. We go one step further by devising a more realistic problem that combines: (1) contextual bandits with dense arm features, (2) non-linear reward functions, and (3) a…

Machine Learning · Computer Science · 2026-03-18 · Wei Min Loh, Sajib Kumer Sinha, Ankur Agarwal, Pascal Poupart

Policy-Conditioned Policies for Multi-Agent Task Solving

In multi-agent tasks, the central challenge lies in the dynamic adaptation of strategies. However, directly conditioning on opponents' strategies is intractable in the prevalent deep reinforcement learning paradigm due to a fundamental…

Computer Science and Game Theory · Computer Science · 2025-12-25 · Yue Lin, Shuhui Zhu, Wenhao Li, Ang Li, Dan Qiao, Pascal Poupart, Hongyuan Zha, Baoxiang Wang

Image-POSER: Reflective RL for Multi-Expert Image Generation and Editing

Recent advances in text-to-image generation have produced strong single-shot models, yet no individual system reliably executes the long, compositional prompts typical of creative workflows. We introduce Image-POSER, a reflective…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-18 · Hossein Mohebbi, Mohammed Abdulrahman, Yanting Miao, Pascal Poupart, Suraj Kothawade

Chrysalis: A Unified System for Comparing Active Teaching and Passive Learning with AI Agents in Education

AI-assisted learning has seen a remarkable uptick over the last few years, mainly due to the rise in popularity of Large Language Models (LLMs). Their ability to hold long-form, natural language interactions with users makes them excellent…

Human-Computer Interaction · Computer Science · 2025-10-08 · Prashanth Arun, Vinita Vader, Erya Xu, Brent McCready-Branch, Sarah Seabrook, Kyle Scholz, Ana Crisan, Igor Grossmann, Pascal Poupart

Time Is Effort: Estimating Human Post-Editing Time for Grammar Error Correction Tool Evaluation

Text editing can involve several iterations of revision. Incorporating an efficient Grammar Error Correction (GEC) tool in the initial correction round can significantly impact further human editing effort and final text quality. This…

Computation and Language · Computer Science · 2025-10-07 · Ankit Vadehra, Bill Johnson, Gene Saunders, Pascal Poupart

A Critical Look At Tokenwise Reward-Guided Text Generation

Large language models (LLMs) can be improved by aligning with human preferences through fine-tuning -- the so-called reinforcement learning from human feedback (RLHF). However, the cost of fine-tuning an LLM is prohibitive for many users.…

Machine Learning · Computer Science · 2025-09-29 · Ahmad Rashid, Ruotian Wu, Julia Grosse, Agustinus Kristiadi, Pascal Poupart

Uncertainty-Guided Likelihood Tree Search

Tree search is a fundamental tool for planning, as many sequential decision-making problems can be framed as searching over tree-structured spaces. We propose an uncertainty-guided tree search algorithm for settings where the reward…

Machine Learning · Computer Science · 2025-09-05 · Julia Grosse, Ruotian Wu, Ahmad Rashid, Cheng Zhang, Philipp Hennig, Pascal Poupart, Agustinus Kristiadi

Towards Cost-Effective Reward Guided Text Generation

Reward-guided text generation (RGTG) has emerged as a viable alternative to offline reinforcement learning from human feedback (RLHF). RGTG methods can align baseline language models to human preferences without further training like in…

Machine Learning · Computer Science · 2025-07-08 · Ahmad Rashid, Ruotian Wu, Rongqi Fan, Hongliang Li, Agustinus Kristiadi, Pascal Poupart

Basis Transformers for Multi-Task Tabular Regression

Dealing with tabular data is challenging due to partial information, noise, and heterogeneous structure. Existing techniques often struggle to simultaneously address key aspects of tabular data such as textual information, a variable number…

Machine Learning · Computer Science · 2025-06-10 · Wei Min Loh, Jiaqi Shang, Pascal Poupart

Information Bargaining: Bilateral Commitment in Bayesian Persuasion

Bayesian persuasion, an extension of cheap-talk communication, involves an informed sender committing to a signaling scheme to influence a receiver's actions. Compared to cheap talk, this sender's commitment enables the receiver to verify…

Computer Science and Game Theory · Computer Science · 2025-06-10 · Yue Lin, Shuhui Zhu, William A Cunningham, Wenhao Li, Pascal Poupart, Hongyuan Zha, Baoxiang Wang

Reflect-then-Plan: Offline Model-Based Planning through a Doubly Bayesian Lens

Offline reinforcement learning (RL) is crucial when online exploration is costly or unsafe but often struggles with high epistemic uncertainty due to limited data. Existing methods rely on fixed conservative policies, restricting adaptivity…

Artificial Intelligence · Computer Science · 2025-06-09 · Jihwan Jeong, Xiaoyu Wang, Jingmin Wang, Scott Sanner, Pascal Poupart

Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling

The optimization of expensive black-box functions is ubiquitous in science and engineering. A common solution to this problem is Bayesian optimization (BO), which is generally comprised of two components: (i) a surrogate model and (ii) an…

Machine Learning · Computer Science · 2025-06-02 · Gustavo Sutter Pessurno de Carvalho, Mohammed Abdulrahman, Hao Wang, Sriram Ganapathi Subramanian, Marc St-Aubin, Sharon O'Sullivan, Lawrence Wan, Luis Ricardez-Sandoval, Pascal Poupart, Agustinus Kristiadi

Measures of Variability for Risk-averse Policy Gradient

Risk-averse reinforcement learning (RARL) is critical for decision-making under uncertainty, which is especially valuable in high-stake applications. However, most existing works focus on risk measures, e.g., conditional value-at-risk…

Machine Learning · Computer Science · 2025-04-16 · Yudong Luo, Yangchen Pan, Jiaqi Tan, Pascal Poupart

Learning to Negotiate via Voluntary Commitment

The partial alignment and conflict of autonomous agents lead to mixed-motive scenarios in many real-world applications. However, agents may fail to cooperate in practice even when cooperation yields a better outcome. One well known reason…

Artificial Intelligence · Computer Science · 2025-03-20 · Shuhui Zhu, Baoxiang Wang, Sriram Ganapathi Subramanian, Pascal Poupart

A Comprehensive Survey on Inverse Constrained Reinforcement Learning: Definitions, Progress and Challenges

Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit constraints that expert agents adhere to, based on their demonstration data. As an emerging research topic, ICRL has received considerable attention in…

Machine Learning · Computer Science · 2025-02-04 · Guiliang Liu, Sheng Xu, Shicheng Liu, Ashish Gaurav, Sriram Ganapathi Subramanian, Pascal Poupart

Subject-driven Text-to-Image Generation via Preference-based Reinforcement Learning

Text-to-image generative models have recently attracted considerable interest, enabling the synthesis of high-quality images from textual prompts. However, these models often lack the capability to generate specific subjects from given…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-24 · Yanting Miao, William Loh, Suraj Kothawade, Pascal Poupart, Abdullah Rashwan, Yeqing Li

Learning Soft Driving Constraints from Vectorized Scene Embeddings while Imitating Expert Trajectories

The primary goal of motion planning is to generate safe and efficient trajectories for vehicles. Traditionally, motion planning models are trained using imitation learning to mimic the behavior of human experts. However, these models often…

Robotics · Computer Science · 2024-12-10 · Niloufar Saeidi Mobarakeh, Behzad Khamidehi, Chunlin Li, Hamidreza Mirkhani, Fazel Arasteh, Mohammed Elmahgiubi, Weize Zhang, Kasra Rezaee, Pascal Poupart