Related papers: IMO$^3$: Interactive Multi-Objective Off-Policy Op…

200 papers

Multi-objective optimization is a type of decision-making problem in which multiple conflicting objectives are optimized. We study offline optimization of multi-objective policies from data collected by an existing policy. We propose a…

Machine Learning · Computer Science 2023-10-31 Shima Alizadeh, Aniruddha Bhargava, Karthick Gopalswamy, Lalit Jain, Branislav Kveton, Ge Liu

Offline optimization aims to maximize a black-box objective function with a static dataset and has wide applications. In addition to the objective function being black-box and expensive to evaluate, numerous complex real-world problems…

Machine Learning · Computer Science 2024-06-07 Ke Xue, Rong-Xi Tan, Xiaobin Huang, Chao Qian

Effective optimization is essential for real-world interactive systems to provide a satisfactory user experience in response to changing user behavior. However, it is often challenging to find an objective to optimize for interactive…

Artificial Intelligence · Computer Science 2020-06-24 Ziming Li, Julia Kiseleva, Alekh Agarwal, Maarten de Rijke, Ryen W. White

In multi-objective optimization problems, there might exist hidden objectives that are important to the decision-maker but are not being optimized. On the other hand, there might also exist irrelevant objectives that are being optimized but…

Optimization and Control · Mathematics 2022-06-06 Seyed Mahdi Shavarani, Manuel López-Ibáñez, Richard Allmendinger

Proximal Policy Optimization (PPO) is a popular model-free reinforcement learning algorithm, esteemed for its simplicity and efficacy. However, due to its inherent on-policy nature, its proficiency in harnessing data from disparate policies…

Machine Learning · Computer Science 2024-06-07 Yaozhong Gan, Renye Yan, Xiaoyang Tan, Zhe Wu, Junliang Xing

Off-policy evaluation (OPE) estimates the value of a target treatment policy (e.g., a recommender system) using data collected by a different logging policy. It enables high-stakes experimentation without live deployment, yet in practice…

Machine Learning · Statistics 2026-05-18 Connor Douglas, Joel Persson, Foster Provost

In real-world problems, uncertainties (e.g., errors in the measurement, precision errors) often lead to poor performance of numerical algorithms when not explicitly taken into account. This is also the case for control problems, where…

Optimization and Control · Mathematics 2020-12-18 Carlos Ignacio Hernández Castellanos, Sina Ober-Blöbaum, Sebastian Peitz

Many real-world problems require trading off multiple competing objectives. However, these objectives are often in different units and/or scales, which can make it challenging for practitioners to express numerical preferences over…

Many real-world problems can be defined as optimisation problems in which the aim is to maximise an objective function. The quality of the obtained solution is directly linked to the pertinence of the objective function used. However, designing…

Machine Learning · Computer Science 2012-04-24 Patrick Taillandier, Julien Gaffuri

Optimization has found numerous applications in engineering, particularly since the 1960s. Many optimization applications in engineering have more than one objective (or performance criterion). Such applications require multi-objective (or…

Chemical Physics · Physics 2024-07-16 Zhiyuan Wang, Seyed Reza Nabavi, Gade Pandu Rangaiah

In the field of reinforcement learning, because of the high cost and risk of policy training in the real world, policies are trained in a simulation environment and transferred to the corresponding real-world environment. However, the…

Machine Learning · Computer Science 2023-01-12 Takumi Tanabe, Rei Sato, Kazuto Fukuchi, Jun Sakuma, Youhei Akimoto

The off-policy paradigm casts recommendation as a counterfactual decision-making task, allowing practitioners to unbiasedly estimate online metrics using offline data. This leads to effective evaluation metrics, as well as learning…

Machine Learning · Computer Science 2024-09-17 Olivier Jeunen, Aleksei Ustimenko

In this paper, an active control policy design for a fractional order (FO) financial system is attempted, considering multiple conflicting objectives. An active control template as a nonlinear state feedback mechanism is developed and the…

Optimization and Control · Mathematics 2016-11-30 Indranil Pan, Saptarshi Das, Shantanu Das

Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making. The ability to learn offline is particularly important in many…

Off-policy learning is a framework for evaluating and optimizing policies without deploying them, from data collected by another policy. Real-world environments are typically non-stationary and the offline learned policies should adapt to…

Machine Learning · Computer Science 2021-04-06 Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed

Off-policy evaluation (OPE) is a method for estimating the return of a target policy using some pre-collected observational data generated by a potentially different behavior policy. In some cases, there may be unmeasured variables that can…

Machine Learning · Statistics 2023-02-03 Yang Xu, Jin Zhu, Chengchun Shi, Shikai Luo, Rui Song

The dynamic portfolio optimization problem in finance frequently requires learning policies that adhere to various constraints, driven by investor preferences and risk. We motivate this problem of finding an allocation policy within a…

Artificial Intelligence · Computer Science 2020-12-23 Nymisha Bandi, Theja Tulabandhula

There has been growing interest in off-policy evaluation in areas such as recommender systems and personalized medicine. We have so far seen significant progress in developing estimators aimed at accurately estimating the…

Machine Learning · Computer Science 2024-04-24 Yuta Saito, Masahiro Nomura

Effective optimization is essential for interactive systems to provide a satisfactory user experience. However, it is often challenging to find an objective to optimize for. Generally, such objectives are manually crafted and rarely capture…

Artificial Intelligence · Computer Science 2019-12-17 Ziming Li, Julia Kiseleva, Alekh Agarwal, Maarten de Rijke

In many settings, a decision-maker wishes to learn a rule, or policy, that maps from observable characteristics of an individual to an action. Examples include selecting offers, prices, advertisements, or emails to send to consumers, as…

Machine Learning · Statistics 2018-11-20 Zhengyuan Zhou, Susan Athey, Stefan Wager