Related papers: Closed-Form Test Functions for Biophysical Sequenc…
Although large language models (LLMs) have shown promise in biomolecule optimization problems, they incur heavy computational costs and struggle to satisfy precise constraints. On the other hand, specialized solvers like LaMBO-2 offer…
Accurate force fields are necessary for predictive molecular simulations. However, developing force fields that accurately reproduce experimental properties is challenging. Here, we present a machine learning directed, multiobjective…
This paper presents a comprehensive evaluation of cost-efficient Large Language Models (LLMs) for diverse biomedical tasks spanning both text and image modalities. We evaluated a range of closed-source and open-source LLMs on tasks such as…
We have employed a recent implementation of genetic algorithms to study a range of standard benchmark functions for global optimization. It turns out that some of them are not very useful as challenging test functions, since they neither…
Optimizing discrete black-box functions is key in several domains, e.g. protein engineering and drug design. Due to the lack of gradient information and the need for sample efficiency, Bayesian optimization is an ideal candidate for these…
Feature Transformation (FT) is a core data-centric AI task that improves feature space quality to advance downstream predictive performance. However, discovering effective transformations remains challenging due to the large space of…
Pretraining DNA language models (DNALMs) on the full human genome is resource-intensive, yet often considered necessary for strong downstream performance. Inspired by recent findings in NLP and long-context modeling, we explore an…
Test functions are important to validate and compare the performance of optimization algorithms. There have been many test or benchmark functions reported in the literature; however, there is no standard list or set of benchmark functions.…
Constructing first-principles models is usually a challenging and time-consuming task due to the complexity of the real-life processes. On the other hand, data-driven modeling, and in particular neural network models often suffer from…
While current benchmark reinforcement learning (RL) tasks have been useful to drive progress in the field, they are in many ways poor substitutes for learning with real-world data. By testing increasingly complex RL algorithms on…
Deep neural networks, when optimized with sufficient data, provide accurate representations of high-dimensional functions; in contrast, function approximation techniques that have predominated in scientific computing do not scale well with…
We develop a combined machine learning (ML) and quantum mechanics approach that enables data-efficient reconstruction of flexible molecular force fields from high-level ab initio calculations, through the consideration of fundamental…
Machine Learning (ML) has been widely used for modeling and predicting physical systems. These techniques offer high expressive power and good generalizability for interpolation within observed data sets. However, the disadvantage of…
Optimizing efficiently over discrete data with only a few available target observations is a challenge in Bayesian optimization. We propose a continuous relaxation of the objective function and show that inference and optimization can be…
A good feature representation is a determining factor in achieving high classification performance for many machine learning algorithms. This is especially true for techniques that do not build complex internal representations of…
Real-world experiments are expensive, and thus it is important to reach a target in a minimum number of experiments. Experimental processes often involve control variables that change over time. Such problems can be formulated as a…
Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used…
This paper studies simple bilevel problems, where a convex upper-level function is minimized over the optimal solutions of a convex lower-level problem. We first show the fundamental difficulty of simple bilevel problems, that the…
Large language models (LLMs) achieve impressive results over various tasks, and ever-expanding public repositories contain an abundance of pre-trained models. Therefore, identifying the best-performing LLM for a given task is a significant…
Feature selection for a given model can be transformed into an optimization task. The essential idea behind it is to find the most suitable subset of features according to some criterion. Nature-inspired optimization can mitigate this…