Related papers: CoFEH: LLM-driven Feature Engineering Empowered by…

LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers

Automated feature engineering plays a critical role in improving predictive model performance for tabular learning tasks. Traditional automated feature engineering methods are limited by their reliance on pre-defined transformations within…

Machine Learning · Computer Science 2026-05-12 Nikhil Abhyankar, Parshin Shojaee, Chandan K. Reddy

CoFEE: Reasoning Control for LLM-Based Feature Discovery

Feature discovery from complex unstructured data is fundamentally a reasoning problem: it requires identifying abstractions that are predictive of a target outcome while avoiding leakage, proxies, and post-outcome signals. With the…

Artificial Intelligence · Computer Science 2026-04-24 Maximilian Westermann, Ben Griffin, Aaron Ontoyin Yin, Zakari Salifu, Yagiz Ihlamur, Kelvin Amoaba, Joseph Ternasky, Fuat Alican, Yigit Ihlamur

Large Language Models for Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering

As the field of automated machine learning (AutoML) advances, it becomes increasingly important to incorporate domain knowledge into these systems. We present an approach for doing so by harnessing the power of large language models (LLMs).…

Artificial Intelligence · Computer Science 2023-10-02 Noah Hollmann, Samuel Müller, Frank Hutter

Toward Efficient Automated Feature Engineering

Automated Feature Engineering (AFE) refers to automatically generating and selecting optimal feature sets for downstream tasks, which has achieved great success in real-world applications. Current AFE methods mainly focus on improving the…

Machine Learning · Computer Science 2022-12-27 Kafeng Wang, Pengyang Wang, Chengzhong Xu

From Understanding to Excelling: Template-Free Algorithm Design through Structural-Functional Co-Evolution

Large language models (LLMs) have greatly accelerated the automation of algorithm generation and optimization. However, current methods such as EoH and FunSearch mainly rely on predefined templates and expert-specified functions that focus…

Software Engineering · Computer Science 2025-03-17 Zhe Zhao, Haibin Wen, Pengkun Wang, Ye Wei, Zaixi Zhang, Xi Lin, Fei Liu, Bo An, Hui Xiong, Yang Wang, Qingfu Zhang

Federated Automated Feature Engineering

Automated feature engineering (AutoFE) is used to automatically create new features from original features to improve predictive performance without needing significant human intervention and domain expertise. Many algorithms exist for…

Machine Learning · Computer Science 2025-04-23 Tom Overman, Diego Klabjan

The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks

Existing feature engineering methods based on large language models (LLMs) have not yet been applied to multi-label learning tasks. They lack the ability to model complex label dependencies and are not specifically adapted to the…

Machine Learning · Computer Science 2025-12-18 Wanfu Gao, Zebin He, Jun Gao

Large Language Model Agent as a Mechanical Designer

Conventional mechanical design follows an iterative process in which initial concepts are refined through cycles of expert assessment and resource-intensive Finite Element Method (FEM) analysis to meet performance goals. While machine…

Machine Learning · Computer Science 2025-05-02 Yayati Jadhav, Amir Barati Farimani

Large Language Models as End-to-end Combinatorial Optimization Solvers

Combinatorial optimization (CO) problems, central to decision-making scenarios like logistics and manufacturing, are traditionally solved using problem-specific algorithms requiring significant domain expertise. While large language models…

Artificial Intelligence · Computer Science 2025-09-24 Xia Jiang, Yaoxin Wu, Minshuo Li, Zhiguang Cao, Yingqian Zhang

AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering

Protein engineering is important for biomedical applications, but conventional approaches are often inefficient and resource-intensive. While deep learning (DL) models have shown promise, their training or implementation into protein…

Quantitative Methods · Quantitative Biology 2024-11-08 Yungeng Liu, Zan Chen, Yu Guang Wang, Yiqing Shen

A Meta-Knowledge-Augmented LLM Framework for Hyperparameter Optimization in Time-Series Forecasting

Hyperparameter optimization (HPO) plays a central role in the performance of deep learning models, yet remains computationally expensive and difficult to interpret, particularly for time-series forecasting. While Bayesian Optimization (BO)…

Machine Learning · Computer Science 2026-02-17 Ons Saadallah, Mátyás Andó, Tamás Gábor Orosz

OpenFE: Automated Feature Generation with Expert-level Performance

The goal of automated feature generation is to liberate machine learning experts from the laborious task of manual feature generation, which is crucial for improving the learning performance of tabular data. The major challenge in automated…

Machine Learning · Computer Science 2023-06-06 Tianping Zhang, Zheyu Zhang, Zhiyuan Fan, Haoyan Luo, Fengyuan Liu, Qian Liu, Wei Cao, Jian Li

IIFE: Interaction Information Based Automated Feature Engineering

Automated feature engineering (AutoFE) is the process of automatically building and selecting new features that help improve downstream predictive performance. While traditional feature engineering requires significant domain expertise and…

Machine Learning · Computer Science 2025-02-28 Tom Overman, Diego Klabjan, Jean Utke

Using Large Language Models for Hyperparameter Optimization

This paper explores the use of foundational large language models (LLMs) in hyperparameter optimization (HPO). Hyperparameters are critical in determining the effectiveness of machine learning models, yet their optimization often relies on…

Machine Learning · Computer Science 2024-11-12 Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba

Human-LLM Collaborative Feature Engineering for Tabular Data

Large language models (LLMs) are increasingly used to automate feature engineering in tabular learning. Given task-specific information, LLMs can propose diverse feature transformation operations to enhance downstream model performance.…

Machine Learning · Computer Science 2026-01-30 Zhuoyan Li, Aditya Bansal, Jinzhao Li, Shishuang He, Zhuoran Lu, Mutian Zhang, Qin Liu, Yiwei Yang, Swati Jain, Ming Yin, Yunyao Li

LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms

Bayesian optimization (BO) is a powerful class of algorithms for optimizing expensive black-box functions, but designing effective BO algorithms remains a manual, expertise-driven task. Recent advancements in Large Language Models (LLMs)…

Machine Learning · Computer Science 2025-05-28 Wenhu Li, Niki van Stein, Thomas Bäck, Elena Raponi

COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models

Leveraging the powerful reasoning capabilities of large language models (LLMs), recent LLM-based robot task planning methods yield promising results. However, they mainly focus on single or multiple homogeneous robots on simple tasks.…

Robotics · Computer Science 2025-04-01 Kehui Liu, Zixin Tang, Dong Wang, Zhigang Wang, Xuelong Li, Bin Zhao

OBOE: Collaborative Filtering for AutoML Model Selection

Algorithm selection and hyperparameter tuning remain two of the most challenging tasks in machine learning. Automated machine learning (AutoML) seeks to automate these tasks to enable widespread use of machine learning by non-experts. This…

Machine Learning · Computer Science 2019-05-22 Chengrun Yang, Yuji Akimoto, Dae Won Kim, Madeleine Udell

LLM4CMO: Large Language Model-aided Algorithm Design for Constrained Multiobjective Optimization

Constrained multi-objective optimization problems (CMOPs) frequently arise in real-world applications where multiple conflicting objectives must be optimized under complex constraints. Existing dual-population two-stage algorithms have…

Neural and Evolutionary Computing · Computer Science 2025-10-27 Zhen-Song Chen, Hong-Wei Ding, Xian-Jia Wang, Witold Pedrycz

FeRG-LLM : Feature Engineering by Reason Generation Large Language Models

One of the key tasks in machine learning for tabular data is feature engineering. Although it is vital for improving the performance of models, it demands considerable human expertise and deep domain knowledge, making it labor-intensive…

Computation and Language · Computer Science 2025-04-01 Jeonghyun Ko, Gyeongyun Park, Donghoon Lee, Kyunam Lee