Related papers: High Dimensional Human Guided Machine Learning

The effect of different feature selection methods on models created with XGBoost

This study examines the effect that different feature selection methods have on models created with XGBoost, a popular machine learning algorithm with superb regularization methods. It shows that three different ways for reducing the…

Machine Learning · Computer Science 2024-11-12 Jorge Neyra, Vishal B. Siramshetty, Huthaifa I. Ashqar

A Comparative Analysis of XGBoost

XGBoost is a scalable ensemble technique based on gradient boosting that has demonstrated to be a reliable and efficient machine learning challenge solver. This work proposes a practical analysis of how this novel technique works in terms…

Machine Learning · Computer Science 2023-05-05 Candice Bentéjac, Anna Csörgő, Gonzalo Martínez-Muñoz

Human-like machine learning: limitations and suggestions

This paper attempts to address the issues of machine learning in its current implementation. It is known that machine learning algorithms require a significant amount of data for training purposes, whereas recent developments in deep…

Machine Learning · Computer Science 2018-11-16 Georgios Mastorakis

Tabular Data: Deep Learning is Not All You Need

A key element in solving real-life data science problems is selecting the types of models to use. Tree ensemble models (such as XGBoost) are usually recommended for classification and regression problems with tabular data. However, several…

Machine Learning · Computer Science 2021-11-24 Ravid Shwartz-Ziv, Amitai Armon

A Comparison of Modeling Preprocessing Techniques

This paper compares the performance of various data processing methods in terms of predictive performance for structured data. This paper also seeks to identify and recommend preprocessing methodologies for tree-based binary classification…

Methodology · Statistics 2023-02-27 Tosan Johnson, Alice J. Liu, Syed Raza, Aaron McGuire

Dynamic Model Switching for Improved Accuracy in Machine Learning

In the dynamic landscape of machine learning, where datasets vary widely in size and complexity, selecting the most effective model poses a significant challenge. Rather than fixating on a single model, our research propels the field…

Machine Learning · Computer Science 2024-05-01 Syed Tahir Abbas Hasani

A Fair and Efficient Hybrid Federated Learning Framework based on XGBoost for Distributed Power Prediction

In a modern power system, real-time data on power generation/consumption and its relevant features are stored in various distributed parties, including household meters, transformer stations and external organizations. To fully exploit the…

Machine Learning · Computer Science 2022-01-11 Haizhou Liu, Xuan Zhang, Xinwei Shen, Hongbin Sun

Classification Under Human Assistance

Most supervised learning models are trained for full automation. However, their predictions are sometimes worse than those by human experts on some specific instances. Motivated by this empirical observation, our goal is to design…

Machine Learning · Statistics 2021-03-16 Abir De, Nastaran Okati, Ali Zarezade, Manuel Gomez-Rodriguez

An Empirical Analysis of Feature Engineering for Predictive Modeling

Machine learning models, such as neural networks, decision trees, random forests, and gradient boosting machines, accept a feature vector, and provide a prediction. These models learn in a supervised fashion where we provide feature vectors…

Machine Learning · Computer Science 2020-11-03 Jeff Heaton

Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists

We present a benchmark for large language models designed to tackle one of the most knowledge-intensive tasks in data science: writing feature engineering code, which requires domain knowledge in addition to a deep understanding of the…

Computation and Language · Computer Science 2024-11-01 Michał Pietruszka, Łukasz Borchmann, Aleksander Jędrosz, Paweł Morawiecki

Learned versus Hand-Designed Feature Representations for 3d Agglomeration

For image recognition and labeling tasks, recent results suggest that machine learning methods that rely on manually specified feature representations may be outperformed by methods that automatically derive feature representations based on…

Computer Vision and Pattern Recognition · Computer Science 2013-12-24 John A. Bogovic, Gary B. Huang, Viren Jain

The Duet of Representations and How Explanations Exacerbate It

An algorithm effects a causal representation of relations between features and labels in the human's perception. Such a representation might conflict with the human's prior belief. Explanations can direct the human's attention to the…

Human-Computer Interaction · Computer Science 2024-02-14 Charles Wan, Rodrigo Belo, Leid Zejnilović, Susana Lavado

Cyborg Data: Merging Human with AI Generated Training Data

Automated scoring (AS) systems used in large-scale assessment have traditionally used small statistical models that require a large quantity of hand-scored data to make accurate predictions, which can be time-consuming and costly.…

Machine Learning · Computer Science 2025-04-01 Kai North, Christopher Ormerod

Human-Understandable Decision Making for Visual Recognition

The widespread use of deep neural networks has achieved substantial success in many tasks. However, there still exists a huge gap between the operating mechanism of deep learning models and human-understandable decision making, so that…

Artificial Intelligence · Computer Science 2021-03-08 Xiaowei Zhou, Jie Yin, Ivor Tsang, Chen Wang

Building and Testing a General Intelligence Embodied in a Humanoid Robot

Machines with human-level intelligence should be able to do most economically valuable work. This aligns a major economic incentive with the scientific grand challenge of building a human-like mind. Here we describe our approach to building…

Robotics · Computer Science 2023-08-01 Suzanne Gildert, Geordie Rose

XGenBoost: Synthesizing Small and Large Tabular Datasets with XGBoost

Tree ensembles such as XGBoost are often preferred for discriminative tasks in mixed-type tabular data, due to their inductive biases, minimal hyperparameter tuning, and training efficiency. We argue that these qualities, when leveraged…

Machine Learning · Computer Science 2026-03-10 Jim Achterberg, Marcel Haas, Bram van Dijk, Marco Spruit

A Peek at Peak Emotion Recognition

Despite much progress in the field of facial expression recognition, little attention has been paid to the recognition of peak emotion. Aviezer et al. [1] showed that humans have trouble discerning between positive and negative peak…

Computer Vision and Pattern Recognition · Computer Science 2022-05-23 Tzvi Michelson, Hillel Aviezer, Shmuel Peleg

Towards Goal, Feasibility, and Diversity-Oriented Deep Generative Models in Design

Deep Generative Machine Learning Models (DGMs) have been growing in popularity across the design community thanks to their ability to learn and mimic complex data distributions. DGMs are conventionally trained to minimize statistical…

Machine Learning · Computer Science 2022-06-16 Lyle Regenwetter, Faez Ahmed

Evaluating the Efficacy of Hybrid Deep Learning Models in Distinguishing AI-Generated Text

My research investigates the use of cutting-edge hybrid deep learning models to accurately differentiate between AI-generated text and human writing. I applied a robust methodology, utilising a carefully selected dataset comprising AI and…

Computation and Language · Computer Science 2024-01-17 Abiodun Finbarrs Oketunji

Human vs. supervised machine learning: Who learns patterns faster?

The capabilities of supervised machine learning (SML), especially compared to human abilities, are being discussed in scientific research and in the usage of SML. This study provides an answer to how learning performance differs between…

Artificial Intelligence · Computer Science 2020-12-08 Niklas Kühl, Marc Goutier, Lucas Baier, Clemens Wolff, Dominik Martin