Related papers: Machine Learning using Stata/Python

pystacked: Stacking generalization and machine learning in Stata

pystacked implements stacked generalization (Wolpert, 1992) for regression and binary classification via Python's scikit-learn. Stacking combines multiple supervised machine learners -- the "base" or "level-0" learners -- into a single…

Econometrics · Economics 2023-03-07 Achim Ahrens, Christian B. Hansen, Mark E. Schaffer

Stable Linear Subspace Identification: A Machine Learning Approach

Machine Learning (ML) and linear System Identification (SI) have been historically developed independently. In this paper, we leverage well-established ML tools - especially the automatic differentiation framework - to introduce SIMBa, a…

Systems and Control · Electrical Eng. & Systems 2024-03-27 Loris Di Natale, Muhammad Zakwan, Bratislav Svetozarevic, Philipp Heer, Giancarlo Ferrari-Trecate, Colin N. Jones

ddml: Double/debiased machine learning in Stata

We introduce the package ddml for Double/Debiased Machine Learning (DDML) in Stata. Estimators of causal parameters for five different econometric models are supported, allowing for flexible estimation of causal effects of endogenous…

Econometrics · Economics 2024-01-09 Achim Ahrens, Christian B. Hansen, Mark E. Schaffer, Thomas Wiemann

Enhancing Automata Learning with Statistical Machine Learning: A Network Security Case Study

Intrusion detection systems are crucial for network security. Verification of these systems is complicated by various factors, including the heterogeneity of network platforms and the continuously changing landscape of cyber threats. In…

Cryptography and Security · Computer Science 2024-07-16 Negin Ayoughi, Shiva Nejati, Mehrdad Sabetzadeh, Patricio Saavedra

Supervised Ontology and Instance Matching with MELT

In this paper, we present MELT-ML, a machine learning extension to the Matching and EvaLuation Toolkit (MELT) which facilitates the application of supervised learning for ontology and instance matching. Our contributions are twofold: We…

Artificial Intelligence · Computer Science 2020-09-24 Sven Hertling, Jan Portisch, Heiko Paulheim

STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline Facilitating Data Analysis and Algorithm Comparison

Machine learning (ML) offers powerful methods for detecting and modeling associations often in data with large feature spaces and complex associations. Many useful tools/packages (e.g. scikit-learn) have been developed to make the various…

Machine Learning · Computer Science 2022-06-27 Ryan J. Urbanowicz, Robert Zhang, Yuhan Cui, Pranshu Suri

STAL: Spike Threshold Adaptive Learning Encoder for Classification of Pain-Related Biosignal Data

This paper presents the first application of spiking neural networks (SNNs) for the classification of chronic lower back pain (CLBP) using the EmoPain dataset. Our work has two main contributions. We introduce Spike Threshold Adaptive…

Machine Learning · Computer Science 2024-07-12 Freek Hens, Mohammad Mahdi Dehshibi, Leila Bagheriye, Mahyar Shahsavari, Ana Tajadura-Jiménez

Scikit-learn: Machine Learning in Python

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a…

Machine Learning · Computer Science 2018-06-06 Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay

Skill-Targeted Adaptive Training

Language models often show little to no improvement (i.e., "saturation") when trained via vanilla supervised fine-tuning (SFT) on data similar to what they saw in their training set (e.g., MATH). We introduce a new fine-tuning strategy,…

Machine Learning · Computer Science 2025-10-14 Yinghui He, Abhishek Panigrahi, Yong Lin, Sanjeev Arora

Learnable Manifold Alignment (LeMA) : A Semi-supervised Cross-modality Learning Framework for Land Cover and Land Use Classification

In this paper, we aim at tackling a general but interesting cross-modality feature learning question in remote sensing community --- can a limited amount of highly-discriminative (e.g., hyperspectral) training data improve the performance…

Computer Vision and Pattern Recognition · Computer Science 2019-12-19 Danfeng Hong, Naoto Yokoya, Nan Ge, Jocelyn Chanussot, Xiao Xiang Zhu

Systematic Literature Review on Application of Machine Learning in Continuous Integration

This research conducted a systematic review of the literature on machine learning (ML)-based methods in the context of Continuous Integration (CI) over the past 22 years. The study aimed to identify and describe the techniques used in…

Software Engineering · Computer Science 2023-07-18 Ali Kazemi Arani, Triet Huynh Minh Le, Mansooreh Zahedi, Muhammad Ali Babar

Machine Learning Automation Toolbox (MLaut)

In this paper we present MLaut (Machine Learning AUtomation Toolbox) for the Python data science ecosystem. MLaut automates large-scale evaluation and benchmarking of machine learning algorithms on a large number of datasets. MLaut provides…

Machine Learning · Computer Science 2019-01-14 Viktor Kazakov, Franz J. Király

ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation

Self-training via pseudo labeling is a conventional, simple, and popular pipeline to leverage unlabeled data. In this work, we first construct a strong baseline of self-training (namely ST) for semi-supervised semantic segmentation via…

Computer Vision and Pattern Recognition · Computer Science 2022-03-04 Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, Yang Gao

Optimal Policy Learning for Multi-Action Treatment with Risk Preference using Stata

This paper presents the Stata community-distributed command "opl_ma_fb" (and the companion command "opl_ma_vf"), for implementing the first-best Optimal Policy Learning (OPL) algorithm to estimate the best treatment assignment given the…

Econometrics · Economics 2025-09-09 Giovanni Cerulli

The Materials Simulation Toolkit for Machine Learning (MAST-ML): an automated open source toolkit to accelerate data-driven materials research

As data science and machine learning methods are taking on an increasingly important role in the materials research community, there is a need for the development of machine learning software tools that are easy to use (even for nonexperts…

Computational Physics · Physics 2020-06-26 Ryan Jacobs, Tam Mayeshiba, Ben Afflerbach, Luke Miles, Max Williams, Matthew Turner, Raphael Finkel, Dane Morgan

A New Flexible Train-Test Split Algorithm, an approach for choosing among the Hold-out, K-fold cross-validation, and Hold-out iteration

Choosing an appropriate strategy for partitioning data into training and evaluation sets is a critical step in machine learning, yet validation methods are often selected using default or conventional settings without considering their…

Machine Learning · Computer Science 2026-01-05 Zahra Bami, Ali Behnampour, Aniruddha Bora, Hassan Doosti

On the use of machine learning algorithms in the measurement of stellar magnetic fields

Regression methods based in Machine Learning Algorithms (MLA) have become an important tool for data analysis in many different disciplines. In this work, we use MLA in an astrophysical context; our goal is to measure the mean longitudinal…

Solar and Stellar Astrophysics · Physics 2018-11-07 J. C. Ramirez-Velez, C. Yañez-Marquez, J. P. Cordova-Barbosa

Self-learning sparse PCA for multimode process monitoring

This paper proposes a novel sparse principal component analysis algorithm with self-learning ability for successive modes, where synaptic intelligence is employed to measure the importance of variables and a regularization term is added to…

Machine Learning · Computer Science 2021-08-10 Jingxin Zhang, Donghua Zhou, Maoyin Chen

LGML: Logic Guided Machine Learning

We introduce Logic Guided Machine Learning (LGML), a novel approach that symbiotically combines machine learning (ML) and logic solvers with the goal of learning mathematical functions from data. LGML consists of two phases, namely a…

Artificial Intelligence · Computer Science 2021-03-31 Joseph Scott, Maysum Panju, Vijay Ganesh

Best Practices for Machine Learning Experimentation in Scientific Applications

Machine learning (ML) is increasingly adopted in scientific research, yet the quality and reliability of results often depend on how experiments are designed and documented. Poor baselines, inconsistent preprocessing, or insufficient…

Machine Learning · Computer Science 2025-12-01 Umberto Michelucci, Francesca Venturini