Related papers: Machine Learning using Stata/Python

200 papers

pystacked implements stacked generalization (Wolpert, 1992) for regression and binary classification via Python's scikit-learn. Stacking combines multiple supervised machine learners -- the "base" or "level-0" learners -- into a single…

Econometrics · Economics 2023-03-07 Achim Ahrens , Christian B. Hansen , Mark E. Schaffer

Machine Learning (ML) and linear System Identification (SI) have been historically developed independently. In this paper, we leverage well-established ML tools - especially the automatic differentiation framework - to introduce SIMBa, a…

Systems and Control · Electrical Eng. & Systems 2024-03-27 Loris Di Natale , Muhammad Zakwan , Bratislav Svetozarevic , Philipp Heer , Giancarlo Ferrari-Trecate , Colin N. Jones

We introduce the package ddml for Double/Debiased Machine Learning (DDML) in Stata. Estimators of causal parameters for five different econometric models are supported, allowing for flexible estimation of causal effects of endogenous…

Econometrics · Economics 2024-01-09 Achim Ahrens , Christian B. Hansen , Mark E. Schaffer , Thomas Wiemann

Intrusion detection systems are crucial for network security. Verification of these systems is complicated by various factors, including the heterogeneity of network platforms and the continuously changing landscape of cyber threats. In…

Cryptography and Security · Computer Science 2024-07-16 Negin Ayoughi , Shiva Nejati , Mehrdad Sabetzadeh , Patricio Saavedra

In this paper, we present MELT-ML, a machine learning extension to the Matching and EvaLuation Toolkit (MELT) which facilitates the application of supervised learning for ontology and instance matching. Our contributions are twofold: We…

Artificial Intelligence · Computer Science 2020-09-24 Sven Hertling , Jan Portisch , Heiko Paulheim

Machine learning (ML) offers powerful methods for detecting and modeling associations often in data with large feature spaces and complex associations. Many useful tools/packages (e.g. scikit-learn) have been developed to make the various…

Machine Learning · Computer Science 2022-06-27 Ryan J. Urbanowicz , Robert Zhang , Yuhan Cui , Pranshu Suri

This paper presents the first application of spiking neural networks (SNNs) for the classification of chronic lower back pain (CLBP) using the EmoPain dataset. Our work has two main contributions. We introduce Spike Threshold Adaptive…

Machine Learning · Computer Science 2024-07-12 Freek Hens , Mohammad Mahdi Dehshibi , Leila Bagheriye , Mahyar Shahsavari , Ana Tajadura-Jiménez

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a…

Language models often show little to no improvement (i.e., "saturation") when trained via vanilla supervised fine-tuning (SFT) on data similar to what they saw in their training set (e.g., MATH). We introduce a new fine-tuning strategy,…

Machine Learning · Computer Science 2025-10-14 Yinghui He , Abhishek Panigrahi , Yong Lin , Sanjeev Arora

In this paper, we aim at tackling a general but interesting cross-modality feature learning question in remote sensing community --- can a limited amount of highly-discriminative (e.g., hyperspectral) training data improve the performance…

Computer Vision and Pattern Recognition · Computer Science 2019-12-19 Danfeng Hong , Naoto Yokoya , Nan Ge , Jocelyn Chanussot , Xiao Xiang Zhu

This research conducted a systematic review of the literature on machine learning (ML)-based methods in the context of Continuous Integration (CI) over the past 22 years. The study aimed to identify and describe the techniques used in…

Software Engineering · Computer Science 2023-07-18 Ali Kazemi Arani , Triet Huynh Minh Le , Mansooreh Zahedi , Muhammad Ali Babar

In this paper we present MLaut (Machine Learning AUtomation Toolbox) for the python data science ecosystem. MLaut automates large-scale evaluation and benchmarking of machine learning algorithms on a large number of datasets. MLaut provides…

Machine Learning · Computer Science 2019-01-14 Viktor Kazakov , Franz J. Király

Self-training via pseudo labeling is a conventional, simple, and popular pipeline to leverage unlabeled data. In this work, we first construct a strong baseline of self-training (namely ST) for semi-supervised semantic segmentation via…

Computer Vision and Pattern Recognition · Computer Science 2022-03-04 Lihe Yang , Wei Zhuo , Lei Qi , Yinghuan Shi , Yang Gao

This paper presents the Stata community-distributed command "opl_ma_fb" (and the companion command "opl_ma_vf"), for implementing the first-best Optimal Policy Learning (OPL) algorithm to estimate the best treatment assignment given the…

Econometrics · Economics 2025-09-09 Giovanni Cerulli

As data science and machine learning methods are taking on an increasingly important role in the materials research community, there is a need for the development of machine learning software tools that are easy to use (even for nonexperts…

Computational Physics · Physics 2020-06-26 Ryan Jacobs , Tam Mayeshiba , Ben Afflerbach , Luke Miles , Max Williams , Matthew Turner , Raphael Finkel , Dane Morgan

Choosing an appropriate strategy for partitioning data into training and evaluation sets is a critical step in machine learning, yet validation methods are often selected using default or conventional settings without considering their…

Machine Learning · Computer Science 2026-01-05 Zahra Bami , Ali Behnampour , Aniruddha Bora , Hassan Doosti

Regression methods based in Machine Learning Algorithms (MLA) have become an important tool for data analysis in many different disciplines. In this work, we use MLA in an astrophysical context; our goal is to measure the mean longitudinal…

Solar and Stellar Astrophysics · Physics 2018-11-07 J. C. Ramirez-Velez , C. Yañez-Marquez , J. P. Cordova-Barbosa

This paper proposes a novel sparse principal component analysis algorithm with self-learning ability for successive modes, where synaptic intelligence is employed to measure the importance of variables and a regularization term is added to…

Machine Learning · Computer Science 2021-08-10 Jingxin Zhang , Donghua Zhou , Maoyin Chen

We introduce Logic Guided Machine Learning (LGML), a novel approach that symbiotically combines machine learning (ML) and logic solvers with the goal of learning mathematical functions from data. LGML consists of two phases, namely a…

Artificial Intelligence · Computer Science 2021-03-31 Joseph Scott , Maysum Panju , Vijay Ganesh

Machine learning (ML) is increasingly adopted in scientific research, yet the quality and reliability of results often depend on how experiments are designed and documented. Poor baselines, inconsistent preprocessing, or insufficient…

Machine Learning · Computer Science 2025-12-01 Umberto Michelucci , Francesca Venturini