Related papers: Factor Engine: A Python Library for Systematic Fin…

FactorEngine: A Program-level Knowledge-Infused Factor Mining Framework for Quantitative Investment

We study alpha factor mining, the automated discovery of predictive signals from noisy, non-stationary market data, under a practical requirement that mined factors be directly executable and auditable, and that the discovery process remain…

Artificial Intelligence · Computer Science 2026-04-10 Qinhong Lin, Ruitao Feng, Yinglun Feng, Zhenxin Huang, Yukun Chen, Zhongliang Yang, Linna Zhou, Binjie Fei, Jiaqi Liu, Yu Li

Bayesian Quantile Factor Models

Factor analysis is a flexible technique for assessment of multivariate dependence and codependence. Besides being an exploratory tool used to reduce the dimensionality of multivariate data, it allows estimation of common factors that often…

Methodology · Statistics 2020-02-19 Kelly C. M. Gonçalves, Afonso C. B. Silva

FactorMiner: A Self-Evolving Agent with Skills and Experience Memory for Financial Alpha Discovery

Formulaic alpha factor mining is a critical yet challenging task in quantitative investment, characterized by a vast search space and the need for domain-informed, interpretable signals. However, finding novel signals becomes increasingly…

Trading and Market Microstructure · Quantitative Finance 2026-02-17 Yanlong Wang, Jian Xu, Hongkang Zhang, Shao-Lun Huang, Danny Dongning Sun, Xiao-Ping Zhang

The autofeat Python Library for Automated Feature Engineering and Selection

This paper describes the autofeat Python library, which provides scikit-learn style linear regression and classification models with automated feature engineering and selection capabilities. Complex non-linear machine learning models, such…

Machine Learning · Computer Science 2020-02-27 Franziska Horn, Robert Pack, Michael Rieger

scikit-fda: A Python Package for Functional Data Analysis

The library scikit-fda is a Python package for Functional Data Analysis (FDA). It provides a comprehensive set of tools for representation, preprocessing, and exploratory analysis of functional data. The library is built upon and integrated…

Computation · Statistics 2024-09-04 Carlos Ramos-Carreño, José Luis Torrecilla, Miguel Carbajo-Berrocal, Pablo Marcos, Alberto Suárez

Genome-Factory: A Library for Tuning, Deploying, and Interpreting Genomic Foundation Models

We introduce Genome-Factory, the first integrated Python library for tuning, deploying, and interpreting genomic foundation models. Our core contribution is to simplify and unify the workflow for genomic model development: data collection,…

Genomics · Quantitative Biology 2026-05-18 Weimin Wu, Xuefeng Song, Yibo Wen, Qinjie Lin, Zhihan Zhou, Jerry Yao-Chieh Hu, Zhong Wang, Han Liu

FAT Forensics: A Python Toolbox for Implementing and Deploying Fairness, Accountability and Transparency Algorithms in Predictive Systems

Predictive systems, in particular machine learning algorithms, can take important, and sometimes legally binding, decisions about our everyday life. In most cases, however, these systems and decisions are neither regulated nor certified.…

Machine Learning · Computer Science 2022-09-09 Kacper Sokol, Alexander Hepburn, Rafael Poyiadzi, Matthew Clifford, Raul Santos-Rodriguez, Peter Flach

NIMFA: A Python Library for Nonnegative Matrix Factorization

NIMFA is an open-source Python library that provides a unified interface to nonnegative matrix factorization algorithms. It includes implementations of state-of-the-art factorization methods, initialization approaches, and quality scoring.…

Machine Learning · Computer Science 2018-08-07 Marinka Zitnik, Blaz Zupan

MatchPy: A Pattern Matching Library

Pattern matching is a powerful tool for symbolic computations, based on the well-defined theory of term rewriting systems. Application domains include algebraic expressions, abstract syntax trees, and XML and JSON data. Unfortunately, no…

Programming Languages · Computer Science 2017-10-20 Manuel Krebber, Henrik Barthels, Paolo Bientinesi

The Kernel Trick for Nonlinear Factor Modeling

Factor modeling is a powerful statistical technique that permits to capture the common dynamics in a large panel of data with a few latent variables, or factors, thus alleviating the curse of dimensionality. Despite its popularity and…

Econometrics · Economics 2021-03-03 Varlam Kutateladze

Augmenting data-driven models for energy systems through feature engineering: A Python framework for feature engineering

Data-driven modeling is an approach in energy systems modeling that has been gaining popularity. In data-driven modeling, machine learning methods such as linear regression, neural networks or decision-tree based methods are being applied.…

Machine Learning · Computer Science 2023-01-05 Sandra Wilfling

Efficient Pattern Matching in Python

Pattern matching is a powerful tool for symbolic computations. Applications include term rewriting systems, as well as the manipulation of symbolic expressions, abstract syntax trees, and XML and JSON data. It also allows for an intuitive…

Programming Languages · Computer Science 2017-10-09 Manuel Krebber, Henrik Barthels, Paolo Bientinesi

FDB: A Query Engine for Factorised Relational Databases

Factorised databases are relational databases that use compact factorised representations at the physical layer to reduce data redundancy and boost query performance. This paper introduces FDB, an in-memory query engine for…

Databases · Computer Science 2012-03-14 Nurzhan Bakibayev, Dan Olteanu, Jakub Závodný

Functional Programming Paradigm of Python for Scientific Computation Pipeline Integration

The advent of modern data processing has led to an increasing tendency towards interdisciplinarity, which frequently involves the importation of different technical approaches. Consequently, there is an urgent need for a unified data…

Machine Learning · Computer Science 2024-06-04 Chen Zhang, Lecheng Jia, Wei Zhang, Ning Wen

Factor Machine: Mixed-signal Architecture for Fine-Grained Graph-Based Computing

This paper proposes the design and implementation strategy of a novel computing architecture, the Factor Machine. The work is a step towards a general-purpose parallel system operating in a non-sequential manner, exploiting…

Hardware Architecture · Computer Science 2024-02-21 Piotr Dudek

QuantFactor REINFORCE: Mining Steady Formulaic Alpha Factors with Variance-bounded REINFORCE

Alpha factor mining aims to discover investment signals from the historical financial market data, which can be used to predict asset returns and gain excess profits. Powerful deep learning methods for alpha factor mining lack…

Computational Finance · Quantitative Finance 2025-06-18 Junjie Zhao, Chengxi Zhang, Min Qin, Peng Yang

Factor Analysis in Fault Diagnostics Using Random Forest

Factor analysis or sometimes referred to as variable analysis has been extensively used in classification problems for identifying specific factors that are significant to particular classes. This type of analysis has been widely used in…

Machine Learning · Computer Science 2019-05-01 Nagdev Amruthnath, Tarun Gupta

DataSist: A Python-based library for easy data analysis, visualization and modeling

A large amount of data is produced every second from modern information systems such as mobile devices, the world wide web, Internet of Things, social media, etc. Analysis and mining of this massive data requires a lot of advanced tools and…

Machine Learning · Computer Science 2020-01-13 Rising Odegua, Festus Ikpotokin

Deepchecks: A Library for Testing and Validating Machine Learning Models and Data

This paper presents Deepchecks, a Python library for comprehensively validating machine learning models and data. Our goal is to provide an easy-to-use library comprising of many checks related to various types of issues, such as model…

Machine Learning · Computer Science 2022-03-17 Shir Chorev, Philip Tannor, Dan Ben Israel, Noam Bressler, Itay Gabbay, Nir Hutnik, Jonatan Liberman, Matan Perlmutter, Yurii Romanyshyn, Lior Rokach

A Python library for efficient computation of molecular fingerprints

Machine learning solutions are very popular in the field of chemoinformatics, where they have numerous applications, such as novel drug discovery or molecular property prediction. Molecular fingerprints are algorithms commonly used for…

Quantitative Methods · Quantitative Biology 2024-04-01 Michał Szafarczyk, Piotr Ludynia, Przemysław Kukla