Related papers: Numeric Encoding Options with Automunge

200 papers

The Automunge open source python library platform for tabular data pre-processing automates feature engineering data transformations of numerical encoding and missing data infill to received tidy data on bases fit to properties of columns…

Machine Learning · Computer Science 2022-02-24 Nicholas J. Teague

Automunge is a tabular preprocessing library that encodes dataframes for supervised learning. When selecting a default feature encoding strategy for gradient boosted learning, one may consider metrics of training duration and achieved…

Machine Learning · Computer Science 2022-10-27 Nicholas J. Teague

Numerical preprocessing remains an important component of tabular deep learning, where the representation of continuous features can strongly affect downstream performance. Although its importance is well established for classical…

Machine Learning · Computer Science 2026-04-08 Manish Kumar, Anton Frederik Thielmann, Christoph Weisser, Benjamin Säfken

Tabular deep-learning methods require embedding numerical and categorical input features into high-dimensional spaces before processing them. Existing methods deal with this heterogeneous nature of tabular data by employing separate…

Machine Learning · Computer Science 2025-02-18 Boshko Koloski, Andrei Margeloiu, Xiangjian Jiang, Blaž Škrlj, Nikola Simidjievski, Mateja Jamnik

Machine Translation models are trained to translate a variety of documents from one language into another. However, models specifically trained for particular characteristics of the documents tend to perform better. Fine-tuning is a…

Computation and Language · Computer Science 2019-10-09 Alberto Poncelas, Gideon Maillette de Buy Wenniger, Andy Way

Symbolic indefinite integration in Computer Algebra Systems such as Maple involves selecting the most effective algorithm from multiple available methods. Not all methods will succeed for a given problem, and when several do, the results,…

Symbolic Computation · Computer Science 2025-08-11 Rashid Barket, Matthew England, Jürgen Gerhard

Computer Algebra Systems (e.g. Maple) are used in research, education, and industrial settings. One of their key functionalities is symbolic integration, where there are many sub-algorithms to choose from that can affect the form of the…

Machine Learning · Computer Science 2024-04-24 Rashid Barket, Matthew England, Jürgen Gerhard

The goal of neuro-symbolic AI is to integrate symbolic and subsymbolic AI approaches, to overcome the limitations of either. Prominent systems include Logic Tensor Networks (LTN) or DeepProbLog, which offer neural predicates and end-to-end…

Artificial Intelligence · Computer Science 2025-06-18 Stephen Roth, Lennart Baur, Derian Boer, Stefan Kramer

Tabular data remain a dominant form of real-world information but pose persistent challenges for deep learning due to heterogeneous feature types, lack of natural structure, and limited label-preserving augmentations. As a result, ensemble…

Machine Learning · Computer Science 2025-09-23 Sivan Sarafian, Yehudit Aperstein

Transformers are increasingly employed for graph data, demonstrating competitive performance in diverse tasks. To incorporate graph information into these models, it is essential to enhance node and edge features with positional encodings.…

Rapid progress in deep learning is leading to a diverse set of quickly changing models, with a dramatically growing demand for compute. However, as frameworks specialize performance optimization to patterns in popular networks, they…

Machine Learning · Computer Science 2022-08-31 Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler

Tabular data learning has extensive applications in deep learning but its existing embedding techniques are limited in numerical and categorical features such as the inability to capture complex relationships and engineering. This paper…

Machine Learning · Computer Science 2024-09-02 Yuqian Wu, Hengyi Luo, Raymond S. T. Lee

Rapid technological advances are inherently linked to the increased amount of data, a substantial portion of which can be interpreted as data stream, capable of exhibiting the phenomenon of concept drift and having a high imbalance ratio.…

Machine Learning · Computer Science 2024-04-25 Paweł Zyblewski

Deep learning has revolutionized many industries by enabling models to automatically learn complex patterns from raw data, reducing dependence on manual feature engineering. However, deep learning algorithms are sensitive to input data, and…

Machine Learning · Computer Science 2025-07-21 Mert Sehri, Zehui Hua, Francisco de Assis Boldt, Patrick Dumond

Many organizations rely on data from government and third-party sources, and those sources rarely follow the same data formatting. This introduces challenges in integrating data from multiple sources or aligning external sources with…

Databases · Computer Science 2023-12-27 Arash Dargahi Nobari, Davood Rafiei

Automated Machine Learning (AutoML) is a promising direction for democratizing AI by automatically deploying Machine Learning systems with minimal human expertise. The core technical challenge behind AutoML is optimizing the pipelines of…

Machine Learning · Computer Science 2023-05-26 Sebastian Pineda Arango, Josif Grabocka

Examining the effect of different encoding techniques on entity and context embeddings, the goal of this work is to challenge commonly used Ordinal encoding for tabular learning. Applying different preprocessing methods and network…

Machine Learning · Computer Science 2024-03-29 Fredy Reusser

Complex image processing and computer vision systems often consist of a processing pipeline of functional modules. We intend to replace parts or all of a target pipeline with deep neural networks to achieve benefits such as increased…

Computer Vision and Pattern Recognition · Computer Science 2019-02-19 Kilho Son, Jesse Hostetler, Sek Chai

Neural topic models can augment or replace bag-of-words inputs with the learned representations of deep pre-trained transformer-based word prediction models. One added benefit when using representations from multilingual models is that they…

Computation and Language · Computer Science 2021-04-13 Aaron Mueller, Mark Dredze

In the past decade, the field of quantum machine learning has drawn significant attention due to the prospect of bringing genuine computational advantages to now widespread algorithmic methods. However, not all domains of machine learning…
