Related papers: Complex Structure Leads to Overfitting: A Structur…

Structure Regularization for Structured Prediction: Theories and Experiments

While there are many studies on weight regularization, the study on structure regularization is rare. Many existing systems on structured prediction focus on increasing the level of structural dependencies within the model. However, this…

Machine Learning · Computer Science 2015-02-02 Xu Sun

Structure-Aware Decoding Mechanisms for Complex Entity Extraction with Large-Scale Language Models

This paper proposes a structure-aware decoding method based on large language models to address the difficulty of traditional approaches in maintaining both semantic integrity and structural consistency in nested and overlapping entity…

Computation and Language · Computer Science 2026-01-29 Zhimin Qiu, Di Wu, Feng Liu, Yuxiao Wang

Regularization via Structural Label Smoothing

Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network…

Machine Learning · Computer Science 2020-07-07 Weizhi Li, Gautam Dasarathy, Visar Berisha

Density Fixing: Simple yet Effective Regularization Method based on the Class Prior

Machine learning models suffer from overfitting, which is caused by a lack of labeled data. To tackle this problem, we proposed a framework of regularization methods, called density-fixing, that can be used commonly for supervised and…

Machine Learning · Computer Science 2020-09-08 Masanari Kimura, Ryohei Izawa

Self-supervised debiasing using low rank regularization

Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability. While most existing debiasing methods require full supervision on either spurious attributes or target labels, training a debiased…

Machine Learning · Computer Science 2023-10-10 Geon Yeong Park, Chanyong Jung, Sangmin Lee, Jong Chul Ye, Sang Wan Lee

Learning by stochastic serializations

Complex structures are typical in machine learning. Tailoring learning algorithms for every structure requires an effort that may be saved by defining a generic learning procedure adaptive to any complex structure. In this paper, we propose…

Machine Learning · Computer Science 2019-05-28 Pablo Strasser, Stephane Armand, Stephane Marchand-Maillet, Alexandros Kalousis

ConsistentFeature: A Plug-and-Play Component for Neural Network Regularization

Over-parameterized neural network models often lead to significant performance discrepancies between training and test sets, a phenomenon known as overfitting. To address this, researchers have proposed numerous regularization techniques…

Machine Learning · Computer Science 2025-01-27 RuiZhe Jiang, Haotian Lei

SoftTarget Regularization: An Effective Technique to Reduce Over-Fitting in Neural Networks

Deep neural networks are learning models with a very high capacity and therefore prone to over-fitting. Many regularization techniques such as Dropout, DropConnect, and weight decay all attempt to solve the problem of over-fitting by…

Machine Learning · Computer Science 2016-12-06 Armen Aghajanyan

Cross-regularization: Adaptive Model Complexity through Validation Gradients

Model regularization requires extensive manual tuning to balance complexity against overfitting. Cross-regularization resolves this tradeoff by directly adapting regularization parameters through validation gradients during training. The…

Machine Learning · Computer Science 2025-06-25 Carlos Stein Brito

Structured Probabilistic Coding

This paper presents a new supervised representation learning framework, namely structured probabilistic coding (SPC), to learn compact and informative representations from input related to the target task. SPC is an encoder-only…

Computation and Language · Computer Science 2024-05-03 Dou Hu, Lingwei Wei, Yaxin Liu, Wei Zhou, Songlin Hu

Exploring Structural Degradation in Dense Representations for Self-supervised Learning

In this work, we observe a counterintuitive phenomenon in self-supervised learning (SSL): longer training may impair the performance of dense prediction tasks (e.g., semantic segmentation). We refer to this phenomenon as Self-supervised…

Computer Vision and Pattern Recognition · Computer Science 2025-10-21 Siran Dai, Qianqian Xu, Peisong Wen, Yang Liu, Qingming Huang

Reducing Overfitting in Deep Networks by Decorrelating Representations

One major challenge in training Deep Neural Networks is preventing overfitting. Many techniques such as data augmentation and novel regularizers such as Dropout have been proposed to prevent overfitting without requiring a massive amount of…

Machine Learning · Computer Science 2016-06-13 Michael Cogswell, Faruk Ahmed, Ross Girshick, Larry Zitnick, Dhruv Batra

Sequence Length is a Domain: Length-based Overfitting in Transformer Models

Transformer-based sequence-to-sequence architectures, while achieving state-of-the-art results on a large number of NLP tasks, can still suffer from overfitting during training. In practice, this is usually countered either by applying…

Computation and Language · Computer Science 2022-01-04 Dušan Variš, Ondřej Bojar

Structured Reasoning for Large Language Models

Large language models (LLMs) achieve strong performance by generating long chains of thought, but longer traces often introduce redundant or ineffective reasoning steps. One typical behavior is that they often perform unnecessary…

Computation and Language · Computer Science 2026-01-13 Jinyi Han, Zixiang Di, Zishang Jiang, Ying Liao, Jiaqing Liang, Yongqi Wang, Yanghua Xiao

Exploiting the Full Capacity of Deep Neural Networks while Avoiding Overfitting by Targeted Sparsity Regularization

Overfitting is one of the most common problems when training deep neural networks on comparatively small datasets. Here, we demonstrate that neural network activation sparsity is a reliable indicator for overfitting which we utilize to…

Machine Learning · Computer Science 2020-02-24 Karim Huesmann, Soeren Klemm, Lars Linsen, Benjamin Risse

Avoiding Overfitting: A Survey on Regularization Methods for Convolutional Neural Networks

Several image processing tasks, such as image classification and object detection, have been significantly improved using Convolutional Neural Networks (CNN). Like ResNet and EfficientNet, many architectures have achieved outstanding…

Computer Vision and Pattern Recognition · Computer Science 2022-01-11 Claudio Filipi Gonçalves dos Santos, João Paulo Papa

Deep Neural Networks Regularization for Structured Output Prediction

A deep neural network model is a powerful framework for learning representations. Usually, it is used to learn the relation $x \to y$ by exploiting the regularities in the input $x$. In structured output prediction problems, $y$ is…

Machine Learning · Computer Science 2017-10-31 Soufiane Belharbi, Romain Hérault, Clément Chatelain, Sébastien Adam

Improving Structured Text Recognition with Regular Expression Biasing

We study the problem of recognizing structured text, i.e. text that follows certain formats, and propose to improve the recognition accuracy of structured text by specifying regular expressions (regexes) for biasing. A biased recognizer…

Computer Vision and Pattern Recognition · Computer Science 2021-11-15 Baoguang Shi, Wenfeng Cheng, Yijuan Lu, Cha Zhang, Dinei Florencio

Unleashing the True Potential of Sequence-to-Sequence Models for Sequence Tagging and Structure Parsing

Sequence-to-Sequence (S2S) models have achieved remarkable success on various text generation tasks. However, learning complex structures with S2S models remains challenging as external neural modules and additional lexicons are often…

Computation and Language · Computer Science 2023-02-07 Han He, Jinho D. Choi

Learning with Structured Sparsity

This paper investigates a new learning formulation called structured sparsity, which is a natural extension of the standard sparsity concept in statistical learning and compressive sensing. By allowing arbitrary structures on the feature…

Methodology · Statistics 2009-05-05 Junzhou Huang, Tong Zhang, Dimitris Metaxas