Related papers: Complex Structure Leads to Overfitting: A Structur…

200 papers

While there are many studies on weight regularization, the study on structure regularization is rare. Many existing systems on structured prediction focus on increasing the level of structural dependencies within the model. However, this…

Machine Learning · Computer Science 2015-02-02 Xu Sun

This paper proposes a structure-aware decoding method based on large language models to address the difficulty of traditional approaches in maintaining both semantic integrity and structural consistency in nested and overlapping entity…

Computation and Language · Computer Science 2026-01-29 Zhimin Qiu, Di Wu, Feng Liu, Yuxiao Wang

Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network…

Machine Learning · Computer Science 2020-07-07 Weizhi Li, Gautam Dasarathy, Visar Berisha

Machine learning models suffer from overfitting, which is caused by a lack of labeled data. To tackle this problem, we proposed a framework of regularization methods, called density-fixing, that can be used commonly for supervised and…

Machine Learning · Computer Science 2020-09-08 Masanari Kimura, Ryohei Izawa

Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability. While most existing debiasing methods require full supervision on either spurious attributes or target labels, training a debiased…

Machine Learning · Computer Science 2023-10-10 Geon Yeong Park, Chanyong Jung, Sangmin Lee, Jong Chul Ye, Sang Wan Lee

Complex structures are typical in machine learning. Tailoring learning algorithms for every structure requires an effort that may be saved by defining a generic learning procedure adaptive to any complex structure. In this paper, we propose…

Machine Learning · Computer Science 2019-05-28 Pablo Strasser, Stephane Armand, Stephane Marchand-Maillet, Alexandros Kalousis

Over-parameterized neural network models often lead to significant performance discrepancies between training and test sets, a phenomenon known as overfitting. To address this, researchers have proposed numerous regularization techniques…

Machine Learning · Computer Science 2025-01-27 RuiZhe Jiang, Haotian Lei

Deep neural networks are learning models with a very high capacity and therefore prone to over-fitting. Many regularization techniques such as Dropout, DropConnect, and weight decay all attempt to solve the problem of over-fitting by…

Machine Learning · Computer Science 2016-12-06 Armen Aghajanyan

Model regularization requires extensive manual tuning to balance complexity against overfitting. Cross-regularization resolves this tradeoff by directly adapting regularization parameters through validation gradients during training. The…

Machine Learning · Computer Science 2025-06-25 Carlos Stein Brito

This paper presents a new supervised representation learning framework, namely structured probabilistic coding (SPC), to learn compact and informative representations from input related to the target task. SPC is an encoder-only…

Computation and Language · Computer Science 2024-05-03 Dou Hu, Lingwei Wei, Yaxin Liu, Wei Zhou, Songlin Hu

In this work, we observe a counterintuitive phenomenon in self-supervised learning (SSL): longer training may impair the performance of dense prediction tasks (e.g., semantic segmentation). We refer to this phenomenon as Self-supervised…

Computer Vision and Pattern Recognition · Computer Science 2025-10-21 Siran Dai, Qianqian Xu, Peisong Wen, Yang Liu, Qingming Huang

One major challenge in training Deep Neural Networks is preventing overfitting. Many techniques such as data augmentation and novel regularizers such as Dropout have been proposed to prevent overfitting without requiring a massive amount of…

Machine Learning · Computer Science 2016-06-13 Michael Cogswell, Faruk Ahmed, Ross Girshick, Larry Zitnick, Dhruv Batra

Transformer-based sequence-to-sequence architectures, while achieving state-of-the-art results on a large number of NLP tasks, can still suffer from overfitting during training. In practice, this is usually countered either by applying…

Computation and Language · Computer Science 2022-01-04 Dušan Variš, Ondřej Bojar

Large language models (LLMs) achieve strong performance by generating long chains of thought, but longer traces always introduce redundant or ineffective reasoning steps. One typical behavior is that they often perform unnecessary…

Computation and Language · Computer Science 2026-01-13 Jinyi Han, Zixiang Di, Zishang Jiang, Ying Liao, Jiaqing Liang, Yongqi Wang, Yanghua Xiao

Overfitting is one of the most common problems when training deep neural networks on comparatively small datasets. Here, we demonstrate that neural network activation sparsity is a reliable indicator for overfitting which we utilize to…

Machine Learning · Computer Science 2020-02-24 Karim Huesmann, Soeren Klemm, Lars Linsen, Benjamin Risse

Several image processing tasks, such as image classification and object detection, have been significantly improved using Convolutional Neural Networks (CNN). Like ResNet and EfficientNet, many architectures have achieved outstanding…

Computer Vision and Pattern Recognition · Computer Science 2022-01-11 Claudio Filipi Gonçalves dos Santos, João Paulo Papa

A deep neural network model is a powerful framework for learning representations. Usually, it is used to learn the relation $x \to y$ by exploiting the regularities in the input $x$. In structured output prediction problems, $y$ is…

Machine Learning · Computer Science 2017-10-31 Soufiane Belharbi, Romain Hérault, Clément Chatelain, Sébastien Adam

We study the problem of recognizing structured text, i.e. text that follows certain formats, and propose to improve the recognition accuracy of structured text by specifying regular expressions (regexes) for biasing. A biased recognizer…

Computer Vision and Pattern Recognition · Computer Science 2021-11-15 Baoguang Shi, Wenfeng Cheng, Yijuan Lu, Cha Zhang, Dinei Florencio

Sequence-to-Sequence (S2S) models have achieved remarkable success on various text generation tasks. However, learning complex structures with S2S models remains challenging as external neural modules and additional lexicons are often…

Computation and Language · Computer Science 2023-02-07 Han He, Jinho D. Choi

This paper investigates a new learning formulation called structured sparsity, which is a natural extension of the standard sparsity concept in statistical learning and compressive sensing. By allowing arbitrary structures on the feature…

Methodology · Statistics 2009-05-05 Junzhou Huang, Tong Zhang, Dimitris Metaxas