Related papers: SDGMNet: Statistic-based Dynamic Gradient Modulati…

Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimising Global Loss Functions

Recent innovations in training deep convolutional neural network (ConvNet) models have motivated the design of new methods to automatically learn local image descriptors. The latest deep ConvNets proposed for this task consist of a siamese…

Computer Vision and Pattern Recognition · Computer Science 2016-08-02 Vijay Kumar B G, Gustavo Carneiro, Ian Reid

HyNet: Learning Local Descriptor with Hybrid Similarity Measure and Triplet Loss

Recent works show that local descriptor learning benefits from the use of L2 normalisation; however, an in-depth analysis of this effect is lacking in the literature. In this paper, we investigate how L2 normalisation affects the back-propagated…

Computer Vision and Pattern Recognition · Computer Science 2020-11-10 Yurun Tian, Axel Barroso-Laguna, Tony Ng, Vassileios Balntas, Krystian Mikolajczyk

Metric Learning with Adaptive Density Discrimination

Distance metric learning (DML) approaches learn a transformation to a representation space where distance is in correspondence with a predefined notion of similarity. While such models offer a number of compelling benefits, it has been…

Machine Learning · Statistics 2016-03-03 Oren Rippel, Manohar Paluri, Piotr Dollar, Lubomir Bourdev

Analysis of Natural Gradient Descent for Multilayer Neural Networks

Natural gradient descent is a principled method for adapting the parameters of a statistical model on-line using an underlying Riemannian parameter space to redefine the direction of steepest descent. The algorithm is examined via methods…

Disordered Systems and Neural Networks · Physics 2009-10-31 Magnus Rattray, David Saad

Learning Spread-out Local Feature Descriptors

We propose a simple, yet powerful regularization technique that can be used to significantly improve both the pairwise and triplet losses in learning local feature descriptors. The idea is that in order to fully utilize the expressive power…

Computer Vision and Pattern Recognition · Computer Science 2017-08-22 Xu Zhang, Felix X. Yu, Sanjiv Kumar, Shih-Fu Chang

HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning

Local feature extraction remains an active research area due to the advances in fields such as SLAM, 3D reconstructions, or AR applications. The success in these applications relies on the performance of the feature detector and descriptor.…

Computer Vision and Pattern Recognition · Computer Science 2020-11-30 Axel Barroso-Laguna, Yannick Verdie, Benjamin Busam, Krystian Mikolajczyk

Neighborhood Watch: Representation Learning with Local-Margin Triplet Loss and Sampling Strategy for K-Nearest-Neighbor Image Classification

Deep representation learning using triplet network for classification suffers from a lack of theoretical foundation and difficulty in tuning both the network and classifiers for performance. To address the problem, local-margin triplet loss…

Computer Vision and Pattern Recognition · Computer Science 2019-11-20 Phawis Thammasorn, Daniel Hippe, Wanpracha Chaovalitwongse, Matthew Spraker, Landon Wootton, Matthew Nyflot, Stephanie Combs, Jan Peeken, Eric Ford

Gradient and Variable Tracking with Multiple Local SGD for Decentralized Non-Convex Learning

Stochastic distributed optimization methods that solve an optimization problem over a multi-agent network have played an important role in a variety of large-scale signal processing and machine learning applications. Among the existing…

Optimization and Control · Mathematics 2023-02-06 Songyang Ge, Tsung-Hui Chang

Improved Convergence for Decentralized Stochastic Optimization with Biased Gradients

Decentralized stochastic optimization has emerged as a fundamental paradigm for large-scale machine learning. However, practical implementations often rely on biased gradient estimators arising from communication compression or inexact…

Optimization and Control · Mathematics 2026-04-10 Qing Xu, Yiwei Liao, Wenqi Fan, Xingxing You, Songyi Dian

Gradient Descent based Optimization Algorithms for Deep Learning Models Training

In this paper, we aim at providing an introduction to the gradient descent based optimization algorithms for learning deep neural network models. Deep learning models involving multiple nonlinear projection layers are very challenging to…

Machine Learning · Computer Science 2019-03-12 Jiawei Zhang

TCDesc: Learning Topology Consistent Descriptors

Triplet loss is widely used for learning local descriptors from image patch. However, triplet loss only minimizes the Euclidean distance between matching descriptors and maximizes that between the non-matching descriptors, which neglects…

Computer Vision and Pattern Recognition · Computer Science 2020-06-08 Honghu Pan, Fanyang Meng, Zhenyu He, Yongsheng Liang, Wei Liu

Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension

We consider distributed optimization under communication constraints for training deep learning models. We propose a new algorithm, whose parameter updates rely on two forces: a regular gradient step, and a corrective direction dictated by…

Machine Learning · Computer Science 2022-04-29 Yunfei Teng, Wenbo Gao, Francois Chalus, Anna Choromanska, Donald Goldfarb, Adrian Weller

SA-GD: Improved Gradient Descent Learning Strategy with Simulated Annealing

The gradient descent algorithm is the most widely used method for optimizing machine learning problems. However, there exist many local minima and saddle points in the loss function, especially for high dimensional non-convex optimization…

Machine Learning · Computer Science 2021-07-19 Zhicheng Cai

Reinforced stochastic gradient descent for deep neural network learning

Stochastic gradient descent (SGD) is a standard optimization method to minimize a training error with respect to network parameters in modern neural network learning. However, it typically suffers from proliferation of saddle points in the…

Machine Learning · Computer Science 2017-11-23 Haiping Huang, Taro Toyoizumi

A Dynamic Sampling Adaptive-SGD Method for Machine Learning

We propose a stochastic optimization method for minimizing loss functions, expressed as an expected value, that adaptively controls the batch size used in the computation of gradient approximations and the step size used to move along such…

Machine Learning · Computer Science 2020-03-04 Achraf Bahamou, Donald Goldfarb

RankNEAT: Outperforming Stochastic Gradient Search in Preference Learning Tasks

Stochastic gradient descent (SGD) is a premium optimization method for training neural networks, especially for learning objectively defined labels such as image objects and events. When a neural network is instead faced with subjectively…

Neural and Evolutionary Computing · Computer Science 2022-04-15 Kosmas Pinitas, Konstantinos Makantasis, Antonios Liapis, Georgios N. Yannakakis

Dynamically Sampled Nonlocal Gradients for Stronger Adversarial Attacks

The vulnerability of deep neural networks to small and even imperceptible perturbations has become a central topic in deep learning research. Although several sophisticated defense mechanisms have been introduced, most were later shown to…

Machine Learning · Computer Science 2021-09-28 Leo Schwinn, An Nguyen, René Raab, Dario Zanca, Bjoern Eskofier, Daniel Tenbrinck, Martin Burger

Distributed Stochastic Gradient Descent Using LDGM Codes

We consider a distributed learning problem in which the computation is carried out on a system consisting of a master node and multiple worker nodes. In such systems, the existence of slow-running machines called stragglers will cause a…

Information Theory · Computer Science 2019-01-16 Shunsuke Horii, Takahiro Yoshida, Manabu Kobayashi, Toshiyasu Matsushima

Dissecting Deep Metric Learning Losses for Image-Text Retrieval

Visual-Semantic Embedding (VSE) is a prevalent approach in image-text retrieval by learning a joint embedding space between the image and language modalities where semantic similarities would be preserved. The triplet loss with…

Computer Vision and Pattern Recognition · Computer Science 2022-10-25 Hong Xuan, Xi Chen

Cooperative SGD with Dynamic Mixing Matrices

One of the most common methods to train machine learning algorithms today is the stochastic gradient descent (SGD). In a distributed setting, SGD-based algorithms have been shown to converge theoretically under specific circumstances. A…

Machine Learning · Computer Science 2025-08-22 Soumya Sarkar, Shweta Jain