Generalizing Similarity In Noisy Setups: The DIBS Phenomenon
2022 · Nayara Fonseca, Veronica Guidetti
Abstract
This work uncovers an interplay among data density, noise, and generalization ability in similarity learning. We consider Siamese Neural Networks (SNNs), which are the basic form of contrastive learning, and explore two types of noise that can impact SNNs: Pair Label Noise (PLN) and Single Label Noise (SLN). Our investigation reveals that SNNs exhibit double descent behaviour regardless of the training setup and that this behaviour is further exacerbated by noise. We demonstrate that the density of data pairs is crucial for generalization. When SNNs are trained on sparse datasets with the same amount of PLN or SLN, they exhibit comparable generalization properties. However, when using dense datasets, PLN cases generalize worse than SLN ones in the overparametrized region, leading to a phenomenon we call Density-Induced Break of Similarity (DIBS). In this regime, PLN similarity violation becomes macroscopic, corrupting the dataset to the point where complete interpolation cannot be achieved,
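The difference between the two noise types can be made concrete with a small sketch. The abstract does not define them precisely, so the code below rests on assumptions: SLN is taken to corrupt the class label of individual samples before pairs are formed, and PLN is taken to flip the similar/dissimilar label of individual pairs after they are formed. All function names are hypothetical and are not from the paper. The sketch counts triples in which two pairs are labelled similar and the third dissimilar, which no class assignment can produce.

```python
import random
from itertools import combinations


def make_pairs(labels):
    """Build all index pairs (i < j); similarity is 1 if the classes match, else 0."""
    return {(i, j): int(labels[i] == labels[j])
            for i, j in combinations(range(len(labels)), 2)}


def single_label_noise(labels, num_classes, rate, rng):
    """Assumed SLN: with probability `rate`, move a sample to a different class
    BEFORE pairing, so pair labels still come from a class assignment."""
    noisy = []
    for y in labels:
        if rng.random() < rate:
            y = rng.choice([c for c in range(num_classes) if c != y])
        noisy.append(y)
    return noisy


def pair_label_noise(pairs, rate, rng):
    """Assumed PLN: flip each pair's similarity label independently
    with probability `rate`, AFTER pairing."""
    return {p: (1 - s if rng.random() < rate else s) for p, s in pairs.items()}


def count_transitivity_violations(pairs, n):
    """Count triples where exactly two of the three pairs are 'similar'.
    Such a triple contradicts transitivity of 'same class'."""
    bad = 0
    for i, j, k in combinations(range(n), 3):
        if pairs[(i, j)] + pairs[(j, k)] + pairs[(i, k)] == 2:
            bad += 1
    return bad


# Toy dense dataset: every pair among 12 samples in 3 classes.
rng = random.Random(0)
labels = [i % 3 for i in range(12)]

sln_pairs = make_pairs(single_label_noise(labels, 3, 0.2, rng))
pln_pairs = pair_label_noise(make_pairs(labels), 0.2, rng)

# SLN pairs derive from a class assignment, so this count is always 0.
print("SLN violations:", count_transitivity_violations(sln_pairs, len(labels)))
print("PLN violations:", count_transitivity_violations(pln_pairs, len(labels)))
```

Under these assumptions, SLN can never produce an inconsistent triple, whereas PLN can whenever enough pairs share samples. This matches the abstract's point that pair density matters: in a sparse set of pairs few triples are fully observed, so the two noise types are hard to tell apart.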
Related papers
- Adaptive Hierarchical Similarity Metric Learning With Noisy Labels (2021) · 10.74
- Unsupervised Feature Learning Via Non-parametric Instance-level Discrimination (2018) · 25.66
- Learning Deep Optimal Embeddings With Sinkhorn Divergences (2022) · 0.00
- Signal-to-noise Ratio: A Robust Distance Metric For Deep Metric Learning (2019) · 13.60
- Semi-supervised Learning Using Siamese Networks (2021) · 7.50
- Revisiting Training Strategies And Generalization Performance In Deep Metric Learning (2020) · 5.08
- Conditional Similarity Networks (2016) · 15.06
- Efficient Similarity-preserving Unsupervised Learning Using Modular Sparse Distributed Codes And Novelty-contingent Noise (2020) · 0.00