Input Similarity from the Neural Network Perspective
Authors: Guillaume Charpiat, Nicolas Girard, Loris Felardos, Yuliya Tarabalka
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | From the abstract: "We study the mathematical properties of this similarity measure, and show how to estimate sample density with it, in low complexity, enabling new types of statistical analysis for neural networks. We also propose to use it during training, to enforce that examples known to be similar should also be seen as similar by the network. We then study the self-denoising phenomenon encountered in regression tasks when training neural networks on datasets with noisy labels. We exhibit a multimodal image registration task where almost perfect accuracy is reached, far beyond label noise variance." A hedged sketch of this gradient-based similarity measure follows the table. |
| Researcher Affiliation | Collaboration | Guillaume Charpiat (1), Nicolas Girard (2), Loris Felardos (1), Yuliya Tarabalka (2,3); (1) TAU team, INRIA Saclay, LRI, Univ. Paris-Sud; (2) TITANE team, INRIA Sophia-Antipolis, Univ. Côte d'Azur; (3) LuxCarta Technology |
| Pseudocode | No | The paper does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/Lydorn/netsimilarity. |
| Open Datasets | Yes | For this, we designed an iterative approach: train, then use the outputs of the network on the training set to re-align it; repeat (for 3 iterations). The results were surprisingly good... training on a dataset [13] with noisy annotations from OSM [18]... (a toy analogue of this iterative loop is sketched after the table). [13] Emmanuel Maggiori, Yuliya Tarabalka, Guillaume Charpiat, and Pierre Alliez. Can semantic labeling methods generalize to any city? The Inria aerial image labeling benchmark. In IGARSS, 2017. [18] OpenStreetMap contributors. Planet dump retrieved from https://planet.osm.org, 2017. |
| Dataset Splits | No | The paper mentions training and testing, referring to a 'training set' and to evaluation on 'manually-aligned data', but it does not specify explicit percentages, sample counts, or a split methodology for training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers, such as programming languages or library versions (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | No | The paper describes a high-level iterative training approach ('repeat (for 3 iterations)') but does not provide specific experimental setup details such as learning rates, batch sizes, optimizers, or other hyperparameters required for reproduction. |
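For reference, the paper's similarity measure compares how a gradient step on the parameters taken for one input would move the network's output for another; in its normalized form this is the cosine between parameter-gradient vectors, k(x_i, x_j) = ⟨∇_θ f_θ(x_i), ∇_θ f_θ(x_j)⟩ / (‖∇_θ f_θ(x_i)‖ ‖∇_θ f_θ(x_j)‖). Below is a minimal PyTorch sketch of that quantity, assuming a network with a single scalar output; the function name `input_similarity` and the toy network are illustrative and are not taken from the released netsimilarity code.

```python
import torch
import torch.nn as nn

def input_similarity(model, x_i, x_j):
    """Cosine similarity between the parameter gradients of the network's
    output at two inputs -- the normalized kernel studied in the paper."""
    def flat_grad(x):
        out = model(x).sum()  # scalar proxy for a single-output network
        grads = torch.autograd.grad(out, list(model.parameters()))
        return torch.cat([g.reshape(-1) for g in grads])

    g_i, g_j = flat_grad(x_i), flat_grad(x_j)
    return torch.dot(g_i, g_j) / (g_i.norm() * g_j.norm())

# Usage on a toy network: self-similarity is 1 by construction.
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
a, b = torch.randn(1, 4), torch.randn(1, 4)
print(input_similarity(net, a, b).item())
print(input_similarity(net, a, a).item())  # 1.0
```

Estimating sample density then amounts to aggregating such kernel values over the training set, which the abstract states can be done in low complexity.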
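The self-denoising claim and the quoted iterative scheme ("train, then use the outputs of the network on the training set to re-align it; repeat (for 3 iterations)") can be illustrated with a self-contained toy: labels are corrupted with Gaussian noise, and after each training round they are overwritten by the network's own predictions. Overwriting labels is only a crude stand-in for the paper's geometric re-alignment of OSM annotations, and all names and hyperparameters below are illustrative, not drawn from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-1, 1, 256).unsqueeze(1)
clean = x ** 2                                   # ground truth (unknown to the net)
labels = clean + 0.2 * torch.randn_like(clean)   # noisy annotations, sigma = 0.2

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for iteration in range(3):                       # "repeat (for 3 iterations)"
    for _ in range(1000):                        # train on the current labels
        opt.zero_grad()
        ((net(x) - labels) ** 2).mean().backward()
        opt.step()
    with torch.no_grad():
        labels = net(x).clone()                  # re-align labels to the outputs
        rmse = (net(x) - clean).pow(2).mean().sqrt().item()
    # RMSE typically lands well below the 0.2 label-noise level,
    # the self-denoising effect the paper analyses.
    print(f"iteration {iteration}: RMSE to clean targets = {rmse:.3f}")
```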