Input Similarity from the Neural Network Perspective

Authors: Guillaume Charpiat, Nicolas Girard, Loris Felardos, Yuliya Tarabalka

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We study the mathematical properties of this similarity measure, and show how to estimate sample density with it, in low complexity, enabling new types of statistical analysis for neural networks. We also propose to use it during training, to enforce that examples known to be similar should also be seen as similar by the network. We then study the self-denoising phenomenon encountered in regression tasks when training neural networks on datasets with noisy labels. We exhibit a multimodal image registration task where almost perfect accuracy is reached, far beyond label noise variance."
Researcher Affiliation | Collaboration | Guillaume Charpiat (1), Nicolas Girard (2), Loris Felardos (1), Yuliya Tarabalka (2,3). (1) TAU team, INRIA Saclay, LRI, Univ. Paris-Sud; (2) TITANE team, INRIA Sophia-Antipolis, Univ. Côte d'Azur; (3) LuxCarta Technology
Pseudocode | No | The paper does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/Lydorn/netsimilarity.
Open Datasets | Yes | "For this, we designed an iterative approach: train, then use the outputs of the network on the training set to re-align it; repeat (for 3 iterations). The results were surprisingly good..." The experiments train on a dataset [13] with noisy annotations from OSM [18]. [13] Emmanuel Maggiori, Yuliya Tarabalka, Guillaume Charpiat, and Pierre Alliez. Can semantic labeling methods generalize to any city? The Inria aerial image labeling benchmark. In IGARSS, 2017. [18] OpenStreetMap contributors. Planet dump retrieved from https://planet.osm.org, 2017.
Dataset Splits | No | The paper mentions training and testing but does not give explicit percentages or sample counts for training, validation, or test splits. It refers to using the "training set" and evaluating on "manually-aligned data" without detailing split ratios or methodology.
Hardware Specification | No | The paper does not report the hardware used for the experiments, such as GPU/CPU models or memory specifications.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | No | The paper describes a high-level iterative training procedure ("repeat (for 3 iterations)") but does not provide experimental setup details such as learning rates, batch sizes, optimizers, or other hyperparameters required for reproduction.
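For context on the quantity being assessed: the paper's input similarity between two examples is defined through the inner product of the network output's gradients with respect to the parameters (normalized to a cosine). A minimal sketch of that idea, using a toy scalar-output MLP with finite-difference gradients in place of a real autograd framework (the model, sizes, and function names here are illustrative, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy parameters of a 3-input, 4-hidden-unit, scalar-output MLP,
# stored as one flat vector theta so we can differentiate w.r.t. it.
W1 = rng.normal(size=(4, 3))
b1 = rng.normal(size=4)
w2 = rng.normal(size=4)
theta0 = np.concatenate([W1.ravel(), b1, w2])  # 20 parameters

def f(theta, x):
    """Network output f_theta(x) for a flat parameter vector theta."""
    W1 = theta[:12].reshape(4, 3)
    b1 = theta[12:16]
    w2 = theta[16:20]
    return w2 @ np.tanh(W1 @ x + b1)

def grad_f(x, eps=1e-6):
    """Central finite differences of f(theta, x) w.r.t. theta."""
    g = np.zeros_like(theta0)
    for i in range(theta0.size):
        e = np.zeros_like(theta0)
        e[i] = eps
        g[i] = (f(theta0 + e, x) - f(theta0 - e, x)) / (2 * eps)
    return g

def similarity(x1, x2):
    """Cosine of parameter-gradients: how much a training step on x1
    would move the output at x2, relative to the step sizes."""
    g1, g2 = grad_f(x1), grad_f(x2)
    return g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2))

x = rng.normal(size=3)
print(similarity(x, x))                    # 1.0 up to rounding
print(similarity(x, rng.normal(size=3)))   # some value in [-1, 1]
```

An input is thus "similar" to another, from the network's perspective, when a gradient step taken on one would move the output on the other in the same direction; a real implementation would compute the gradients with an autograd framework rather than finite differences.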