Generalization Bounds in the Presence of Outliers: a Median-of-Means Study
Authors: Pierre Laforgue, Guillaume Staerman, Stephan Clémençon
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude with an empirical evaluation of Theorem 2. In metric learning, one is interested in learning a distance... we consider... the iris dataset... The whole dataset is first normalized and divided into a train set of size 80% and a test set of size 20%. Then, the training data is contaminated with 10% of outliers... Standard and MoU Gradient Descents are run... The descent trajectories on the test data, averaged over 100 runs, are plotted in Figure 7c. MoU-GD remarkably resists the presence of outliers, and shows test performance comparable to the sane GD. (A hedged data-preparation sketch for this protocol is given after the table.) |
| Researcher Affiliation | Academia | ¹Università degli Studi di Milano, Italy; ²LTCI, Télécom Paris, Institut Polytechnique de Paris, France. Correspondence to: Pierre Laforgue <pierre.laforgue@unimi.it>. |
| Pseudocode | Yes | Algorithm 1 MoU Gradient Descent (MoU-GD). Input: S_n, K, T ∈ ℕ, (γ_t)_{t ≤ T} ∈ ℝ_+^T, u_0 ∈ ℝ^p. For epoch t = 1 to T: (1) randomly partition the data: choose a random permutation π of {1, ..., n} and build a partition B_1, ..., B_K of {π(1), ..., π(n)}; (2) select the block with median risk: compute the block U-statistics Û_{B_k} from the pairwise losses ℓ(g_{u_t}, Z_i, Z_j), (i, j) ∈ B_k², i < j, and set B_med s.t. Û_{B_med} = median(Û_{B_1}, ..., Û_{B_K}); (3) gradient step: u_{t+1} = u_t - γ_t Σ_{(i,j) ∈ B_med², i < j} ∇_{u_t} ℓ(g_{u_t}, Z_i, Z_j). (A Python sketch of this loop follows the table.) |
| Open Source Code | No | No explicit statement providing access to the source code for the methodology described in this paper was found. No repository link or code release statement was provided. |
| Open Datasets | Yes | We consider... the iris dataset, that gathers 4 attributes (sepal length, sepal width, petal length, and petal width) of 150 flowers issued from 3 different types of irises. |
| Dataset Splits | No | The paper mentions 'divided into a train set of size 80% and a test set of size 20%,' but does not specify a separate validation split or explicit cross-validation setup for hyperparameter tuning. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or cloud instance types) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names with versions) were mentioned in the paper, beyond general algorithm names. |
| Experiment Setup | No | The paper states that 'Standard and MoU Gradient Descents are run (with a projection step on S_+^q(ℝ), and K chosen according to the harmonic upper bound),' and 'averaged over 100 runs.' However, it does not provide specific details on hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings, which are crucial for reproducing the experimental setup. |
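
The Pseudocode row above is a verbatim extraction of Algorithm 1. The following is a minimal NumPy sketch of that loop, not the authors' code: `pairwise_loss`, `pairwise_grad`, and the per-pair normalization of the gradient are assumptions, and the metric-learning specifics from the paper (the loss on label pairs and the projection step onto S_+^q(ℝ) after each update) are omitted.

```python
import numpy as np

def mou_gd(Z, pairwise_loss, pairwise_grad, u0, K, T, step_sizes, seed=None):
    """Sketch of MoU Gradient Descent (Algorithm 1) for a pairwise (U-statistic) risk.

    Z             : (n, d) array of training points.
    pairwise_loss : callable (u, z_i, z_j) -> float.
    pairwise_grad : callable (u, z_i, z_j) -> gradient array shaped like u.
    u0            : initial parameter vector.
    K             : number of blocks.
    T             : number of epochs.
    step_sizes    : sequence of T learning rates gamma_t.
    """
    rng = np.random.default_rng(seed)
    n = len(Z)
    u = np.array(u0, dtype=float)
    for t in range(T):
        # (1) Randomly partition the indices into K blocks.
        perm = rng.permutation(n)
        blocks = np.array_split(perm, K)
        # (2) Compute each block's pairwise risk and keep the block with median risk.
        risks = []
        for B in blocks:
            pair_losses = [pairwise_loss(u, Z[i], Z[j])
                           for a, i in enumerate(B) for j in B[a + 1:]]
            risks.append(np.mean(pair_losses))
        med = int(np.argsort(risks)[len(risks) // 2])  # upper median if K is even
        B_med = blocks[med]
        # (3) Gradient step using only the pairs of the median block.
        grad = np.zeros_like(u)
        n_pairs = 0
        for a, i in enumerate(B_med):
            for j in B_med[a + 1:]:
                grad += pairwise_grad(u, Z[i], Z[j])
                n_pairs += 1
        # Averaging over the block's pairs is a choice made here; the paper's
        # experiment additionally projects the iterate onto the PSD cone.
        u = u - step_sizes[t] * grad / max(n_pairs, 1)
    return u
```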
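The Research Type and Experiment Setup rows describe the iris protocol only at a high level. The sketch below shows one way to reproduce the data preparation under stated assumptions: the random seed, the use of standardization for "normalized", and the contamination mechanism (uniform noise far outside the data range) are guesses not specified in the paper.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)  # assumed seed

# Iris: 150 flowers, 4 attributes, 3 classes (as quoted in the "Open Datasets" row).
X, y = load_iris(return_X_y=True)

# "The whole dataset is first normalized": standardization is assumed here.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# 80% train / 20% test split, as stated in the paper.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Contaminate 10% of the training points. The paper does not describe the
# contamination mechanism; uniform noise outside the normalized range is a guess.
n_out = int(0.1 * len(X_tr))
out_idx = rng.choice(len(X_tr), size=n_out, replace=False)
X_tr[out_idx] = rng.uniform(-10.0, 10.0, size=(n_out, X_tr.shape[1]))

# mou_gd(...) from the sketch above could then be run on pairs built from
# (X_tr, y_tr), alongside a standard gradient descent baseline, and the test
# risk trajectories averaged over 100 runs as in Figure 7c.
```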