Importance Sampling Tree for Large-scale Empirical Expectation

Authors: Olivier Canévet, Cijo Jose, François Fleuret

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4. Experiments and results
Researcher Affiliation | Academia | Idiap Research Institute, Martigny, Switzerland; École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Pseudocode | No | The paper describes steps for its methods but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block, nor does it format procedures like code.
Open Source Code | No | The paper mentions an external repository for a network design ('https://github.com/nagadomi/kaggle-cifar10-torch7') but does not state that its own source code for the proposed Importance Sampling Tree (IST) methodology is available.
Open Datasets | Yes | "Our experiments replicate the training of a network designed for a Kaggle competition on the CIFAR10 dataset (Krizhevsky & Hinton, 2009)" and "We applied this IST method to the Gaussian kernel SVM trained on the Covertype data-set (Bache & Lichman)."
Dataset Splits | Yes | For all variants, we also sample 1,000 samples uniformly initially as a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions 'Torch7' in a footnote related to a CNN implementation but does not provide specific version numbers for it or any other software dependencies crucial for replication.
Experiment Setup | Yes | We train a neural network with two units as input standing for the coordinate in the [0, 1]^2 domain, two fully connected hidden layers with 40 units each, and one output unit. The transfer function is the hyperbolic tangent, and the weights are initialized layer after layer so that the response of every unit before non-linearity is centered, of standard deviation 0.5. We use the quadratic loss for training, and a pure stochastic gradient descent, one sample at a time. Every 1,000 gradient steps, we compute a validation loss and adapt the step size.
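The experiment-setup quote in the last row is concrete enough to restate as code. The sketch below is not the authors' implementation (the paper's CNN experiment uses Torch7); it is a minimal PyTorch rendering under stated assumptions: the target function, the initial step size, the number of gradient steps, the random seed, and the exact per-layer rescaling used to obtain centered pre-activations of standard deviation 0.5 are illustrative choices, not taken from the paper.

import torch
import torch.nn as nn

torch.manual_seed(0)

# 2 -> 40 -> 40 -> 1 network with hyperbolic tangent transfer functions.
model = nn.Sequential(
    nn.Linear(2, 40), nn.Tanh(),
    nn.Linear(40, 40), nn.Tanh(),
    nn.Linear(40, 1),
)

# Layer-after-layer initialization: rescale and center each linear layer so
# that every unit's response before the non-linearity has mean 0 and
# standard deviation 0.5 on uniform inputs from [0, 1]^2 (the exact
# procedure is an assumption; the paper only states the target statistics).
with torch.no_grad():
    h = torch.rand(10000, 2)
    for layer in model:
        if isinstance(layer, nn.Linear):
            layer.bias.zero_()
            pre = layer(h)
            layer.weight.mul_((0.5 / pre.std(dim=0)).unsqueeze(1))
            layer.bias.copy_(-layer(h).mean(dim=0))
        h = layer(h)

criterion = nn.MSELoss()  # quadratic loss

def target(x):
    # Hypothetical 2D target; the actual function is defined in the paper.
    return torch.sin(8 * x[:, :1]) * torch.cos(8 * x[:, 1:])

# 1,000 samples drawn uniformly at the start as a validation set.
x_val = torch.rand(1000, 2)
y_val = target(x_val)

lr = 1e-2  # hypothetical initial step size
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

for step in range(1, 10001):
    # Pure stochastic gradient descent, one sample at a time.
    xi = torch.rand(1, 2)
    loss = criterion(model(xi), target(xi))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 1000 == 0:
        # Every 1,000 gradient steps, compute a validation loss
        # (the paper's setup also adapts the step size at this point).
        with torch.no_grad():
            val_loss = criterion(model(x_val), y_val).item()
        print(step, val_loss)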