Escaping the Gravitational Pull of Softmax
Authors: Jincheng Mei, Chenjun Xiao, Bo Dai, Lihong Li, Csaba Szepesvari, Dale Schuurmans
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition to proving bounds on convergence rates to firmly establish these results, we also provide experimental evidence for the superiority of the escort transformation. ... We conduct several experiments to verify the effectiveness of the proposed escort transform in policy gradient and cross entropy minimization. |
| Researcher Affiliation | Collaboration | ¹University of Alberta, ²DeepMind, ³Amazon, ⁴Google Research, Brain Team |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Next, we do experiments on MNIST dataset. |
| Dataset Splits | Yes | The dataset is split into training set with 55000, validation set with 5000, and testing set with 10000 data points. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions implementing models using a 'one hidden layer ReLU neural network' and 'mini-batch stochastic gradient descent', but it does not specify any software names with version numbers (e.g., PyTorch, TensorFlow, Python version) that would be needed for reproduction. |
| Experiment Setup | Yes | Full gradient SPG updates with stepsize η = 0.4. ... with learning rate η_t = ‖θ_t‖_p² / (4(3 + c₁²)). ... We use one hidden layer neural network with 512 hidden nodes and ReLU activation to parameterize θ. ... We use mini-batch stochastic gradient descent in this experiment. |
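
The escort transform referenced in the research-type and experiment-setup rows replaces the softmax parameterization π_θ(a) = exp(θ(a)) / Σ_a' exp(θ(a')) with π_θ(a) = |θ(a)|^p / Σ_a' |θ(a')|^p. Since no code is released, the sketch below is only a minimal numpy illustration of that parameterization with plain true-gradient ascent on a toy bandit; the reward values, p = 2, the step count, and the reuse of the quoted stepsize η = 0.4 are assumptions made for illustration, not the authors' exact update or configuration.

```python
# Minimal numpy sketch of the escort transform and a true-gradient bandit update.
# Reward vector, p = 2, step count, and eta = 0.4 are illustrative assumptions.
import numpy as np

def escort(theta, p=2.0):
    """Escort transform: pi(a) = |theta(a)|^p / sum_a' |theta(a')|^p."""
    w = np.abs(theta) ** p
    return w / w.sum()

def escort_gradient_step(theta, r, eta=0.4, p=2.0):
    """One gradient-ascent step on the expected reward J(theta) = pi_theta^T r."""
    pi = escort(theta, p)
    Z = (np.abs(theta) ** p).sum()
    # dJ/dtheta(a) = (p / Z) * sign(theta(a)) * |theta(a)|^(p-1) * (r(a) - pi^T r)
    grad = (p / Z) * np.sign(theta) * np.abs(theta) ** (p - 1) * (r - pi @ r)
    return theta + eta * grad

rng = np.random.default_rng(0)
r = np.array([1.0, 0.8, 0.1])          # toy 3-armed bandit rewards (made up)
theta = rng.normal(size=3)
for _ in range(2000):
    theta = escort_gradient_step(theta, r)
print(escort(theta))                    # probability mass should favor the best arm
```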
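
The MNIST rows quote a 55000/5000/10000 split, a one hidden layer network with 512 ReLU units, and mini-batch SGD, but no framework, learning rate, or batch size. The following PyTorch sketch is one plausible reconstruction of a cross-entropy minimization run where the output scores are normalized by |·|^p (the escort transform) instead of softmax; the learning rate, batch size, epoch count, and p = 2 are assumptions, not values from the paper.

```python
# Hedged PyTorch reconstruction of the MNIST cross-entropy experiment described
# in the table. Framework, learning rate, batch size, epochs, and p = 2 are
# assumptions; only the split sizes and the 512-unit ReLU hidden layer are quoted.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

def escort_cross_entropy(logits, targets, p=2.0, eps=1e-12):
    """Cross entropy under the escort transform: pi = |theta|^p / sum_a |theta(a)|^p."""
    weights = logits.abs().pow(p)
    probs = weights / weights.sum(dim=1, keepdim=True).clamp_min(eps)
    picked = probs[torch.arange(len(targets)), targets]
    return -picked.clamp_min(eps).log().mean()

train_full = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
test_set = datasets.MNIST("data", train=False, download=True,
                          transform=transforms.ToTensor())
train_set, val_set = random_split(train_full, [55000, 5000])  # split quoted in the table

# One hidden layer with 512 ReLU units parameterizing theta (the class scores).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 512), nn.ReLU(), nn.Linear(512, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)        # learning rate assumed
loader = DataLoader(train_set, batch_size=128, shuffle=True)   # batch size assumed

for epoch in range(5):                                         # epoch count assumed
    for x, y in loader:
        optimizer.zero_grad()
        escort_cross_entropy(model(x), y).backward()
        optimizer.step()
```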