Exchangeability and Kernel Invariance in Trained MLPs

Authors: Russell Tsuchida, Fred Roosta, Marcus Gallagher

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'We illustrate our results with selected figures. Other datasets and optimizers are investigated in the supplemental material. 5.1 Verification of Proposition 6. Architecture. We train an autoencoder with 4 layers and 3072 neurons in each layer on CIFAR10 [Krizhevsky and Hinton, 2009] with pixel values normalized to [0, 1] using an ℓ2 objective. Weights are initialized with a variance of 2/n_l [He et al., 2015]. Method. In Figure 3 we plot the empirical layer-wise normalized kernel in each layer.' (An illustrative architecture sketch follows the table.)
Researcher Affiliation | Academia | Russell Tsuchida (1), Fred Roosta (1,2) and Marcus Gallagher (1); (1) The University of Queensland, (2) International Computer Science Institute
Pseudocode | Yes | Procedure 1: sample a point at angle θ^(l-1) to x. Inputs: datapoint x, angle θ^(l-1). Output: y at angle θ^(l-1) to x. (a) Set the last two coordinates of x to 0. (1) Sample a random vector p orthogonal to x: set all coordinates of p to zero where x is non-zero and sample the remaining coordinates of p from U[0, 1]; set the last two coordinates to 0; normalize p so that ‖x‖ = ‖p‖. (b) Set the second-last coordinate of x to the negative sum of all coordinates of x. (c) Set the last coordinate of p to the negative sum of all coordinates of p. (2) Return y = cos(θ^(l-1)) x + sin(θ^(l-1)) p. (A NumPy sketch of this procedure follows the table.)
Open Source Code | No | The paper states 'Supplemental material available at the arXiv version https://arxiv.org/abs/1810.08351', which points to the paper itself on arXiv, not a code repository. There is no explicit statement about releasing the code for the described methodology.
Open Datasets | Yes | 'We train an autoencoder with 4 layers and 3072 neurons in each layer on CIFAR10 [Krizhevsky and Hinton, 2009]'
Dataset Splits | No | The paper mentions using CIFAR10 but does not specify the training, validation, or test splits (e.g., percentages or sample counts) used for the experiments.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU or GPU models, memory, or specific computing clusters).
Software Dependencies | No | The paper does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or other libraries).
Experiment Setup | Yes | 'Weights are initialized with a variance of 2/n_l [He et al., 2015]. ... First 3 rows: Adam using step size 0.001, β1 = 0.9, β2 = 0.999, ε ∈ {10^-16, 10^-8, 1}. Last row: SGD with constant learning rate 0.5.' (The optimizer settings are sketched in code below the table.)
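
For concreteness, the architecture quoted in the Research Type row (a 4-layer autoencoder with 3072 neurons per layer, trained on flattened CIFAR10 images with an ℓ2 objective and He-initialized weights of variance 2/n_l) might be set up roughly as below. This is a minimal PyTorch sketch of our reading of that description; the ReLU activations, the exact layer arrangement, and all function names are assumptions, not the authors' released code.

```python
import math
import torch
import torch.nn as nn

def make_autoencoder(width=3072, depth=4):
    """Sketch of the described autoencoder: `depth` fully connected
    layers of `width` units on flattened 32x32x3 CIFAR10 images
    (3072 inputs). Activation choice and layout are our guesses."""
    layers = []
    for _ in range(depth):
        linear = nn.Linear(width, width)
        # He-style init: weight variance 2 / n_l, with n_l the fan-in.
        nn.init.normal_(linear.weight, mean=0.0, std=math.sqrt(2.0 / width))
        nn.init.zeros_(linear.bias)
        layers += [linear, nn.ReLU()]
    layers = layers[:-1]  # no nonlinearity on the reconstruction layer
    return nn.Sequential(*layers)

model = make_autoencoder()
loss_fn = nn.MSELoss()  # l2 reconstruction objective
```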
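
The Procedure 1 steps quoted in the Pseudocode row translate almost line for line into NumPy. The sketch below is our interpretation (the function name, and the assumption that x has zero coordinates for p to occupy, are ours); it is not code from the paper.

```python
import numpy as np

def sample_at_angle(x, theta):
    """Return a vector y at angle theta to x, following Procedure 1."""
    x = np.asarray(x, dtype=float).copy()
    # (a) Set the last two coordinates of x to 0.
    x[-2:] = 0.0
    # (1) Sample p orthogonal to x: zero wherever x is non-zero,
    #     U[0, 1] elsewhere, with the last two coordinates set to 0.
    p = np.where(x != 0.0, 0.0, np.random.uniform(0.0, 1.0, size=x.shape))
    p[-2:] = 0.0
    # Normalize p so that ||x|| = ||p|| (assumes p is not all zeros,
    # i.e. x has at least one zero coordinate besides the last two).
    p *= np.linalg.norm(x) / np.linalg.norm(p)
    # (b) Second-last coordinate of x := negative sum of x's coordinates,
    # (c) last coordinate of p := negative sum of p's coordinates,
    # so that both vectors sum to zero. Orthogonality is preserved
    # because p is zero at x's second-last slot and x at p's last slot.
    x[-2] = -np.sum(x)
    p[-1] = -np.sum(p)
    # (2) Combine x and p at angle theta.
    return np.cos(theta) * x + np.sin(theta) * p
```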
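
Similarly, the optimizer grid quoted in the Experiment Setup row maps directly onto standard PyTorch optimizer constructors. The variable names and sweep structure below are our paraphrase, reusing the `model` from the architecture sketch above.

```python
import torch

# First three rows of the referenced figure: Adam with step size 0.001,
# beta1 = 0.9, beta2 = 0.999, and eps swept over {1e-16, 1e-8, 1}.
adam_variants = [
    torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=eps)
    for eps in (1e-16, 1e-8, 1.0)
]

# Last row: plain SGD with a constant learning rate of 0.5.
sgd = torch.optim.SGD(model.parameters(), lr=0.5)
```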