Exchangeability and Kernel Invariance in Trained MLPs
Authors: Russell Tsuchida, Fred Roosta, Marcus Gallagher
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate our results with selected figures. Other datasets and optimizers are investigated in the supplemental material. [Section 5.1, Verification of Proposition 6] Architecture: We train an autoencoder with 4 layers and 3072 neurons in each layer on CIFAR10 [Krizhevsky and Hinton, 2009] with pixel values normalized to [0, 1] using an ℓ2 objective. Weights are initialized with a variance of 2/n_l [He et al., 2015]. Method: In Figure 3 we plot the empirical layer-wise normalized kernel in each layer. (See the training sketch after this table.) |
| Researcher Affiliation | Academia | Russell Tsuchida1 , Fred Roosta1,2 and Marcus Gallagher1 1The University of Queensland 2International Computer Science Institute |
| Pseudocode | Yes | Procedure 1: Sample at angle θ^(l-1). Inputs: datapoint x, angle θ^(l-1). Output: y at angle θ^(l-1) to x. (a) Set the last two coordinates of x to 0. (1) Sample a random vector p orthogonal to x: set all coordinates of p to zero where x is non-zero and sample the remaining coordinates of p from U[0, 1]; set the last two coordinates to 0; normalize p so that ‖x‖ = ‖p‖. (b) Set the second-last coordinate of x to the negative sum of all coordinates of x. (c) Set the last coordinate of p to the negative sum of all coordinates of p. (2) Return y = cos(θ^(l-1)) x + sin(θ^(l-1)) p. (A runnable sketch follows the table.) |
| Open Source Code | No | The paper states 'Supplemental material available at the arXiv version https://arxiv.org/abs/1810.08351.', which points to the paper itself on arXiv, not a code repository. There is no explicit statement about releasing the code for the described methodology. |
| Open Datasets | Yes | We train an autoencoder with 4 layers and 3072 neurons in each layer on CIFAR10 [Krizhevsky and Hinton, 2009] |
| Dataset Splits | No | The paper mentions using CIFAR10 but does not specify the training, validation, or test splits (e.g., percentages or sample counts) used for the experiments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory, or specific computing clusters). |
| Software Dependencies | No | The paper does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or other libraries). |
| Experiment Setup | Yes | Weights are initialized with a variance of 2/n_l [He et al., 2015]. ... First 3 rows: Adam using step size 0.001, β1 = 0.9, β2 = 0.999, ε ∈ {10^-16, 10^-8, 1}. Last row: SGD with constant learning rate 0.5. (Sketched below.) |
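
The pseudocode quoted in the table (Procedure 1) is compact enough to restate as runnable code. Below is a minimal NumPy sketch: the function and argument names are our own, not the paper's, and we apply the ‖p‖ = ‖x‖ rescaling after step (c) (an assumption about step ordering) so the returned vector sits at exactly the requested angle to the adjusted x.

```python
import numpy as np

def sample_at_angle(x, theta, rng=None):
    # Minimal sketch of Procedure 1. Requires x to have some zero
    # coordinates besides the last two, so that a vector p with
    # disjoint support (hence orthogonal to x) can be built.
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float).copy()
    p = np.zeros_like(x)

    # (a) Zero the last two coordinates of x.
    x[-2:] = 0.0

    # (1) Sample p orthogonal to x: nonzero only where x is zero,
    #     entries from U[0, 1], with the last two coordinates kept at 0.
    free = (x == 0.0)
    free[-2:] = False
    p[free] = rng.uniform(0.0, 1.0, size=int(free.sum()))

    # (b) Second-last coordinate of x := negative sum of x's coordinates,
    #     so the coordinates of x sum to zero.
    x[-2] = -x.sum()
    # (c) Last coordinate of p := negative sum of p's coordinates.
    p[-1] = -p.sum()

    # Rescale so that ||p|| = ||x||. The quoted text normalizes before
    # step (c); doing it afterwards keeps the norms exactly equal once
    # p's last coordinate is set.
    p *= np.linalg.norm(x) / np.linalg.norm(p)

    # (2) Rotate within span{x, p}: x and p have disjoint supports
    #     (so x . p = 0) and equal norms, hence y is at angle theta to x.
    return np.cos(theta) * x + np.sin(theta) * p
```

Because x and p never share a nonzero coordinate, orthogonality holds by construction, and both vectors have zero coordinate sum, which is what the paper's permutation (exchangeability) argument relies on.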
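The setup quoted under Research Type and Experiment Setup can likewise be pinned down in a short PyTorch sketch. Only the width, depth, dataset, ℓ2 objective, initialization variance, and optimizer hyperparameters come from the quotes; the hidden activation (ReLU), batch size (128), and linear final layer are assumptions.

```python
import math
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

WIDTH = 3072  # 32 * 32 * 3 flattened CIFAR10 pixels

# 4 fully connected layers of 3072 units; ReLU between layers is an
# assumption (the activation is not named in the quoted text).
layers = [nn.Flatten()]
for i in range(4):
    layers.append(nn.Linear(WIDTH, WIDTH))
    if i < 3:
        layers.append(nn.ReLU())
model = nn.Sequential(*layers)

# He et al. (2015) initialization: weight variance 2/n_l,
# i.e. standard deviation sqrt(2/n_l) with n_l the fan-in.
for m in model.modules():
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, mean=0.0,
                        std=math.sqrt(2.0 / m.in_features))
        nn.init.zeros_(m.bias)

# ToTensor() maps pixel values into [0, 1], matching the quoted setup.
data = torchvision.datasets.CIFAR10(root="./data", train=True,
                                    download=True, transform=T.ToTensor())
loader = torch.utils.data.DataLoader(data, batch_size=128, shuffle=True)

# Quoted optimizers: Adam (lr 0.001, betas (0.9, 0.999),
# eps in {1e-16, 1e-8, 1}) or SGD with constant lr 0.5.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), eps=1e-8)
loss_fn = nn.MSELoss()  # the l2 reconstruction objective

for images, _ in loader:            # labels unused: autoencoding task
    optimizer.zero_grad()
    loss = loss_fn(model(images), images.flatten(1))
    loss.backward()
    optimizer.step()
    break  # one step shown; epoch count is not given in the quotes
```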