On the Implicit Bias of Dropout
Authors: Poorya Mianjy, Raman Arora, René Vidal
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We specialize our results to matrix factorization in Section 5, and in Section 6, we discuss preliminary experiments to support our theoretical results. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Johns Hopkins University, Baltimore, USA; ²Department of Biomedical Engineering, Johns Hopkins University, Baltimore, USA. |
| Pseudocode | Yes | Algorithm 1: Dropout with Stochastic Gradient Descent; Algorithm 2: EQZ(U), equalizer of an auto-encoder h_{U,U}; Algorithm 3: Polynomial-time solver for Problem 7. (A sketch of dropout with SGD follows the table.) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The input features are sampled from a standard normal distribution. The input x ∈ ℝ^80 is distributed according to the standard normal distribution. The output y ∈ ℝ^120 is generated as y = Mx, where M ∈ ℝ^(120×80) is drawn randomly by uniformly sampling the right and left singular subspaces and with an exponentially decaying spectrum. The paper describes how the data were generated for the experiments but does not use a publicly available dataset with concrete access information. (A minimal generation sketch follows the table.) |
| Dataset Splits | No | The paper does not specify exact split percentages or sample counts for training, validation, or test sets. It describes the generation of data but not how it was partitioned for different phases of experimentation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch x.x, CUDA x.x) required to replicate the experiments. |
| Experiment Setup | Yes | Figure 3 illustrates the behavior of Algorithm 1 for different values of the regularization parameter (λ ∈ {0.1, 0.5, 1}) and for different factor sizes (r ∈ {20, 80}). The blue curve shows the objective value for the iterates of dropout, and the red line shows the optimal value of the objective (i.e., the objective at a global optimum found using Theorem 3.6). All plots are averaged over 50 runs of Algorithm 1 (over different random initializations, random realizations of Bernoulli dropout, and random draws of training examples). (A hypothetical sweep driver follows the table.) |
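
The synthetic data described in the Open Datasets row can be regenerated from the quoted description alone. Below is a minimal NumPy sketch; all function and parameter names are hypothetical, and the spectrum's decay rate (`decay`) is an assumption, since the excerpt does not state it.

```python
import numpy as np

def make_synthetic_data(n=1000, d_in=80, d_out=120, decay=0.9, seed=0):
    """Generate (x, y) pairs with y = M x, following the paper's description.

    M has uniformly random left/right singular subspaces and an
    exponentially decaying spectrum. The decay rate is an assumption;
    the excerpt does not state it.
    """
    rng = np.random.default_rng(seed)
    # Haar-random orthonormal singular subspaces via QR of Gaussian matrices.
    U, _ = np.linalg.qr(rng.standard_normal((d_out, d_out)))
    V, _ = np.linalg.qr(rng.standard_normal((d_in, d_in)))
    k = min(d_in, d_out)
    spectrum = decay ** np.arange(k)          # exponentially decaying singular values
    M = (U[:, :k] * spectrum) @ V[:, :k].T    # M ∈ R^(120×80)
    X = rng.standard_normal((n, d_in))        # inputs x ~ N(0, I)
    Y = X @ M.T                               # outputs y = M x
    return X, Y, M
```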
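The quoted pseudocode cell only names Algorithm 1 (dropout with stochastic gradient descent). The following is a hedged sketch of the general technique for the two-layer linear (matrix-factorization) setting the paper studies, with a Bernoulli mask on the r hidden units; the initialization, step size, and inverted-dropout scaling are assumptions, not the authors' exact Algorithm 1.

```python
import numpy as np

def dropout_sgd(X, Y, r=20, theta=0.5, lr=1e-3, epochs=50, seed=0):
    """SGD on a two-layer linear network y ≈ U V^T x with a Bernoulli(theta)
    dropout mask on the r hidden units (a sketch of the general technique,
    not the authors' exact Algorithm 1)."""
    rng = np.random.default_rng(seed)
    n, d_in = X.shape
    d_out = Y.shape[1]
    U = rng.standard_normal((d_out, r)) / np.sqrt(r)
    V = rng.standard_normal((d_in, r)) / np.sqrt(r)
    for _ in range(epochs):
        for i in rng.permutation(n):
            x, y = X[i], Y[i]
            b = rng.binomial(1, theta, size=r) / theta   # inverted-dropout mask
            h = b * (V.T @ x)                  # dropped-out hidden activations
            residual = U @ h - y               # prediction error
            # Gradients of 0.5 * ||U diag(b) V^T x - y||^2 w.r.t. U and V.
            grad_U = np.outer(residual, h)
            grad_V = np.outer(x, b * (U.T @ residual))
            U -= lr * grad_U
            V -= lr * grad_V
    return U, V
```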
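Finally, the Experiment Setup row describes a sweep over λ ∈ {0.1, 0.5, 1} and r ∈ {20, 80} averaged over 50 runs. A hypothetical driver reusing the two sketches above might look as follows; the identification θ = 1/(1 + λ) assumes the dropout-induced regularizer carries coefficient (1 − θ)/θ, which should be checked against the paper.

```python
import numpy as np

# Hypothetical driver for the sweep described in the Experiment Setup row:
# lambda in {0.1, 0.5, 1}, factor size r in {20, 80}, averaged over 50 runs.
# theta = 1 / (1 + lambda) assumes the dropout regularizer has coefficient
# (1 - theta) / theta; verify this mapping against the paper.
for lam in (0.1, 0.5, 1.0):
    for r in (20, 80):
        losses = []
        for run in range(50):
            X, Y, M = make_synthetic_data(seed=run)
            U, V = dropout_sgd(X, Y, r=r, theta=1.0 / (1.0 + lam), seed=run)
            losses.append(np.mean((X @ V @ U.T - Y) ** 2))  # squared error of U V^T x
        print(f"lambda={lam}, r={r}: mean loss {np.mean(losses):.4f}")
```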