The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Authors: Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | All empirical figures in this work were generated by the attached notebook. Here we briefly summarize the experimental conditions used to generate these figures. ... Using these initializations and SciPy's initial value problem ODE solver (Virtanen et al., 2020), we then simulated gradient flow until T = 1e5. The final value of the classifier for both models and their respective robustness was recorded and used to generate the final plots. |
| Researcher Affiliation | Academia | Daniel Kunin & Atsushi Yamamura (山村篤志) Stanford University {kunin,atsushi3}@stanford.edu Chao Ma, Surya Ganguli Stanford University {chaoma,sganguli}@stanford.edu |
| Pseudocode | No | No, the paper describes theoretical concepts and proofs but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | All empirical figures in this work were generated by the attached notebook. |
| Open Datasets | No | Logistic Regression (Fig. 1). This plot was generated by sampling 100 samples from two Gaussian distributions N(µ, σI) in R2, where µ = [1/√2, 1/√2] and σ = 0.25. ... Ball Classification (Fig. 4). This plot was generated by sampling 1e4 random samples from the surface of two balls B(µ, r) in R3 for 100 linearly spaced radii r ∈ [0, 1]. |
| Dataset Splits | No | No, the paper describes synthetic data generation and gradient flow simulations but does not specify any training/validation/test dataset splits. |
| Hardware Specification | No | No, the paper describes the simulation of gradient flow and the use of the SciPy library but does not specify any hardware details like CPU or GPU models. |
| Software Dependencies | Yes | Using these initializations and SciPy's initial value problem ODE solver (Virtanen et al., 2020), we then simulated gradient flow until T = 1e5. ... The maximum ℓ2-margin solution was computed using scikit-learn's SVM package (Pedregosa et al., 2011). |
| Experiment Setup | Yes | The parameters were trained with full batch gradient descent with a learning rate η = 0.5 for 1e5 steps. ... Using these initializations and SciPy's initial value problem ODE solver (Virtanen et al., 2020), we then simulated gradient flow until T = 1e5. |
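The logistic-regression setup quoted above (Fig. 1) can be sketched as follows. This is an illustrative reconstruction, not the authors' notebook: the two-component value of µ, the labels, and the comparison against a hard-margin SVM direction are assumptions based on the descriptions in the table.

```python
# Hypothetical sketch: sample two Gaussian clusters in R^2, run gradient
# flow on the logistic loss with SciPy's solve_ivp, and compare the final
# direction to scikit-learn's maximum l2-margin (linear SVM) direction.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.special import expit
from sklearn.svm import SVC

rng = np.random.default_rng(0)
mu, sigma, n = np.array([1 / np.sqrt(2), 1 / np.sqrt(2)]), 0.25, 100

# Two Gaussian clusters centered at +/- mu with labels +1 / -1 (assumed).
X = np.vstack([rng.normal(mu, sigma, (n, 2)), rng.normal(-mu, sigma, (n, 2))])
y = np.concatenate([np.ones(n), -np.ones(n)])

def grad_flow(t, w):
    # dw/dt = -grad of sum_i log(1 + exp(-y_i <w, x_i>));
    # expit(-m) = 1 / (1 + exp(m)) avoids overflow for large margins.
    margins = y * (X @ w)
    return (y * expit(-margins)) @ X

sol = solve_ivp(grad_flow, (0, 1e5), np.zeros(2))
w_gf = sol.y[:, -1] / np.linalg.norm(sol.y[:, -1])

# Maximum l2-margin direction via a (near) hard-margin linear SVM.
svm = SVC(kernel="linear", C=1e6).fit(X, y)
w_svm = svm.coef_.ravel() / np.linalg.norm(svm.coef_)

# Cosine similarity between the gradient-flow and max-margin directions.
print(float(np.dot(w_gf, w_svm)))
```

For separable data, gradient flow on the logistic loss converges in direction toward the maximum-margin classifier, so the printed cosine similarity should be near 1 at large T.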
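The ball-classification data (Fig. 4) can likewise be sketched. The ball centers and labels below are illustrative assumptions; the table only specifies the sample count, dimension, and the 100 linearly spaced radii.

```python
# Hypothetical sketch: sample points uniformly on the surfaces of two
# balls B(mu, r) in R^3, sweeping 100 linearly spaced radii r in [0, 1].
import numpy as np

rng = np.random.default_rng(0)

def sample_sphere_surface(center, r, n):
    # Normalizing isotropic Gaussian draws yields uniform points on the
    # unit sphere; scale by r and translate to the ball's center.
    g = rng.normal(size=(n, 3))
    return center + r * g / np.linalg.norm(g, axis=1, keepdims=True)

mu = np.array([1.0, 0.0, 0.0])      # illustrative center (not specified)
radii = np.linspace(0, 1, 100)      # 100 linearly spaced radii in [0, 1]
for r in radii:
    X_pos = sample_sphere_surface(mu, r, 10_000)    # class +1 surface
    X_neg = sample_sphere_surface(-mu, r, 10_000)   # class -1 surface
```

Each radius yields 1e4 points per class on the two sphere surfaces, matching the sampling described in the table.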