Uniform-in-time propagation of chaos for the mean-field gradient Langevin dynamics
Authors: Taiji Suzuki, Atsushi Nitanda, Denny Wu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 Numerical Experiments: We provide empirical support for our propagation of chaos result in a synthetic student-teacher setting. ... For the figures, we report the training or test error without the regularization terms. In Figure 1 in the Introduction, we plot the training error for $r(x) = x^4$; whereas in Figure 2, we plot the test error for $r(x) = x^2$. |
| Researcher Affiliation | Academia | Taiji Suzuki: The University of Tokyo; RIKEN Center for Advanced Intelligence Project... Atsushi Nitanda: Kyushu Institute of Technology; RIKEN Center for Advanced Intelligence Project... Denny Wu: The University of Toronto; Vector Institute for Artificial Intelligence |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide a statement about open-sourcing its code or a link to a code repository for the methodology described. |
| Open Datasets | No | We consider the empirical risk minimization problem, where the training labels are generated by a teacher model which is a Gaussian function defined as $f(z) = \exp(-\lVert z - a \rVert^2 / (2d))$. We set n = 2000, d = 20. |
| Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits or percentages. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We set n = 2000, d = 20. The loss is chosen to be the squared error, and for the regularization term we set $r(x) = x^2$ or $r(x) = x^4$, and the regularization strength $\lambda_1 = \lambda = 10^{-2}$. ... We optimize the student model using NPGD with step size $\eta = 10^{-2}$. |
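
Since the paper provides no code, the quoted setup can be illustrated with a minimal sketch. It generates labels from the Gaussian teacher with n = 2000 and d = 20, and applies one noisy particle gradient descent (NPGD) step with η = λ = 10⁻². Several parts are assumptions, not details taken from the paper: the standard normal distribution for the inputs and for the teacher center `a`, the standard Langevin form of the update, and the names `make_dataset`, `npgd_step`, and `grad_fn`. The student model is not specified in the excerpts above, so the gradient is left as a user-supplied function.

```python
import math
import random


def teacher(z, a):
    """Gaussian teacher: exp(-||z - a||^2 / (2d)), with d = len(z)."""
    d = len(z)
    sq_dist = sum((zi - ai) ** 2 for zi, ai in zip(z, a))
    return math.exp(-sq_dist / (2 * d))


def make_dataset(n=2000, d=20, seed=0):
    """Draw n inputs in R^d and label them with the teacher.

    Assumption: inputs and the center `a` are standard normal;
    the quoted excerpt does not state either distribution.
    """
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    inputs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    labels = [teacher(z, a) for z in inputs]
    return inputs, labels, a


def npgd_step(particles, grad_fn, eta=1e-2, lam=1e-2, rng=random):
    """One noisy gradient step applied to every particle.

    Assumed update (standard Langevin discretization):
        x <- x - eta * grad_fn(x, particles) + sqrt(2 * eta * lam) * xi,
    with xi a standard normal vector. `grad_fn` is a hypothetical
    callback returning the gradient for particle x given all particles.
    """
    noise_scale = math.sqrt(2.0 * eta * lam)
    updated = []
    for x in particles:
        g = grad_fn(x, particles)
        updated.append([
            xj - eta * gj + noise_scale * rng.gauss(0.0, 1.0)
            for xj, gj in zip(x, g)
        ])
    return updated


# Usage: gradient of the regularizer r(x) = ||x||^2 only (illustrative).
inputs, labels, a = make_dataset()
particles = [[1.0, -2.0], [0.5, 0.5]]
particles = npgd_step(particles, lambda x, ps: [2.0 * xj for xj in x])
```

In a full experiment, `grad_fn` would combine the squared-error loss gradient of the chosen student model with the regularizer gradient; only the regularizer part is shown here to keep the sketch independent of the unspecified student architecture.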