Discounted Adaptive Online Learning: Towards Better Regularization
Authors: Zhiyu Zhang, David Bombara, Heng Yang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we complement the above theoretical results with OCP experiments (Section 4). ... We test three versions of our algorithm: MagL-D is our Algorithm 1 with ε = 1 and λ_t = 0.999; MagL is its undiscounted version (λ_t = 1); and MagDis is a much simplified variant of Algorithm 1 that basically sets h_t = 0. ... The results are summarized in Table 1. |
| Researcher Affiliation | Academia | 1Harvard University. Correspondence to: Zhiyu Zhang <zhiyuz@seas.harvard.edu>, David Bombara <davidbombara@g.harvard.edu>, Heng Yang <hankyang@seas.harvard.edu>. |
| Pseudocode | Yes | Algorithm 1: 1D magnitude learner on [0, ∞). ... Algorithm 2 ((Zhang et al., 2024, Algorithm 1 and 2) + (Cutkosky, 2019, Algorithm 2)): Undiscounted 1D magnitude learner on [0, ∞). ... Algorithm 3: Discounted adaptivity on R^d. ... Algorithm 4: The proposed OCP algorithm ACP. ... Algorithm 5: Modified 1D magnitude learner on [0, ∞). (A toy sketch of the discounting idea these algorithms share appears after this table.) |
| Open Source Code | Yes | Link to the code: https://github.com/ComputationalRobotics/discounted-adaptive |
| Open Datasets | Yes | We consider image classification in a sequential setting... We adopt the procedure, code, and base model from (Bhatnagar et al., 2023). ... Results are obtained using corrupted versions of TinyImageNet, with time-varying corruption level (distribution shift). |
| Dataset Splits | No | The paper does not explicitly report training/validation/test splits, either as percentages or as sample counts. It mentions evaluation metrics such as "local coverage error" but not how the data were partitioned. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU/GPU models or memory. |
| Software Dependencies | No | The paper mentions software packages like "SciPy and JAX" but does not provide specific version numbers for these or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | We choose a targeted coverage rate of 90% for all experiments, which means α = 0.1. ... We test three versions of our algorithm: MagL-D is our Algorithm 1 with ε = 1 and λ_t = 0.999; MagL is its undiscounted version (λ_t = 1); and MagDis is a much simplified variant of Algorithm 1 that basically sets h_t = 0. ... The only hyperparameter that we set is the learning rate, which we set to 1 for Simple OGD. (A minimal sketch of this baseline follows the table.) |
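To make the role of the discount factor concrete, here is a minimal illustrative sketch of a discounted adaptive gradient update in Python. This is not the paper's Algorithm 1: the function name `discounted_adaptive_step`, the accumulator `v`, and the AdaGrad-style step-size rule are our own simplifications; only the discount λ_t = 0.999 and ε = 1 are taken from the setup quoted above.

```python
import numpy as np

def discounted_adaptive_step(x, grad, v, lam=0.999, eps=1.0):
    """One illustrative discounted-adaptive update (not the paper's Algorithm 1).

    v is a discounted sum of squared gradient norms: each round, the
    past is downweighted by lam**2, so the accumulator gradually
    "forgets" old gradients. This is the effect the discount factor
    lambda_t = 0.999 has in the MagL-D configuration quoted above.
    """
    v = lam**2 * v + np.dot(grad, grad)   # discounted second-moment accumulator
    x = x - grad / np.sqrt(eps + v)       # adaptive (AdaGrad-style) step
    return x, v
```

Setting `lam = 1` removes the discounting and recovers the usual undiscounted accumulator, mirroring how MagL relates to MagL-D in the experiments.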
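Similarly, the Simple OGD baseline from the Experiment Setup row can be sketched as online gradient descent on the pinball (quantile) loss of the prediction-set radius. This is a hedged reconstruction of a standard online conformal prediction baseline, not code from the paper's repository; the function name and the coverage bookkeeping are ours, while α = 0.1 (90% target coverage) and the learning rate of 1 come from the quoted setup.

```python
def simple_ogd_radius(radius, covered, alpha=0.1, lr=1.0):
    """One OGD step on the pinball loss of the prediction-set radius.

    If the true label was covered, the radius shrinks slightly; if it
    was missed, the radius grows by a larger amount, so long-run
    coverage is pushed toward 1 - alpha = 90%.
    """
    grad = alpha if covered else alpha - 1.0  # subgradient of the pinball loss
    return max(0.0, radius - lr * grad)       # radii live on [0, inf)

# Toy usage: track coverage over a short stream of conformity scores.
scores = [0.8, 1.2, 0.5, 2.0, 0.9]
r, hits = 1.0, 0
for s in scores:
    covered = s <= r
    hits += covered
    r = simple_ogd_radius(r, covered)
print(f"empirical coverage: {hits / len(scores):.2f}, final radius: {r:.2f}")
```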