Rich Feature Construction for the Optimization-Generalization Dilemma
Authors: Jianyu Zhang, David Lopez-Paz, Léon Bottou
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This section presents experimental results that illustrate how the rich representations constructed with RFC can help the OoD performance and reduce the performance variance of OoD methods. |
| Researcher Affiliation | Collaboration | (1) New York University, New York, NY, USA; (2) Facebook AI Research, Paris, France; (3) Facebook AI Research, New York, NY, USA. |
| Pseudocode | Yes | Algorithm 1 Robust Empirical Risk Minimization (RERM) |
| Open Source Code | Yes | Code for replicating these experiments is publicly available at https://github.com/TjuJianyu/RFC/. |
| Open Datasets | Yes | All experiments reported in this section use the COLOREDMNIST task (Arjovsky et al., 2020)... The CAMELYON17 dataset (Bandi et al., 2018) |
| Dataset Splits | Yes | The task also specifies multiple runs with different seeds in order to observe the result variability. Finally, the task defines two ways to perform hyper-parameter selection: IID Tune selects hyper-parameters based on model performance on 33,560 images held out from the training data; OoD Tune selects hyper-parameters based on the model performance observed on the fifth hospital (34,904 images). |
| Hardware Specification | No | The paper describes network architectures and training settings (e.g., "The network is a DenseNet121 model", "2-hidden-layer MLP network architecture"), but does not specify the particular hardware (CPU or GPU models, etc.) used to run the experiments. |
| Software Dependencies | No | The paper mentions optimizers (Adam, SGD) and model architectures (DenseNet121), but it does not provide specific version numbers for any software libraries (e.g., PyTorch, TensorFlow) or programming languages (e.g., Python). |
| Experiment Setup | Yes | All experiments use the same 2-hidden-layer MLP network architecture (390 hidden neurons), Adam optimizer, learning rate=0.0005, L2 weight regularization=0.0011 and binary cross-entropy objective function as the COLOREDMNIST benchmark (Arjovsky et al., 2020). For the CAMELYON17 experiments, the network is a DenseNet121 model (Huang et al., 2017) trained by optimizing a cross-entropy loss with L2 weight decay=0.01 using SGD with learning rate=0.001, momentum=0.9 and batch size=32. |
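
The Experiment Setup row above lists concrete hyper-parameters for the two benchmarks. The sketch below shows, assuming PyTorch and torchvision, how those two configurations could be instantiated; it is not the authors' released code (see the repository linked in the Open Source Code row), and the input dimensions and output head sizes are assumptions not stated in the table.

```python
# Minimal sketch of the two training configurations listed under "Experiment Setup".
# Not the authors' code; input/output sizes marked below are assumptions.
import torch
import torch.nn as nn
import torchvision

# --- COLOREDMNIST setup: 2-hidden-layer MLP with 390 hidden neurons ---
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(2 * 14 * 14, 390),  # assumed input: 2-channel 14x14 ColoredMNIST images
    nn.ReLU(),
    nn.Linear(390, 390),
    nn.ReLU(),
    nn.Linear(390, 1),            # single logit for the binary cross-entropy objective
)
mlp_optimizer = torch.optim.Adam(mlp.parameters(), lr=0.0005, weight_decay=0.0011)
mlp_criterion = nn.BCEWithLogitsLoss()

# --- CAMELYON17 setup: DenseNet121 trained with SGD ---
densenet = torchvision.models.densenet121(weights=None)
densenet.classifier = nn.Linear(densenet.classifier.in_features, 2)  # assumed 2 classes
cnn_optimizer = torch.optim.SGD(densenet.parameters(), lr=0.001,
                                momentum=0.9, weight_decay=0.01)
cnn_criterion = nn.CrossEntropyLoss()
batch_size = 32
```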
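
The Pseudocode row above refers to the paper's Algorithm 1, labelled Robust Empirical Risk Minimization (RERM). The sketch below is not a transcription of that algorithm; it only illustrates the min-max-over-environments objective that robust ERM conventionally denotes, min over the model parameters of the maximum per-environment risk, with hypothetical names (`rerm_step`, `env_batches`).

```python
# Hedged sketch of robust ERM as a min-max objective over training environments.
# NOT a reproduction of the paper's Algorithm 1; names and data layout are illustrative.
import torch

def rerm_step(model, optimizer, criterion, env_batches):
    """One update that minimizes the worst per-environment risk.

    env_batches: list of (inputs, targets) pairs, one per training environment
    (an assumed data layout, not taken from the paper).
    """
    env_risks = []
    for inputs, targets in env_batches:
        logits = model(inputs)
        env_risks.append(criterion(logits, targets))
    worst_risk = torch.stack(env_risks).max()  # gradient flows only through the worst environment
    optimizer.zero_grad()
    worst_risk.backward()
    optimizer.step()
    return worst_risk.item()
```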