Metric-Optimized Example Weights
Authors: Sen Zhao, Mahdi Milani Fard, Harikrishna Narasimhan, Maya Gupta
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the performance of the proposed method on diverse public benchmark datasets and real-world applications. In this section, we illustrate the value of our proposal by comparing it to common strategies on a diverse set of example problems. |
| Researcher Affiliation | Industry | 1Google AI, 1600 Amphitheatre Parkway, Mountain View, CA 94043, USA. Correspondence to: Sen Zhao <senzhao@google.com>. |
| Pseudocode | Yes | Algorithm 1 Get optimal ˆα and ˆθ(ˆα) and Algorithm 2 Get Candidate αi+1 |
| Open Source Code | Yes | The code on public datasets is available at the following GitHub address: https://github.com/google-research/google-research/tree/master/moew. |
| Open Datasets | Yes | MNIST handwritten digit database (LeCun & Cortes, 2010), wine reviews dataset from Kaggle (www.kaggle.com/zynicide/wine-reviews), Communities and Crime dataset from the UCI Machine Learning Repository (Dheeru & Karra Taniskidou, 2017) |
| Dataset Splits | Yes | Training/validation/test splits of 55k/5k/10k (MNIST), 85k/12k/24k (Wine Reviews), and 994/500/500, sampled purely at random (Communities and Crime). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or exact server configurations used for experiments. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' but does not specify software dependencies with version numbers (e.g., TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | Both the autoencoder and the main models were trained for 10k steps using the Adam optimizer (Kingma & Ba, 2015) with learning rate 0.001. We used squared loss for numeric, hinge loss for binary, and cross-entropy loss for multiclass labels/features. We sampled B·K candidate α's in a d-dimensional ball of radius R using GP-BUCB with p = q = 68.3 and an RBF kernel whose width was set equal to R. |
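The setup row mentions drawing candidate α's from a d-dimensional ball of radius R before scoring them with GP-BUCB. A minimal sketch of that sampling step is below; the function name `sample_in_ball` and the uniform-in-volume choice are assumptions for illustration (the paper's GP-BUCB acquisition and batch scheme are not reproduced here). Uniformity in volume requires scaling the radial draw by the 1/d power; otherwise samples cluster near the center.

```python
import numpy as np

def sample_in_ball(num_samples, d, radius, rng=None):
    """Sample points uniformly from a d-dimensional ball of given radius.

    Direction: a standard Gaussian draw, normalized to the unit sphere.
    Magnitude: radius * U^(1/d) with U ~ Uniform(0, 1), which makes the
    points uniform in volume rather than concentrated at the origin.
    """
    rng = np.random.default_rng() if rng is None else rng
    directions = rng.standard_normal((num_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(num_samples, 1)) ** (1.0 / d)
    return directions * radii

# Hypothetical usage: 100 candidate alphas in a 5-dimensional ball of radius 2.
candidates = sample_in_ball(num_samples=100, d=5, radius=2.0)
print(candidates.shape)  # (100, 5)
```

In the paper's pipeline each such candidate α parameterizes an example-weighting function; a Gaussian-process bandit (GP-BUCB) would then pick which candidates to evaluate on the validation metric.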