Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
The Efficiency Misnomer
Authors: Mostafa Dehghani, Yi Tay, Anurag Arnab, Lucas Beyer, Ashish Vaswani
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments where comparing model efficiency strongly depends on the choice of cost indicator, like scenarios where there is parameter sharing, sparsity, or parallelizable operations in the model. |
| Researcher Affiliation | Industry | Google Research EMAIL |
| Pseudocode | No | The paper describes its methods in narrative text and does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using existing libraries like Scenic and timm for experiments but does not provide a link or explicit statement about releasing the source code for its own methodology. |
| Open Datasets | Yes | Figure 5: Accuracy and value of different cost indicators for different models on ImageNet dataset. / Figure 2: The learning progress of a ResNet-101×3 on JFT-300M with short and long schedules, obtained from (Kolesnikov et al., 2020). |
| Dataset Splits | No | The paper discusses experiments on datasets like JFT-300M and ImageNet and mentions keeping hyperparameters fixed from referenced papers, but does not explicitly state specific training/validation/test dataset splits. |
| Hardware Specification | Yes | Experiments and the computation of cost metrics were done with Mesh Tensorflow (Shazeer et al., 2018), using 64 TPU-V3. (Figure 1 caption) / throughput is measured on a V100 GPU (Figure 5 caption). |
| Software Dependencies | No | The paper mentions software frameworks like Mesh Tensorflow, Scenic, timm, JAX, PyTorch, and TensorFlow, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Note that when changing depth or width of the model (see Table 2 in Appendix A for the exact configurations) all other hyper-parameters are kept fixed based on the default values given by the referenced papers. |