Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Minimizing FLOPs to Learn Efficient Sparse Representations
Authors: Biswajit Paria, Chih-Kuan Yeh, Ian E.H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that our approach is competitive to the other baselines and yields a similar or better speed-vs-accuracy tradeoff on practical datasets. We perform an empirical evaluation of our approach on the Megaface dataset (Kemelmacher-Shlizerman et al., 2016), and show that our proposed method successfully learns high-dimensional sparse embeddings that are orders-of-magnitude faster. We compare our approach to multiple baselines demonstrating an improved or similar speed-vs-accuracy trade-off. |
| Researcher Affiliation | Collaboration | Carnegie Mellon University, Moffett AI, Amazon |
| Pseudocode | Yes | Algorithm 1 Sparse Nearest Neighbour |
| Open Source Code | Yes | The implementation is available at https://github.com/biswajitsc/sparse-embed |
| Open Datasets | Yes | We evaluate our proposed approach on a large scale metric learning dataset: the Megaface (Kemelmacher-Shlizerman et al., 2016) used for face recognition. ... we train on a refined version of the MSCeleb-1M (Guo et al., 2016) dataset released by Deng et al. (2018) consisting of 1 million images spanning 85k classes. |
| Dataset Splits | No | The paper describes training on MSCeleb-1M and evaluating on Megaface/Facescrub, but does not explicitly specify a separate validation dataset split or its details. |
| Hardware Specification | Yes | All models were trained on 4 NVIDIA Tesla V-100 GPUs with 16G of memory. ... CPU: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz. |
| Software Dependencies | No | The paper mentions 'Tensorflow (Abadi et al., 2016)' and 'C++' but does not specify exact version numbers for TensorFlow or other key software dependencies or libraries. |
| Experiment Setup | Yes | For the Arcloss function, we used the recommended parameters of margin m = 0.5 and temperature s = 64. We trained our models on 4 NVIDIA Tesla V-100 GPUs using SGD with a learning rate of 0.001, momentum of 0.9. Both the architectures were trained for a total of 230k steps, with the learning rate being decayed by a factor of 10 after 170k steps. We use a batch size of 256 and 64 per GPU for MobileFaceNet and ResNet respectively. ... The regularization parameter λ for the $\widetilde{\mathcal{F}}$ regularizer was varied as 200, 300, 400, 600. ... The PCA dimension is varied as 64, 96, 128, 256. ... For IVF-PQ from the faiss library, the following parameters were fixed: nlist=4096, M=64, nbit=8, and nprobe was varied as 100, 150, 250, 500, 1000. |
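
The values quoted in the Experiment Setup row can be collected into a small configuration sketch, which may help when attempting a reproduction. This is only an illustration under stated assumptions, not the authors' code: the names `TrainingSetup`, `learning_rate`, and `ivfpq_configs` are hypothetical, the schedule is assumed to be piecewise constant, and the decay is assumed to take effect at step 170,000 exactly (the quoted text says only "after 170k steps"). No TensorFlow or faiss calls are made; the dictionaries simply record the quoted parameter values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Values quoted in the Experiment Setup row; field names are illustrative."""
    base_lr: float = 0.001       # SGD learning rate
    momentum: float = 0.9        # SGD momentum
    total_steps: int = 230_000   # total training steps
    decay_step: int = 170_000    # step at which the rate is decayed (boundary assumed)
    decay_factor: float = 10.0   # learning rate is divided by this factor
    arc_margin: float = 0.5      # margin m of the loss
    arc_scale: float = 64.0      # temperature s of the loss


def learning_rate(step: int, cfg: TrainingSetup = TrainingSetup()) -> float:
    """Assumed piecewise-constant schedule: base rate, then base rate / 10."""
    if not 0 <= step < cfg.total_steps:
        raise ValueError(f"step must be in [0, {cfg.total_steps})")
    if step < cfg.decay_step:
        return cfg.base_lr
    return cfg.base_lr / cfg.decay_factor


# Sweep values quoted in the table; variable names are hypothetical.
REGULARIZER_LAMBDAS = (200, 300, 400, 600)
PCA_DIMS = (64, 96, 128, 256)
IVFPQ_FIXED = {"nlist": 4096, "M": 64, "nbit": 8}
IVFPQ_NPROBES = (100, 150, 250, 500, 1000)


def ivfpq_configs() -> list:
    """One parameter dictionary per nprobe value, sharing the fixed IVF-PQ settings."""
    return [dict(IVFPQ_FIXED, nprobe=n) for n in IVFPQ_NPROBES]


if __name__ == "__main__":
    for step in (0, 169_999, 170_000, 229_999):
        print(step, learning_rate(step))
    for cfg in ivfpq_configs():
        print(cfg)
```

Because the Software Dependencies row reports that no library versions are given, a reproduction would still need to choose TensorFlow and faiss versions independently of this sketch.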