Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms
Authors: Ashok Makkuva, Pramod Viswanath, Sreeram Kannan, Sewoong Oh
ICML 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our algorithm on both the synthetic and real data sets in a variety of settings, and show superior performance to standard baselines. |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, Coordinated Science Laboratory, University of Illinois at Urbana Champaign, IL, USA 2Allen School of Computer Science & Engineering, University of Washington, Seattle, USA 3Department of Electrical Engineering, University of Washington, Seattle, USA. |
| Pseudocode | Yes | Algorithm 1 Learning the regressors... Algorithm 2 Learning the gating parameter |
| Open Source Code | Yes | Codes are available at this repository MoE codes. |
| Open Datasets | Yes | To highlight the generalizability of our algorithm, in Appendix H.2 of the supplement, we compare the performance of our algorithm to that of the standard approaches on a variety of real world datasets. References include: Brooks, T., Pope, D., and Marcolini, A. Airfoil self-noise and prediction. Technical report, NASA, 1989. URL https://archive.ics.uci.edu/ml/datasets/Airfoil+Self-Noise. Liu, Y.-C. and Yeh, I.-C. Using mixture design and neural networks to build stock selection decision support systems. Neural Computing and Applications, 28(3):521–535, 2017. doi: 10.1007/s00521-015-2090-x. URL https://archive.ics.uci.edu/ml/datasets/Stock+portfolio+performance. Yeh, I.-C. Modeling of strength of high performance concrete using artificial neural networks. Cement and Concrete Research, 28(12):1797–1808, 1998. URL https://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength. |
| Dataset Splits | No | The paper describes generating synthetic data with parameters like n=2000 or n=8000 and d=10, and also mentions using real-world datasets, but it does not specify explicit training, validation, or test splits (e.g., percentages or counts) for these datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the "Orth-ALS package by (Sharan & Valiant, 2017)" but does not provide a specific version number for this or any other software dependency. |
| Experiment Setup | Yes | For the experiments, we consider the similar setting as before with k = 2, d = 10, σ = 0.1 and the gating parameter w is drawn uniformly from S^9 without the orthogonality restriction. We let x_i ~ i.i.d. N(0, I_d). We choose n = 2000... We let the number of mixture components be k = 3 and k = 4. We let x ~ N(0, I_d) and the gating parameters are drawn uniformly from S^9... n = 8000, d = 10, σ = 0.5. |
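
The k = 2 synthetic setting quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the excerpt gives only k, d, σ, n, the Gaussian inputs, and the uniform draw of the gating parameter from the unit sphere. The sigmoid gate, the linear experts, the choice to draw the regressors from the sphere as well, and all function names (`sample_unit_sphere`, `generate_moe_data`) are assumptions made here for illustration.

```python
import math
import random


def sample_unit_sphere(rng, d):
    """Draw a point uniformly from the unit sphere in R^d
    by normalizing a standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def generate_moe_data(n=2000, d=10, sigma=0.1, seed=0):
    """Generate n samples from a two-expert mixture (k = 2).

    From the quoted setup: x_i ~ N(0, I_d), gating parameter w uniform on
    the unit sphere, noise level sigma.
    Assumed here (not stated in the excerpt): sigmoid gating, linear
    experts, and regressors drawn uniformly from the unit sphere.
    """
    rng = random.Random(seed)
    w = sample_unit_sphere(rng, d)                             # gating parameter
    experts = [sample_unit_sphere(rng, d) for _ in range(2)]   # assumed regressors
    xs, ys, zs = [], [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]            # x_i ~ N(0, I_d)
        p_first = 1.0 / (1.0 + math.exp(-dot(w, x)))           # P(expert 0 | x), assumed sigmoid gate
        z = 0 if rng.random() < p_first else 1                 # latent expert index
        y = dot(experts[z], x) + rng.gauss(0.0, sigma)         # assumed linear expert plus Gaussian noise
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs, w, experts


xs, ys, zs, w, experts = generate_moe_data()
```

Calling `generate_moe_data(n=2000, d=10, sigma=0.1)` mirrors the quoted k = 2 configuration. The k = 3 and k = 4 settings would need a softmax gate over several gating vectors, which this sketch does not cover.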