Test-time Collective Prediction
Authors: Celestine Mendler-Dünner, Wenshuo Guo, Stephen Bates, Michael Jordan
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On the empirical side, we demonstrate the efficacy of our mechanism through extensive numerical experiments across different learning scenarios. In particular, we illustrate the mechanism's advantages over model averaging as well as model selection, and demonstrate that it consistently outperforms alternative non-uniform combination schemes that have access to additional validation data across a wide variety of models and datasets. |
| Researcher Affiliation | Academia | Celestine Mendler-Dünner MPI for Intelligent Systems, Tübingen cmendler@tuebingen.mpg.de Wenshuo Guo University of California, Berkeley wguo@cs.berkeley.edu Stephen Bates University of California, Berkeley stephenbates@cs.berkeley.edu Michael I. Jordan University of California, Berkeley jordan@cs.berkeley.edu |
| Pseudocode | Yes | Algorithm 1 De Groot Aggregation |
| Open Source Code | No | The paper does not contain an unambiguous statement that the authors are releasing the source code for the work described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We work with the abalone dataset [Nash et al., 1994]... Datasets have been downloaded from [Fan, 2011]. |
| Dataset Splits | No | The paper describes how individual agents use local data for validation (e.g., 'Construct local validation dataset Di(x ) using N-nearest neighbors of x in Di.') and compares against methods using 'additional validation data', but it does not provide specific, global train/validation/test dataset splits (e.g., percentages or exact counts) for the overall experiments needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions software like 'scikit-learn' and 'Python' but does not specify version numbers for these or other key software components used in the experiments. |
| Experiment Setup | Yes | Unless stated otherwise, we use K = 5 agents and let each agent fit a linear model to her local data... We use N = 5 for local cross-validation in De Groot... For our first experiment... we train a lasso model on each agent with regularization parameter λk = λ that achieves a sparsity of 0.8... We choose N to be 1% of the data partition for all schemes (with a hard lower bound at 2). |
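The De Groot aggregation referenced in the rows above can be illustrated with a generic DeGroot consensus iteration: each agent repeatedly replaces its prediction with a trust-weighted average of all agents' predictions, and a positive row-stochastic trust matrix drives the group to a single consensus value. This is a minimal sketch, not the paper's Algorithm 1; in particular, the paper derives the trust weights from local validation losses on N-nearest-neighbor datasets, whereas here the trust matrix is an arbitrary placeholder.

```python
import numpy as np

def degroot_consensus(predictions, trust, iters=200):
    """Iterate y <- W y, where W is the row-normalized trust matrix.

    With a strictly positive trust matrix, all entries of y converge
    to a common consensus prediction.
    """
    y = np.asarray(predictions, dtype=float)
    W = np.asarray(trust, dtype=float)
    W = W / W.sum(axis=1, keepdims=True)  # make each row a probability vector
    for _ in range(iters):
        y = W @ y
    return y

# K = 5 agents, matching the paper's default setup
rng = np.random.default_rng(0)
preds = rng.normal(size=5)           # each agent's local model prediction at a test point x
trust = rng.uniform(0.1, 1.0, (5, 5))  # placeholder weights; the paper builds these
                                       # from local N-nearest-neighbor validation
consensus = degroot_consensus(preds, trust)
```

Because the trust matrix is strictly positive, the iteration converges to a consensus that is a convex combination of the initial predictions, so the aggregate always lies within the range of the agents' individual predictions.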