Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Learning Representations by Humans, for Humans
Authors: Sophie Hilgard, Nir Rosenfeld, Mahzarin R Banaji, Jack Cao, David Parkes
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the successful application of the framework to various tasks and representational forms. We report the results of three distinct experiments. |
| Researcher Affiliation | Academia | (1) School of Engineering and Applied Science, Harvard University, Cambridge, MA, USA; (2) Department of Computer Science, Technion Israel Institute of Technology; (3) Department of Psychology, Harvard University, Cambridge, MA, USA. |
| Pseudocode | Yes | Figure 1 (right) illustrates this process; for pseudocode see Appendix A. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code, nor does it include a link to a code repository. |
| Open Datasets | Yes | Specifically we use the Lending Club dataset, focusing on the resolved loans, i.e., loans that were paid in full (y = 1) or defaulted (y = 0), and only using features that would have been available to lenders at loan inception. [Footnote 3] https://www.kaggle.com/wendykan/lending-club-loan-data |
| Dataset Splits | No | We report results averaged over ten random data samples of size 1,000 with an 80-20 train-test split. (The paper specifies a train-test split but does not explicitly mention a separate validation split.) |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'neural networks' and 'Webmorph software package' but does not provide specific version numbers for any software dependencies or libraries required for replication. |
| Experiment Setup | Yes | We set φ to be a small, fully connected network with a single 25-hidden unit layer, mapping inputs to representation vectors z ∈ R^9. For ĥ we use a small, fully connected network with two layers of size 20 each, operating directly on representation vectors z. [...] we use d = 3 and set φ to be a 3x2 linear mapping with parameters θ as a 3x2 matrix. For this, we use for ĥ a small, single-layer 3x3 convolutional network, that takes as inputs a differentiable 6x6 histogram over the 2D projections. |
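
The evaluation protocol quoted in the Dataset Splits row (ten random data samples of size 1,000, each with an 80-20 train-test split) could be sketched roughly as follows. This is an illustration only, not the authors' code: the function name `sample_and_split`, the seed, and the pool size `n_total` are hypothetical, and the paper excerpt does not say whether the ten samples may overlap, so the sketch simply draws each one independently.

```python
import random


def sample_and_split(n_total, sample_size=1000, train_frac=0.8,
                     n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) index lists for repeated random subsamples.

    n_total is the number of available rows (hypothetical here); the seed
    and the independence of the repeats are assumptions, not paper details.
    """
    rng = random.Random(seed)
    n_train = int(round(train_frac * sample_size))
    for _ in range(n_repeats):
        # Draw sample_size distinct row indices; random.sample returns them
        # in random order, so slicing gives a random train/test partition.
        idx = rng.sample(range(n_total), sample_size)
        yield idx[:n_train], idx[n_train:]


# Hypothetical pool of 50,000 resolved loans.
splits = list(sample_and_split(n_total=50_000))
```

Metrics would then be computed on each test split and averaged over the ten repeats; no validation split is created, which matches the "No" result in the table.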