Measuring Compositionality in Representation Learning

Authors: Jacob Andreas

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present experiments and analyses aimed at answering four questions about the relationship between compositionality and learning: (1) How does compositionality of representations evolve in relation to other measurable model properties over the course of the learning process? (Section 4) (2) How well does compositionality of representations track human judgments about the compositionality of model inputs? (Section 5) (3) How does compositionality constrain distances between representations, and how does TRE relate to other methods that analyze representations based on similarity? (Section 6) (4) Are compositional representations necessary for generalization to out-of-distribution inputs? (Section 7)
Researcher Affiliation | Academia | Jacob Andreas, Computer Science Division, University of California, Berkeley (jda@cs.berkeley.edu)
Pseudocode | No | The paper provides mathematical formulations and descriptions of procedures, such as the Tree Reconstruction Error (TRE) calculation, but it does not include a distinct block or figure explicitly labeled 'Pseudocode' or 'Algorithm'. (An illustrative TRE-style sketch is given after the table.)
Open Source Code | Yes | Code and data for all experiments in this paper are provided at https://github.com/jacobandreas/tre.
Open Datasets | Yes | Because our analysis focuses on compositional hypothesis classes, we use visual concepts from the Color MNIST dataset of Seo et al. (2017) (Figure 2). [...] We train embeddings for words and bigrams using the CBOW objective of Mikolov et al. (2013) using the implementation provided in fastText (Bojanowski et al., 2017) [...] Vectors are estimated from a 250M-word subset of the Gigaword dataset (Parker et al., 2011). (A hedged fastText training sketch is given after the table.)
Dataset Splits | Yes | The training dataset consists of 9000 image triplets, evenly balanced between positive and negative classes, with a validation set of 500 examples.
Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments, such as CPU/GPU models, memory, or cloud instance types; it only mentions that the models are CNNs and RNNs.
Software Dependencies | No | The paper mentions software tools like fastText and optimization algorithms like ADAM, but it does not provide specific version numbers for these or any other software libraries or environments, which are necessary for full reproducibility.
Experiment Setup | Yes | The model is trained using ADAM (Kingma & Ba, 2014) with a learning rate of 0.001 and a batch size of 128. Training is ended when the model stops improving on a held-out set. [...] We train embeddings for words and bigrams using the CBOW objective of Mikolov et al. (2013) using the implementation provided in fastText (Bojanowski et al., 2017) with 100-dimensional vectors and a context size of 5. [...] The encoder and decoder RNNs both use gated recurrent units (Cho et al., 2014) with embeddings and hidden states of size 256. The size of the discrete vocabulary is set to 16 and the maximum message length to 4. Training uses a policy gradient objective with a scalar baseline set to the running average reward; this is optimized using ADAM (Kingma & Ba, 2014) with a learning rate of 0.001 and a batch size of 256. Each model is trained for 500 steps.
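
On the Pseudocode row: the paper specifies TRE mathematically rather than as an algorithm block. The sketch below is a minimal, illustrative TRE-style computation, not the paper's released code; it assumes additive composition of primitive embeddings and an L2 distance, and the names (tre_scores, eta, derivations) are ours. Roughly following the paper's high-level description, the primitive embeddings are fit by gradient descent to minimize the total reconstruction error, and the per-example error serves as the compositionality score (lower means more nearly compositional).

    import torch

    def tre_scores(reps, derivations, n_primitives, steps=1000, lr=0.1):
        """Illustrative TRE-style scores.
        reps: (N, d) tensor of learned representations f(x_i).
        derivations: list of tuples of primitive ids composing each input.
        Returns the per-example reconstruction error after fitting primitive
        embeddings eta under additive composition (an assumption of this sketch)."""
        d = reps.shape[1]
        eta = torch.zeros(n_primitives, d, requires_grad=True)
        opt = torch.optim.Adam([eta], lr=lr)

        def reconstruct():
            # Compose each input's primitive embeddings by summation.
            return torch.stack([eta[list(p)].sum(dim=0) for p in derivations])

        for _ in range(steps):
            opt.zero_grad()
            err = ((reps - reconstruct()) ** 2).sum(dim=1)  # squared L2 per example
            err.sum().backward()
            opt.step()
        with torch.no_grad():
            return ((reps - reconstruct()) ** 2).sum(dim=1).sqrt()

For example, with derivations = [(0, 1), (0, 2)] describing two inputs built from three primitive concepts, smaller returned values indicate representations that are better approximated by summing primitive embeddings.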
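
On the Open Datasets and Experiment Setup rows: the 100-dimensional CBOW embeddings with a context size of 5 can be trained with fastText. The snippet below is a sketch using the fastText Python bindings; the input path is a placeholder for a preprocessed plain-text file containing the 250M-word Gigaword subset, which the paper does not distribute directly.

    import fasttext

    # Train CBOW embeddings with the hyperparameters quoted above:
    # 100-dimensional vectors and a context window of 5.
    # "gigaword_subset.txt" is a placeholder path (assumption of this sketch).
    model = fasttext.train_unsupervised(
        "gigaword_subset.txt",
        model="cbow",
        dim=100,
        ws=5,
    )
    model.save_model("cbow_vectors.bin")
    print(model.get_word_vector("red"))  # inspect a single word vector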
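
On the Experiment Setup row: the quoted communication-game configuration (GRU encoder and decoder, embeddings and hidden states of size 256, vocabulary of 16, maximum message length 4, Adam with learning rate 0.001 and batch size 256) could be instantiated as sketched below. This is a structural sketch under our own assumptions about module boundaries and input dimensionality, not the author's implementation; see the linked repository for the actual code.

    import torch
    import torch.nn as nn

    VOCAB_SIZE = 16   # discrete message vocabulary
    MAX_LEN = 4       # maximum message length
    HIDDEN = 256      # embedding and hidden-state size
    LR = 1e-3
    BATCH_SIZE = 256
    INPUT_DIM = 512   # speaker input size; an assumption, not stated in the paper

    class Speaker(nn.Module):
        """GRU decoder that emits up to MAX_LEN discrete symbols (structure only)."""
        def __init__(self):
            super().__init__()
            self.proj = nn.Linear(INPUT_DIM, HIDDEN)
            self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
            self.cell = nn.GRUCell(HIDDEN, HIDDEN)
            self.out = nn.Linear(HIDDEN, VOCAB_SIZE)

    class Listener(nn.Module):
        """GRU encoder that reads the discrete message (structure only)."""
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
            self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)

    speaker, listener = Speaker(), Listener()
    optimizer = torch.optim.Adam(
        list(speaker.parameters()) + list(listener.parameters()), lr=LR
    )
    # The training loop (omitted) would use a policy-gradient objective with a
    # scalar baseline set to the running-average reward, as quoted above.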