Generating Correct Answers for Progressive Matrices Intelligence Tests
Authors: Niv Pekar, Yaniv Benny, Lior Wolf
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our algorithm is able not only to generate a set of plausible answers, but also to be competitive with state-of-the-art methods in multiple-choice tests. The experiments were conducted on the two regimes of the PGM dataset [10], neutral and interpolation, as well as on the recently proposed RAVEN-FAIR dataset [1]. In order to evaluate the generated results, two different approaches were used: machine evaluation (using other recognition models) and human evaluation. |
| Researcher Affiliation | Collaboration | Niv Pekar (Tel Aviv University, nivpekar@mail.tau.ac.il); Yaniv Benny (Tel Aviv University, yanivbenny@mail.tau.ac.il); Lior Wolf (Facebook AI Research and Tel Aviv University) |
| Pseudocode | No | The paper describes the model architecture and equations but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for their methodology is open-source or publicly available. |
| Open Datasets | Yes | In this work, we utilize two datasets: (i) the Procedurally Generated Matrices (PGM) [10] dataset... and (ii) the recently proposed RAVEN-FAIR [1] dataset |
| Dataset Splits | No | The paper states only 'We train on the train set and evaluate on the test set for all datasets' and does not report validation splits or split sizes. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions 'The Adam optimizer is used' but does not specify version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | The Adam optimizer is used with a learning rate of 10⁻⁴. The margin hyper-parameter α (for the contrastive loss) is updated every 1000 iterations to be the mean measured distance between the choice images and the generated ones. The contrastive loss with respect to the target choice image was multiplied by 3×10⁻³, and the contrastive loss with respect to the negative choice image was multiplied by 10⁻⁴. The VAE losses were multiplied by 0.1 (with β of 4), and the auxiliary C_m loss was multiplied by 10. All other losses were not weighted. The CEN was trained for 5 epochs for the recognition pathway only, after which all subnetworks were trained for ten additional epochs. (See the sketch below the table.) |
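
The Experiment Setup row maps directly onto a loss-weighting and margin-update schedule. Below is a minimal, runnable sketch of that schedule, assuming a standard margin-based contrastive loss; the `model`, the random data, and the zeroed VAE/auxiliary terms are placeholders, since the paper does not release code and this is not the authors' implementation.

```python
import torch
import torch.nn.functional as F

# Loss weights quoted in the Experiment Setup row
W_POS = 3e-3   # contrastive loss w.r.t. the target choice image
W_NEG = 1e-4   # contrastive loss w.r.t. the negative choice image
W_VAE = 0.1    # VAE losses, with beta = 4
BETA = 4.0
W_CM = 10.0    # auxiliary C_m loss

model = torch.nn.Linear(64, 64)  # placeholder for the paper's CEN generator
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
alpha = 1.0                      # contrastive margin, re-estimated every 1000 iterations

dists = []  # distances collected for the alpha update
for step in range(3000):         # stand-in loop over random data
    context = torch.randn(8, 64)
    target = torch.randn(8, 64)
    negative = torch.randn(8, 64)

    generated = model(context)
    d_pos = F.pairwise_distance(generated, target)
    d_neg = F.pairwise_distance(generated, negative)
    dists += [d_pos.detach(), d_neg.detach()]

    # One common contrastive formulation (assumed): pull the generation toward
    # the target choice, push it at least alpha away from the negative choice.
    loss_pos = d_pos.mean()
    loss_neg = F.relu(alpha - d_neg).mean()

    # Placeholder VAE and auxiliary terms (zero here; the real model supplies them).
    recon, kl, aux_cm = torch.zeros(3)

    loss = (W_POS * loss_pos + W_NEG * loss_neg
            + W_VAE * (recon + BETA * kl) + W_CM * aux_cm)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (step + 1) % 1000 == 0:
        # alpha <- mean measured distance between choice images and generated ones
        alpha = torch.cat(dists).mean().item()
        dists = []
```

The two-phase schedule reported in the paper (5 epochs for the recognition pathway only, then 10 epochs for all subnetworks) would sit around this loop, e.g. by freezing the generative subnetworks for the first phase.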