Game Design for Eliciting Distinguishable Behavior
Authors: Fan Yang, Liu Leqi, Yifan Wu, Zachary Lipton, Pradeep K. Ravikumar, Tom M. Mitchell, William W. Cohen
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach empirically, showing that our designed games can successfully distinguish among players with different traits, outperforming manually-designed ones by a large margin. |
| Researcher Affiliation | Collaboration | 1 Carnegie Mellon University, 2 Google Inc. |
| Pseudocode | No | The paper does not include any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not contain any statement about making its source code publicly available, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper states, 'We simulate 1000 data instances for train, and 100 each for validation and test.' This indicates the data was simulated by the authors and no public dataset or access information is provided. |
| Dataset Splits | Yes | We simulate 1000 data instances for train, and 100 each for validation and test. We use a model similar to the one used for q(Z|X), except for the last layer, which now outputs a categorical distribution. The optimization is run for 20 epochs and five rounds with different random seeds. The validation set is used to select the test set performance. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions various algorithms and methods (e.g., Recurrent Neural Network, Gumbel-max trick, stochastic gradient descent, Gumbel-softmax trick, Adam) but does not list any specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | The optimization is run for 20 epochs and five rounds with different random seeds. The validation set is used to select the test set performance. ... In Table 3, classification accuracy on the test set is shown at different noise levels. We consider three designs here. As defined above, a baseline method which uses a manually designed reward in Path, a Path environment with learned reward, and a Grid environment with both learned reward and transition. ... The interaction model Ψ has noise parameter λ, which varies at 1, 1.5, and 2.5. |