Game Design for Eliciting Distinguishable Behavior
Authors: Fan Yang, Liu Leqi, Yifan Wu, Zachary Lipton, Pradeep K. Ravikumar, Tom M. Mitchell, William W. Cohen
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach empirically, showing that our designed games can successfully distinguish among players with different traits, outperforming manually-designed ones by a large margin. |
| Researcher Affiliation | Collaboration | 1 Carnegie Mellon University, 2 Google Inc. |
| Pseudocode | No | The paper does not include any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not contain any statement about making its source code publicly available, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper states, 'We simulate 1000 data instances for train, and 100 each for validation and test.' This indicates the data was simulated by the authors and no public dataset or access information is provided. |
| Dataset Splits | Yes | We simulate 1000 data instances for train, and 100 each for validation and test. We use a model similar to the one used for q(Z|X), except for the last layer, which now outputs a categorical distribution. The optimization is run for 20 epochs and five rounds with different random seeds. The validation set is used to select the test set performance. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions various algorithms and methods (e.g., Recurrent Neural Network, Gumbel-max trick, stochastic gradient descent, Gumbel-softmax trick, Adam) but does not list any specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | The optimization is run for 20 epochs and five rounds with different random seeds. The validation set is used to select the test set performance. ... In Table 3, classification accuracy on the test set is shown at different noise levels. We consider three designs here. As defined above, a baseline method which uses a manually designed reward in Path, a Path environment with learned reward, and a Grid environment with both learned reward and transition. ... The interaction model Ψ has noise parameter λ, which varies at 1, 1.5, and 2.5. |