Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models
Authors: Chengshuai Shi, Kun Yang, Jing Yang, Cong Shen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, experiments are also performed to practically test the ICGP capabilities of the pretrained transformers. The obtained results not only corroborate the derived theoretical claims, but also empirically motivate this and further studies on the interesting direction of pre-trained models in game-theoretic settings. |
| Researcher Affiliation | Academia | Chengshuai Shi University of Virginia cs7ync@virginia.edu Kun Yang University of Virginia ky9tc@virginia.edu Jing Yang The Pennsylvania State University yangjing@psu.edu Cong Shen University of Virginia cong@virginia.edu |
| Pseudocode | Yes | Algorithm 1 VI-ULCB; Algorithm 2 MWU; Algorithm 3 V-learning [32]; Algorithm 4 V-learning Executing Output Policy µ̂ [32] (a generic MWU sketch is given below the table) |
| Open Source Code | Yes | The experimental codes are available at https://github.com/ShenGroup/ICGP. |
| Open Datasets | No | The paper describes generating its own experimental data: 'At the start of each game (during both training and inference), an A × B reward matrix R_h(s, ·, ·) is generated for each step h ∈ [H] and state s ∈ S with its elements independently sampled from a standard Gaussian distribution truncated on [0, 1].' (see the sampling sketch below the table) |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits. It mentions 'pre-trained with N = 10 games' and 'pre-trained with N = 20 games' and evaluates on 'inference tasks', but no specific train/validation/test splits are detailed. |
| Hardware Specification | Yes | For the training procedure, we use one Nvidia 6000 Ada to train the transformer with a batch size of 32, trained for 100 epochs, and we set the learning rate as 5 × 10⁻⁴. |
| Software Dependencies | No | The paper states 'The transformer architecture employed in our experiments is primarily based on the well-known GPT-2 model [43], and our implementation follows the mini GPT realization for simplicity.' It does not specify version numbers for GPT-2 or other software dependencies like Python or PyTorch. |
| Experiment Setup | Yes | For the training procedure, we use one Nvidia 6000 Ada to train the transformer with a batch size of 32, trained for 100 epochs, and we set the learning rate as 5 × 10⁻⁴. (see the training-loop sketch below the table) |
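
The Pseudocode row lists Algorithm 2 as MWU (Multiplicative Weights Update). Below is a minimal generic sketch of that standard update rule, not the paper's exact Algorithm 2: the learning rate `eta`, the loss-vector interface, and the toy feedback loop are illustrative assumptions.

```python
import numpy as np

def mwu_update(weights, losses, eta=0.1):
    """One Multiplicative Weights Update step (generic sketch).

    weights: current unnormalized weights over actions.
    losses:  observed per-action losses, assumed to lie in [0, 1].
    eta:     learning rate -- an illustrative choice, not the paper's value.
    """
    return weights * np.exp(-eta * losses)

def mwu_policy(weights):
    """Mixed strategy (probability distribution) induced by the weights."""
    return weights / weights.sum()

# Toy usage: 3 actions, a few rounds of random feedback.
rng = np.random.default_rng(0)
w = np.ones(3)
for _ in range(10):
    policy = mwu_policy(w)        # strategy played this round
    losses = rng.uniform(size=3)  # stand-in for game feedback
    w = mwu_update(w, losses)
print(mwu_policy(w))
```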
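
The data-generation step quoted in the Open Datasets row can be illustrated with a short sampling sketch. Only the "standard Gaussian truncated on [0, 1]" description comes from the paper; the sizes `H`, `S`, `A`, `B` and the use of `scipy.stats.truncnorm` are assumptions for illustration.

```python
import numpy as np
from scipy.stats import truncnorm

# Illustrative sizes -- the paper's actual H, |S|, |A|, |B| may differ.
H, S, A, B = 5, 4, 3, 3

# Standard Gaussian (loc=0, scale=1) truncated to [0, 1]; truncnorm takes the
# bounds in standardized units: (0 - loc) / scale and (1 - loc) / scale.
dist = truncnorm(a=0.0, b=1.0, loc=0.0, scale=1.0)

# R_h(s, ., .) for every step h and state s, entries sampled i.i.d.
rewards = dist.rvs(size=(H, S, A, B), random_state=0)

print(rewards.shape)                 # (H, S, A, B)
print(rewards.min(), rewards.max())  # all entries lie within [0, 1]
```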
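
The hyperparameters quoted in the Hardware Specification and Experiment Setup rows (batch size 32, 100 epochs, learning rate 5 × 10⁻⁴) can be collected into a minimal PyTorch training-loop sketch. The placeholder model, synthetic data, and Adam optimizer are assumptions; only the three hyperparameter values come from the paper.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters quoted in the paper.
BATCH_SIZE = 32
EPOCHS = 100
LEARNING_RATE = 5e-4

# Placeholder model and data -- the paper trains a GPT-2-style transformer on
# in-context game histories; this stand-in only shows the loop shape.
model = torch.nn.Linear(16, 4)
dataset = TensorDataset(torch.randn(256, 16), torch.randint(0, 4, (256,)))
loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

# Adam is an assumed optimizer choice; the quoted excerpt does not state one.
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
loss_fn = torch.nn.CrossEntropyLoss()
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

for epoch in range(EPOCHS):
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```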