Game Solving with Online Fine-Tuning
Authors: Ti-Rong Wu, Hung Guei, Ting Han Wei, Chung-Chin Shih, Jui-Te Chin, I-Chen Wu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that using online fine-tuning can solve a series of challenging 7x7 Killall-Go problems, using only 23.54% of the computation time required by the baseline without online fine-tuning. |
| Researcher Affiliation | Academia | 1Institute of Information Science, Academia Sinica, Taiwan 2Department of Computing Science, University of Alberta, Canada 3Department of Computer Science, National Yang Ming Chiao Tung University, Taiwan 4Research Center for Information Technology Innovation, Academia Sinica, Taiwan |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our code is available at https://rlg.iis.sinica.edu.tw/papers/neurips2023-online-fine-tuning-solver. |
| Open Datasets | No | The paper describes using '16 challenging 7x7 Killall-Go three-move openings' selected based on expert recommendations and PCN policy head output, but does not provide a link, DOI, or a formal citation for public access to this specific set of openings as a dataset. |
| Dataset Splits | No | The paper does not provide specific training, validation, and test dataset splits (e.g., percentages, sample counts, or references to predefined splits) to reproduce data partitioning. |
| Hardware Specification | Yes | All experiments are conducted on three machines, each equipped with two Intel Xeon E5-2678 v3 CPUs, 192 GB of RAM, and four GTX 1080Ti GPUs. |
| Software Dependencies | No | The paper mentions using the Alpha Zero training framework, the Gumbel Alpha Zero algorithm, and PCN, but does not provide version numbers for these or for supporting software such as programming languages or libraries. |
| Experiment Setup | Yes | During optimization, the learning rate is fixed at 0.02, and the batch size is set to 1,024. The PCN is optimized for 500 steps for every 2,000 self-play games. |
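The optimization schedule reported above (fixed learning rate 0.02, batch size 1,024, and 500 optimization steps per 2,000 self-play games) can be sketched as a small configuration snippet. This is an illustrative reconstruction only; the constant names and the helper function are hypothetical and do not come from the paper's released code.

```python
# Hypothetical sketch of the reported fine-tuning schedule.
# All names here are illustrative, not taken from the authors' code.

LEARNING_RATE = 0.02      # fixed during optimization
BATCH_SIZE = 1024         # training batch size
STEPS_PER_ROUND = 500     # optimization steps per round
GAMES_PER_ROUND = 2000    # self-play games per round

def optimization_steps(total_self_play_games: int) -> int:
    """Total optimizer steps implied by the schedule:
    500 steps for every completed block of 2,000 self-play games."""
    completed_rounds = total_self_play_games // GAMES_PER_ROUND
    return completed_rounds * STEPS_PER_ROUND

# e.g. 10,000 self-play games -> 5 rounds -> 2,500 optimization steps
```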