Unveiling Concepts Learned by a World-Class Chess-Playing Agent

Authors: Aðalsteinn Pálsson, Yngvi Björnsson

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments need an external dataset to generate the concept probes. For that we use a dataset generated by Leela Chess Zero that is listed as a quality dataset (training_data at [Stockfish, 2022d]), from which we randomly sampled 100k positions.
Researcher Affiliation | Academia | Aðalsteinn Pálsson, Yngvi Björnsson; Department of Computer Science, Reykjavik University
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper mentions Stockfish is open-source and provides links to its general project pages (e.g., https://stockfishchess.org/ and https://github.com/glinscott/nnue-pytorch/blob/master/docs/nnue.md), but it does not explicitly state that the code for the interpretability methods developed or used in this specific paper is available or open-sourced.
Open Datasets | Yes | For that we use a dataset generated by Leela Chess Zero that is listed as a quality dataset (training_data at [Stockfish, 2022d]), from which we randomly sampled 100k positions. Cited reference: [Stockfish, 2022d] Stockfish. Training datasets. https://github.com/glinscott/nnue-pytorch/wiki/Training-datasets, 2022. Accessed: 2022-01-30.
Dataset Splits | Yes | The error bars of Figures 6 and 7 show the standard error of the mean of cross-validation results over five splits.
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, or memory specifications) used for running the experiments were provided in the paper.
Software Dependencies | No | The paper mentions 'version 14.1 of Stockfish' and 'ridge regression' but does not provide a comprehensive list of specific software dependencies with version numbers (e.g., Python, PyTorch, scikit-learn versions) required to reproduce the experiments.
Experiment Setup | Yes | For each probe, we perform a hyperparameter search over alpha values (the L2 term multiplier) of [0.01, 0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000].
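
The Dataset Splits and Experiment Setup rows together describe a ridge-regression probe tuned over a fixed alpha grid and scored by five-way cross-validation with a standard error of the mean. The sketch below illustrates that procedure using only the Python standard library. It is not the authors' code: the synthetic data (standing in for network activations and concept values), the mean-squared-error metric, the interleaved fold assignment, the omission of an intercept, and all function names are assumptions made for illustration. Only the alpha grid and the use of five splits come from the text above.

```python
import random
import statistics

# Alpha grid quoted in the Experiment Setup row.
ALPHAS = [0.01, 0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(xs, ys, alpha):
    """Closed-form ridge weights w = (X^T X + alpha I)^-1 X^T y (no intercept)."""
    d = len(xs[0])
    gram = [[sum(x[i] * x[j] for x in xs) + (alpha if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    return solve(gram, xty)


def mse(w, xs, ys):
    """Mean squared error of the linear probe w on (xs, ys)."""
    return statistics.fmean(
        (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2 for x, y in zip(xs, ys)
    )


def cross_validate(xs, ys, alpha, k=5):
    """Return (mean, standard error of the mean) of held-out MSE over k folds."""
    n = len(xs)
    scores = []
    for fold in range(k):
        test = list(range(fold, n, k))  # assumed interleaved fold assignment
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        w = ridge_fit([xs[i] for i in train], [ys[i] for i in train], alpha)
        scores.append(mse(w, [xs[i] for i in test], [ys[i] for i in test]))
    sem = statistics.stdev(scores) / k ** 0.5
    return statistics.fmean(scores), sem


def select_alpha(xs, ys, alphas=ALPHAS):
    """Cross-validate every alpha and return the one with the lowest mean MSE."""
    results = {alpha: cross_validate(xs, ys, alpha) for alpha in alphas}
    best = min(results, key=lambda alpha: results[alpha][0])
    return best, results


# Synthetic stand-in data: 200 samples, 3 features, small Gaussian noise.
rng = random.Random(0)
true_w = [2.0, -1.0, 0.5]
xs = [[rng.gauss(0, 1) for _ in true_w] for _ in range(200)]
ys = [sum(w * x for w, x in zip(true_w, row)) + rng.gauss(0, 0.1) for row in xs]

best_alpha, results = select_alpha(xs, ys)
mean_mse, sem = results[best_alpha]
print(f"best alpha={best_alpha}, MSE={mean_mse:.4f} +/- {sem:.4f}")
```

The second value returned by `cross_validate` is the quantity an error bar of the kind described for Figures 6 and 7 would show, under the assumptions stated above. In practice a library such as scikit-learn, with its `Ridge` estimator, would be the more natural choice; the paper does not say which implementation was used.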