Detecting Individual Decision-Making Style: Exploring Behavioral Stylometry in Chess

Authors: Reid McIlroy-Young, Russell Wang, Siddhartha Sen, Jon Kleinberg, Ashton Anderson

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present a transformer-based approach to behavioral stylometry in the context of chess, where one attempts to identify the player who played a set of games. Our method operates in a few-shot classification framework, and can correctly identify a player from among thousands of candidate players with 98% accuracy given only 100 labeled games."
Researcher Affiliation | Collaboration | Reid McIlroy-Young (University of Toronto, reidmcy@cs.toronto.edu); Russell Wang (University of Toronto, russell@cs.toronto.edu); Siddhartha Sen (Microsoft Research, sidsen@microsoft.com); Jon Kleinberg (Cornell University, kleinberg@cornell.edu); Ashton Anderson (University of Toronto, ashton@cs.toronto.edu)
Pseudocode | No | The paper describes its methodology in detail through text and diagrams (Figure 1, Figure 2, Figure 3), but it does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | "Our supplement contains the code; the model files are too large for the supplement, so they will be released when the paper goes public."
Open Datasets | Yes | "We use chess games downloaded from Lichess, a popular open-source chess platform. Their game database contains over two billion games played by players ranging from beginners to the current world champion, Magnus Carlsen."
Dataset Splits | Yes | "Finally, each player's games are randomly split into training games (80% of their games), reference games (10%), and query games (10%)."
Hardware Specification | Yes | "The entire model was trained with 4 Tesla K80 GPUs."
Software Dependencies | No | The paper mentions using a transformer and specific architectures (e.g., Vision Transformer), but it does not provide version numbers for any software dependencies such as Python, PyTorch, TensorFlow, or other libraries.
Experiment Setup | Yes | "We randomly sample N × M games as a batch, where N = 40 is the number of players and M = 20 is the number of games per player. In order to speed up training and perform batch computing, we randomly sample a 32-move sequence from each game and pad 0s to games with length less than 32 moves. The entire model was trained with 4 Tesla K80 GPUs and SGD optimizer with an initial learning rate of 0.01 and momentum of 0.9. The learning rate was reduced by half every 40K steps. Initial values of w and b for the similarity matrix were chosen to be (w, b) = (10, -5), and we used a smaller gradient scale of 0.01 to match the original GE2E model."
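The per-player 80/10/10 split quoted in the Dataset Splits row can be sketched as follows. This is a minimal illustration, not the authors' code; the helper name `split_player_games` and the fixed-seed shuffling are assumptions.

```python
import random

def split_player_games(games, seed=0):
    """Randomly split one player's games into 80% training,
    10% reference, and 10% query subsets (hypothetical helper)."""
    rng = random.Random(seed)
    games = list(games)
    rng.shuffle(games)
    n_train = int(0.8 * len(games))
    n_ref = int(0.1 * len(games))
    train = games[:n_train]
    reference = games[n_train:n_train + n_ref]
    query = games[n_train + n_ref:]
    return train, reference, query

# With 100 games per player: 80 train, 10 reference, 10 query.
train, reference, query = split_player_games(range(100))
```

Any remainder from the integer rounding falls into the query set, so the three subsets always cover every game exactly once.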
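The batch construction in the Experiment Setup row (N = 40 players × M = 20 games, each reduced to a 32-move sequence with zero-padding) can be sketched like this. The function `sample_batch` and the games-as-token-lists representation are assumptions for illustration, not the authors' implementation.

```python
import random

N_PLAYERS = 40        # N: players sampled per batch
GAMES_PER_PLAYER = 20 # M: games sampled per player
SEQ_LEN = 32          # moves kept per game

def sample_batch(games_by_player, rng):
    """Build one N*M batch of fixed-length move sequences.

    games_by_player maps a player id to a list of games, each game a
    list of integer move tokens. A random 32-move window is taken from
    games longer than 32 moves; shorter games are padded with 0s.
    """
    players = rng.sample(list(games_by_player), N_PLAYERS)
    batch = []
    for player in players:
        for game in rng.sample(games_by_player[player], GAMES_PER_PLAYER):
            if len(game) > SEQ_LEN:
                start = rng.randrange(len(game) - SEQ_LEN + 1)
                seq = game[start:start + SEQ_LEN]
            else:
                seq = game + [0] * (SEQ_LEN - len(game))
            batch.append(seq)
    return batch  # N * M rows, each exactly SEQ_LEN tokens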
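The "similarity matrix" with learnable scalars w and b in the Experiment Setup row refers to the GE2E-style scaled cosine similarity, s = w · cos(e, c) + b, between a game embedding e and a player centroid c. A minimal sketch, with plain-Python vectors in place of the model's tensors:

```python
import math

def ge2e_similarity(embedding, centroid, w, b):
    """GE2E-style scaled cosine similarity: s = w * cos(e, c) + b.
    In training, w and b are learnable scalars (updated with a reduced
    gradient scale, per the setup quote); here they are plain arguments."""
    dot = sum(x * y for x, y in zip(embedding, centroid))
    norm_e = math.sqrt(sum(x * x for x in embedding))
    norm_c = math.sqrt(sum(x * x for x in centroid))
    return w * (dot / (norm_e * norm_c)) + b
```

With w = 1 and b = 0 this reduces to ordinary cosine similarity; the learnable scaling lets the model sharpen or soften the softmax over candidate players during training.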