Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multilevel Generative Samplers for Investigating Critical Phenomena

Authors: Ankur Singha, Elia Cellini, Kim A. Nicoli, Karl Jansen, Stefan Kühn, Shinichi Nakajima

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that the effective sample size of RiGCS is a few orders of magnitude higher than that of state-of-the-art generative model baselines in sampling configurations for 128×128 two-dimensional Ising systems.
Researcher Affiliation | Academia | 1 BIFOLD, Germany; 2 Technische Universität Berlin, Germany; 3 Università degli Studi di Torino, Italy; 4 INFN Torino, Italy; 5 University of Bonn, Germany; 6 Helmholtz Institute for Radiation and Nuclear Physics (HISKP); 7 Deutsches Elektronen-Synchrotron (DESY), Germany; 8 RIKEN Center for AIP, Japan
Pseudocode | Yes | The pseudocode provided in Algorithm 1 and Algorithm 2 describes the practical steps for training RiGCS and for sampling from a trained RiGCS, respectively.
Open Source Code | Yes | The code is available at https://github.com/mlneuralsampler/multilevel.
Open Datasets | No | The paper uses the two-dimensional Ising model, a theoretical model rather than a publicly available dataset in the conventional sense; configurations are generated through simulation rather than loaded from a pre-existing data source.
Dataset Splits | No | The paper evaluates a simulated physical system (the Ising model) and does not use pre-split training, validation, or test datasets in the traditional machine-learning sense.
Hardware Specification | Yes | For all models (RiGCS and the baselines), we used a single NVIDIA A100 GPU with 80 GB of memory.
Software Dependencies | No | The paper mentions using the Adam optimizer and the PixelCNN architecture but does not provide version numbers for software libraries or frameworks (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | We trained VANs for 50,000 gradient updates (steps) with batch size 100, and HANs for 100,000 gradient updates with batch size 1000. For RiGCS, training is performed for a total of 3000 steps for each sequential (upscaled) target lattice. When training on a target lattice with N_L = N, the pretraining phase trains the coarser levels as follows: 2000 steps for level L−2, 1500 steps for level L−4, and 1000 steps for all earlier levels, except the coarsest, which is always trained for 500 steps.
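The step schedule quoted in the Experiment Setup row can be sketched as a small helper. This is an illustrative reading of the reported numbers only: the function name, the indexing convention (level 0 as the coarsest lattice, level L as the target), and the assumption that intermediate levels not explicitly listed receive 1000 steps are ours, not the authors'.

```python
def pretraining_steps(level: int, L: int) -> int:
    """Illustrative RiGCS-style step schedule from the reported setup.

    Level L is the target lattice (3000 steps). Coarser levels get
    2000 (level L-2), 1500 (level L-4), 1000 (other intermediate
    levels), and 500 for the coarsest level (assumed to be level 0).
    """
    if level == L:
        return 3000       # target lattice
    if level == 0:
        return 500        # coarsest level, always 500 steps
    if level == L - 2:
        return 2000
    if level == L - 4:
        return 1500
    return 1000           # all other intermediate levels

# Example: target level L = 8, levels coarsened in steps of 2.
print([pretraining_steps(l, 8) for l in range(0, 9, 2)])
# [500, 1000, 1500, 2000, 3000]
```

Summing the schedule over all levels gives the total pretraining budget for one target lattice, which may be useful when comparing compute against the VAN and HAN baselines quoted above.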