Systematic improvement of neural network quantum states using Lanczos

Authors: Hongwei Chen, Douglas Hendry, Phillip Weinberg, Adrian Feiguin

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate these ideas with an application to the J1-J2 Heisenberg model on the square lattice, a paradigmatic problem under debate in condensed matter physics, and achieve state-of-the-art accuracy in the representation of the ground state. In Sec. 4 we present results of state-of-the-art calculations for the J1-J2 Heisenberg model on the square lattice and compare to other numerical techniques.
Researcher Affiliation | Academia | (1) Department of Physics, Northeastern University, Boston, USA; (2) Stanford Institute for Materials and Energy Sciences, Stanford University, Stanford, USA; (3) Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, USA
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Source code will be available at: https://github.com/hwchen2017/Lanczos_Neural_Network_Quantum_State.
Open Datasets | No | The paper uses the J1-J2 Heisenberg model, which is a common benchmark in condensed matter physics, and compares results to exact diagonalization and QMC. While these are established models, the paper does not provide explicit access information (link, DOI, citation with author/year) to specific datasets used for training in the sense of a machine learning dataset. It describes the model and calculation, but not a public 'dataset' in a directly consumable format.
Dataset Splits | No | The paper states: 'For each training step, we collect 10000 samples to evaluate averages...' and 'As for evaluation, we collect 2 * 10^5 samples to calculate the average and statistical error.' However, it does not specify explicit dataset splits (e.g., percentages or counts) for distinct training, validation, and test sets. The sampling is part of the optimization process rather than a static data split.
Hardware Specification | Yes | All simulations are performed using Eigen and Intel MKL on Intel E5-2680v4 and AMD Rome 7702 CPU nodes.
Software Dependencies | No | The paper mentions 'Eigen and Intel MKL' as software used, but it does not provide specific version numbers for these libraries or any other software dependencies.
Experiment Setup | Yes | The parameters W in the RBM are initialized to random numbers drawn from a uniform distribution on [-0.01, 0.01] for both real and imaginary parts. Due to the large number of parameters and the numerical noise in sampling, we implement the conjugate gradient method to solve the system of equations, Eq.(12). To stabilize the method, we introduce a ridge parameter λ = 10^-6. For each training step, we collect 10000 samples to evaluate averages as mentioned in Sec. 3.2, including the variational energy and log derivatives. Since adjacent states in the Markov chain are highly correlated, the number of skipped states between samples Nskip is chosen according to the relation Nskip = 5 * (1/r), where r is the acceptance rate in the previous training step. The typical value of Nskip ranges from 30 to 100. For evaluation, we collect 2 * 10^5 samples to calculate the average and statistical error. The learning rate used in training ranges from 5 * 10^-4 to 3 * 10^-2. Once we observe that the variational energy is no longer decreasing, a smaller learning rate (half of the previous one) is used instead.
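
The following is a minimal, hypothetical Python sketch (not the authors' code) that ties together the hyper-parameters quoted above: the [-0.01, 0.01] uniform initialization of the complex RBM parameters, a ridge-regularized conjugate-gradient solve of the update equations (assumed here to be a stochastic-reconfiguration-style linear system S x = f, since Eq.(12) is not reproduced in this report), the Nskip = 5/r decorrelation rule, and the learning-rate halving schedule. Array sizes are reduced for illustration, and the helper names (n_skip, next_learning_rate) are invented for this sketch.

# Hypothetical illustration of the quoted training setup; sizes are reduced.
# The paper quotes 10^4 samples per training step, 2*10^5 evaluation samples,
# and learning rates between 5e-4 and 3e-2.
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(0)

n_params = 100      # illustrative number of RBM parameters W
n_samples = 1000    # illustrative samples per training step (paper: 10000)
ridge = 1e-6        # ridge parameter lambda from the paper
lr = 3e-2           # initial learning rate within the quoted range

# Complex RBM parameters, uniform in [-0.01, 0.01] for real and imaginary parts
W = (rng.uniform(-0.01, 0.01, n_params)
     + 1j * rng.uniform(-0.01, 0.01, n_params))

def n_skip(acceptance_rate: float) -> int:
    """Markov-chain decorrelation stride: Nskip = 5 * (1/r), typically 30-100."""
    return int(np.ceil(5.0 / acceptance_rate))

# Placeholder covariance matrix S and force vector f built from fake sampled
# log-derivatives; in the actual method these come from Monte Carlo estimates.
O = (rng.standard_normal((n_samples, n_params))
     + 1j * rng.standard_normal((n_samples, n_params))) / np.sqrt(n_samples)
S = O.conj().T @ O                      # Hermitian positive semi-definite
f = rng.standard_normal(n_params) + 1j * rng.standard_normal(n_params)

# Conjugate-gradient solve of the ridge-shifted system (S + lambda I) x = f
x, info = cg(S + ridge * np.eye(n_params), f, atol=1e-10)

W = W - lr * x                          # parameter update

def next_learning_rate(lr: float, energy_decreased: bool) -> float:
    """Halve the learning rate once the variational energy stops decreasing."""
    return lr if energy_decreased else 0.5 * lr

print(n_skip(0.1), next_learning_rate(lr, False))

As a usage note, with an acceptance rate r = 0.1 the sketch gives Nskip = 50, consistent with the quoted typical range of 30 to 100, and the learning rate drops to 1.5e-2 once the energy stops decreasing.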