Versatile Energy-Based Probabilistic Models for High Energy Physics

Authors: Taoli Cheng, Aaron C. Courville

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that EBMs are able to generate realistic event patterns and can be used as generic anomaly detectors free from spurious correlations. We also explore EBM-based hybrid modeling combining generative and discriminative models for HEP events. This unified learning framework paves the way for future-generation event simulators and automated new physics search strategies. It opens a door for combining different methods and components towards a powerful multi-tasking engine for scientific discovery at the Large Hadron Collider.
Researcher Affiliation | Academia | Taoli Cheng, Mila, University of Montreal (chengtaoli.1990@gmail.com); Aaron Courville, Mila, University of Montreal (aaron.courville@umontreal.ca)
Pseudocode | Yes | Algorithm 1: EBM training with MCMC by Langevin Dynamics
Open Source Code | Yes | To ensure reproducibility and encourage open-source practice in the HEP community, we release the code implementation at https://github.com/taolicheng/EBM-HEP.
Open Datasets | Yes | For the standard EBM, we train on 300,000 simulated QCD jets. ... QCD jets are extracted from QCD di-jet events that are generated with MadGraph [4] for LHC 13 TeV, followed by Pythia8 [61] and Delphes [18] for parton shower and fast detector simulation. ... The training sets and test sets are accessible at [47, 14].
Dataset Splits | No | The paper does not explicitly describe a validation dataset split (e.g., 80/10/10) for hyperparameter tuning or model selection. It mentions "validation steps" for MCMC, but this refers to a stage of the sampling procedure rather than a distinct data split.
Hardware Specification | No | The paper does not specify the hardware used for running experiments (e.g., specific GPU models, CPU types, or cloud computing instances with detailed specifications).
Software Dependencies | No | The paper mentions software tools such as Adam [44], Pythia8 [61], MadGraph [4], Delphes [18], and ParticleNet [59] but does not provide specific version numbers for these or other key software components (e.g., Python, PyTorch, or TensorFlow versions) needed to reproduce its setup.
Experiment Setup | Yes | The training set consists of 300,000 QCD jets. ... We use a relatively small number of steps (e.g., 24) for the MCMC chains. The step size λx is set to 0.1 ... The diffusion magnitude within the Langevin dynamics is set to 0.005. The number of steps used in validation steps is set to 128 ... We use Adam [44] for optimization, with the momenta β1 = 0.0 and β2 = 0.999. The initial learning rate is set to 1e-4, with a decay rate of 0.98 for each epoch. We use a batch size of 128, and train the model for 50 epochs. More details can be found in Appendix A (Table 4 in Appendix A lists further hyper-parameters).
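To make the reported sampling hyper-parameters concrete, here is a minimal, hedged sketch of the short-run Langevin MCMC loop behind Algorithm 1. The quadratic toy energy and all function names are illustrative stand-ins (the paper trains a neural-network energy over jet constituents); only the step count (24), step size (λx = 0.1), diffusion magnitude (0.005), and batch size (128) are taken from the Experiment Setup row.

```python
import numpy as np

def energy(x):
    """Toy energy E(x) = ||x||^2 / 2 (illustrative stand-in for the learned EBM)."""
    return 0.5 * np.sum(x ** 2, axis=-1)

def grad_energy(x):
    """Gradient of the toy energy; for the quadratic E this is simply x."""
    return x

def langevin_sample(x, n_steps=24, step_size=0.1, noise=0.005, rng=None):
    """Short-run Langevin dynamics: x <- x - λx ∇E(x) + ε N(0, I).

    Defaults follow the hyper-parameters quoted above: 24 MCMC steps,
    step size λx = 0.1, and diffusion magnitude 0.005.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    for _ in range(n_steps):
        x = x - step_size * grad_energy(x) + noise * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
init = rng.standard_normal((128, 4)) * 3.0   # batch of 128, as in the paper
samples = langevin_sample(init, rng=rng)

# In EBM training these samples supply the negative phase of the loss:
#   loss = energy(x_data).mean() - energy(samples).mean()
print(energy(samples).mean() < energy(init).mean())  # → True: sampler lowers energy
```

The negative-phase loss commented at the end is the standard contrastive objective for EBMs; the paper's exact loss and regularization terms are given in its Appendix A.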