Lorentz Group Equivariant Neural Network for Particle Physics
Authors: Alexander Bogatskiy, Brandon Anderson, Jan Offermann, Marwah Roussi, David Miller, Risi Kondor
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The competitive performance of the network is demonstrated on a public classification dataset (Kasieczka et al., 2019) for tagging top quark decays given energy-momenta of jet constituents produced in proton-proton collisions. |
| Researcher Affiliation | Collaboration | 1Department of Physics, University of Chicago, Chicago, IL, U.S.A. 2Department of Computer Science, University of Chicago, Chicago, IL, U.S.A. 3Atomwise, San Francisco, CA, U.S.A. 4Enrico Fermi Institute, Chicago, IL, U.S.A. 5Department of Statistics, University of Chicago, Chicago, IL, U.S.A. 6Flatiron Institute, Simons Foundation, New York, NY, U.S.A. |
| Pseudocode | No | The paper describes the architecture and its components but does not provide structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not provide a statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We perform top tagging classification experiments using the LGN architecture and the publicly available reference dataset (Kasieczka et al., 2019). This dataset contains 1.2M training entries, 400k validation entries and 400k testing entries. URL https://zenodo.org/record/2603256. (A loading sketch follows this table.) |
| Dataset Splits | Yes | This dataset contains 1.2M training entries, 400k validation entries and 400k testing entries. |
| Hardware Specification | Yes | The architecture was coded up using PyTorch and trained on two clusters with GeForce RTX 2080 GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with their versions. |
| Experiment Setup | Yes | For training, we performed a manual grid search. The main parameters are the number of CG layers, the highest irrep kept after each tensor product, and the numbers of channels. For top tagging, we found it sufficient to keep T(k,n) with k, n ≤ 2, which means that the highest irrep is the 9-dimensional T(2,2) and the remaining irreps are T(0,0), T(2,0), T(0,2), and T(1,1). There were 3 CG layers, and the numbers of channels were chosen as N_ch^(0) = 2, N_ch^(1) = 3, N_ch^(2) = 4, N_ch^(3) = 3. ... The MLP layer after the p-th CG layer had 3 hidden layers of width 2 N_ch^(p) each and used the leaky ReLU activation function. The scalar function f in Eq. (2) was a learnable linear combination of 10 basis Lorentzian bell-shaped curves a + b/(1 + c²x²) with learnable parameters a, b, c (each taking 10 values). The input 4-momenta were scaled by a factor of 0.005 to ensure that the mean values of the components of all activations would be of order 1. All weights were initialized from the standard Gaussian distribution. (A configuration sketch follows this table.) |
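
For the Open Datasets row above: a minimal loading sketch for the Zenodo top-tagging splits, assuming the dataset's published layout (HDF5 files `train.h5`/`val.h5`/`test.h5`, pandas key `"table"`, per-constituent columns `E_i`, `PX_i`, `PY_i`, `PZ_i`, and an `is_signal_new` label). None of this code appears in the paper; adjust names if your copy of the dataset differs.

```python
import pandas as pd

def load_split(path, max_constituents=200):
    """Read one split; return 4-momenta of shape (N, max_constituents, 4) and labels."""
    df = pd.read_hdf(path, key="table")  # assumed HDF5 key, per the dataset's layout
    cols = [f"{c}_{i}" for i in range(max_constituents) for c in ("E", "PX", "PY", "PZ")]
    four_momenta = df[cols].to_numpy().reshape(-1, max_constituents, 4)
    labels = df["is_signal_new"].to_numpy()  # 1 = top jet, 0 = QCD background
    return four_momenta, labels

# Split sizes quoted above: 1.2M train, 400k validation, 400k test.
train_p4, train_y = load_split("train.h5")
val_p4, val_y = load_split("val.h5")
test_p4, test_y = load_split("test.h5")
```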
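
And for the Experiment Setup row: a minimal PyTorch sketch of the learnable scalar function f described there, modeled as a sum of 10 Lorentzian bell-shaped curves a + b/(1 + c²x²) with learnable a, b, c and standard-Gaussian initialization. This is an illustrative reconstruction from the quoted description, not the authors' released code (the paper releases none); the class name, tensor shapes, and usage are assumptions.

```python
import torch
import torch.nn as nn

class LorentzianBasis(nn.Module):
    """Learnable linear combination of Lorentzian bell-shaped curves (hypothetical name)."""

    def __init__(self, num_basis=10):
        super().__init__()
        # Per the quoted setup, all weights start from the standard Gaussian.
        self.a = nn.Parameter(torch.randn(num_basis))
        self.b = nn.Parameter(torch.randn(num_basis))
        self.c = nn.Parameter(torch.randn(num_basis))

    def forward(self, x):
        # x: any-shape tensor of Lorentz invariants (e.g. pairwise dot products),
        # pre-scaled by 0.005 as in the paper so activations stay of order 1.
        x = x.unsqueeze(-1)                                   # (..., 1)
        curves = self.a + self.b / (1.0 + (self.c * x) ** 2)  # (..., num_basis)
        return curves.sum(dim=-1)                             # (...,)

f = LorentzianBasis()
invariants = 0.005 * torch.randn(32, 16)  # hypothetical batch of scaled invariants
out = f(invariants)                       # shape (32, 16)
```

Summing over the basis makes the `b` entries act as the linear-combination coefficients and the `a` entries as learnable offsets, which matches the quoted "linear combination of 10 basis Lorentzian bell-shaped curves" under these assumptions.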