Searching for Higgs Boson Decay Modes with Deep Learning
Authors: Peter J. Sadowski, Daniel Whiteson, Pierre Baldi
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we train artificial neural networks to detect the decay of the Higgs boson to tau leptons on a dataset of 82 million simulated collision events. We demonstrate that deep neural network architectures are particularly well-suited for this task with the ability to automatically discover high-level features from the data and increase discovery significance. |
| Researcher Affiliation | Academia | Peter Sadowski, Department of Computer Science, University of California, Irvine, Irvine, CA 92617, peter.j.sadowski@uci.edu; Pierre Baldi, Department of Computer Science, University of California, Irvine, Irvine, CA 92617, pfbaldi@ics.uci.edu; Daniel Whiteson, Department of Physics and Astronomy, University of California, Irvine, Irvine, CA 92617, daniel@uci.edu |
| Pseudocode | No | The paper describes methods in text but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or links to a code repository. |
| Open Datasets | No | The paper states it uses 'a dataset of 82 million simulated collision events' and references 'simulated collisions from sophisticated Monte Carlo programs [4, 5, 6]', but it does not provide concrete access information (link, DOI, repository, or formal citation for a specific public dataset instance) for this simulated data. |
| Dataset Splits | Yes | A validation set of 1 million examples was randomly set aside for tuning the hyperparameters. |
| Hardware Specification | Yes | Computations were performed using machines with 16 Intel Xeon cores, an NVIDIA Tesla C2070 graphics processor, and 64 GB memory. |
| Software Dependencies | No | Training was performed using the Theano and Pylearn2 software libraries [9, 10]. However, specific version numbers for these libraries are not provided. |
| Experiment Setup | Yes | The tanh activation function was used for all hidden units, while the logistic function was used for the output. Weights were initialized from a normal distribution with zero mean and standard deviation 0.1 in the first layer, 0.001 in the output layer, and 1/sqrt(k) for all other hidden layers, where k was the number of units in the previous layer. Gradient computations were made on mini-batches of size 100. A momentum term increased linearly over the first 25 epochs from 0.5 to 0.99, then remained constant. The learning rate decayed by a factor of 1.0000002 every batch update until it reached a minimum of 10^-6. All networks were trained for 50 epochs. |
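
The experiment-setup row above specifies the initialization and optimization schedules in enough detail to translate into code. The sketch below is a minimal PyTorch reconstruction of that configuration, not the authors' original Theano/Pylearn2 implementation; the layer widths, input feature count, starting learning rate, zero bias initialization, and the synthetic stand-in data are assumptions made only to keep the example self-contained and runnable.

```python
import math
import torch
from torch import nn

# Network per the quoted setup: tanh hidden units, logistic (sigmoid) output.
def build_network(n_inputs, hidden_sizes):
    layers, sizes = [], [n_inputs] + hidden_sizes
    for i in range(len(hidden_sizes)):
        layers += [nn.Linear(sizes[i], sizes[i + 1]), nn.Tanh()]
    layers += [nn.Linear(sizes[-1], 1), nn.Sigmoid()]
    net = nn.Sequential(*layers)

    # Weight init as described: std 0.1 in the first layer, 0.001 in the
    # output layer, and 1/sqrt(k) elsewhere (k = units in the previous layer).
    linears = [m for m in net if isinstance(m, nn.Linear)]
    for idx, layer in enumerate(linears):
        if idx == 0:
            std = 0.1
        elif idx == len(linears) - 1:
            std = 0.001
        else:
            std = 1.0 / math.sqrt(layer.in_features)
        nn.init.normal_(layer.weight, mean=0.0, std=std)
        nn.init.zeros_(layer.bias)  # bias init is not specified in the paper; zero is an assumption
    return net

def momentum_at_epoch(epoch, ramp_epochs=25, start=0.5, end=0.99):
    """Momentum ramps linearly from 0.5 to 0.99 over the first 25 epochs, then stays constant."""
    if epoch >= ramp_epochs:
        return end
    return start + (end - start) * epoch / ramp_epochs

def decay_lr(lr, factor=1.0000002, minimum=1e-6):
    """Divide the learning rate by the decay factor on every batch update, floored at 1e-6."""
    return max(lr / factor, minimum)

def train(net, X, y, epochs=50, batch_size=100, init_lr=0.05):
    # init_lr = 0.05 is an assumption; the quoted setup does not state the starting learning rate.
    opt = torch.optim.SGD(net.parameters(), lr=init_lr, momentum=momentum_at_epoch(0))
    loss_fn = nn.BCELoss()
    lr = init_lr
    for epoch in range(epochs):
        momentum = momentum_at_epoch(epoch)
        for group in opt.param_groups:
            group["momentum"] = momentum
        perm = torch.randperm(len(X))
        for start in range(0, len(X), batch_size):  # mini-batches of size 100
            idx = perm[start:start + batch_size]
            opt.zero_grad()
            loss = loss_fn(net(X[idx]).squeeze(1), y[idx])
            loss.backward()
            opt.step()
            lr = decay_lr(lr)  # per-batch decay
            for group in opt.param_groups:
                group["lr"] = lr

if __name__ == "__main__":
    # Tiny synthetic stand-in for the 82M simulated events; the held-out tail plays
    # the role of the paper's 1M-event validation split used for hyperparameter tuning.
    X = torch.randn(2000, 25)
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).float()
    X_train, y_train, X_val, y_val = X[:1800], y[:1800], X[1800:], y[1800:]
    net = build_network(n_inputs=25, hidden_sizes=[300, 300, 300])  # widths are illustrative only
    train(net, X_train, y_train, epochs=5)  # the paper trained for 50 epochs; 5 keeps the demo quick
    with torch.no_grad():
        val_acc = ((net(X_val).squeeze(1) > 0.5).float() == y_val).float().mean().item()
    print(f"validation accuracy: {val_acc:.3f}")
```

One reading note: "decayed by a factor of 1.0000002 every batch update" is interpreted here as dividing the learning rate by that factor on each mini-batch, which only reaches the 10^-6 floor after tens of millions of updates, roughly consistent with mini-batches of 100 drawn from tens of millions of training events over 50 epochs.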