Machine learning detects terminal singularities

Authors: Tom Coates, Alexander Kasprzyk, Sara Veneziale

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this paper we demonstrate that machine learning can be used to understand this classification. We focus on eight-dimensional positively-curved algebraic varieties that have toric symmetry and Picard rank two, and develop a neural network classifier that predicts with 95% accuracy whether or not such an algebraic variety is Q-Fano." "We trained a feed-forward neural network classifier on a balanced dataset of 5 million examples; these are eight-dimensional Q-factorial Fano toric varieties of Picard rank two, of which 2.5 million are terminal and 2.5 million non-terminal. Testing on a further balanced dataset of 5 million examples showed that the neural network classifies such toric varieties as terminal or non-terminal with an accuracy of 95%."
Researcher Affiliation | Academia | Tom Coates, Department of Mathematics, Imperial College London, 180 Queen's Gate, London, SW7 2AZ, UK (t.coates@imperial.ac.uk); Alexander M. Kasprzyk, School of Mathematical Sciences, University of Nottingham, Nottingham, NG7 2RD, UK (a.m.kasprzyk@nottingham.ac.uk); Sara Veneziale, Department of Mathematics, Imperial College London, 180 Queen's Gate, London, SW7 2AZ, UK (s.veneziale21@imperial.ac.uk)
Pseudocode | Yes | "Algorithm 1: Test terminality for weight matrix W = [[a1, ..., aN], [b1, ..., bN]]."
Open Source Code | Yes | "All code used and trained models are available from Bitbucket under an MIT licence [12]." Supporting code: https://bitbucket.org/fanosearch/ml_terminality, 2023.
Open Datasets | Yes | "The datasets underlying this work and the code used to generate them are available from Zenodo under a CC0 license [11]." A dataset of 8-dimensional Q-factorial Fano toric varieties of Picard rank 2. Zenodo, 2023. doi:10.5281/zenodo.10046893.
Dataset Splits | Yes | "We tested the model on a balanced subset of 50% of the data (5M); the remainder was used for training (40%; 4M balanced) and validation (10%; 1M)."
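The 50/40/10 split described above can be sketched with scikit-learn (one of the paper's stated dependencies). This is a minimal illustration on toy data, not the authors' code: the feature shapes, variable names, and random seeds are assumptions; only the split proportions and the stratification (keeping both splits balanced between terminal and non-terminal) come from the quoted text.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in for the real dataset: features X and binary labels y
# (say terminal = 1, non-terminal = 0), balanced as in the paper.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
y = np.repeat([0, 1], 500)

# Hold out 50% for testing, stratified so both sides stay balanced.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)
# Split the remaining 50% into 40% train / 10% validation of the full
# dataset, i.e. a 0.2 fraction of the remainder becomes validation.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.2, stratify=y_rest, random_state=0
)
print(len(X_train), len(X_val), len(X_test))  # prints: 400 100 500
```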
Hardware Specification | No | The paper mentions "30 CPU years" for data generation and "120 CPU hours" for ML-assisted generation, but it does not provide specific hardware details such as CPU models, GPU models, or memory specifications.
Software Dependencies | Yes | "Data generation and post-processing was carried out using the computational algebra system Magma V2.27-3 [5]. The machine learning model was built using PyTorch v1.13.1 [36] and scikit-learn v1.1.3 [37]. Hyperparameter tuning was partly carried out using Ray Tune [31]."
Experiment Setup | Yes | "The final best network configuration is summarised in Table 1."
Table 1: Final network architecture and configuration.
Hyperparameter | Value
Layers | (512, 768, 512)
Batch size | 128
Initial learning rate | 0.01
Momentum | 0.99
Leaky ReLU slope | 0.01
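The configuration in Table 1 can be sketched in PyTorch (the library the paper reports using). This is a hedged reconstruction, not the authors' implementation: the input size (a flattened 2 x 10 weight matrix giving 20 features) and the single-logit output with a binary cross-entropy loss are assumptions; the hidden-layer widths (512, 768, 512), LeakyReLU slope 0.01, learning rate 0.01, momentum 0.99, and batch size 128 come from Table 1.

```python
import torch
import torch.nn as nn

# Feed-forward classifier matching Table 1's hyperparameters.
# Input/output sizes are assumed, not stated in this summary.
model = nn.Sequential(
    nn.Linear(20, 512), nn.LeakyReLU(0.01),
    nn.Linear(512, 768), nn.LeakyReLU(0.01),
    nn.Linear(768, 512), nn.LeakyReLU(0.01),
    nn.Linear(512, 1),  # one logit: terminal vs non-terminal
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.99)
loss_fn = nn.BCEWithLogitsLoss()

# One illustrative training step on a random batch of 128 examples.
x = torch.randn(128, 20)
target = torch.randint(0, 2, (128, 1)).float()
optimizer.zero_grad()
loss = loss_fn(model(x), target)
loss.backward()
optimizer.step()
```

The initial learning rate in Table 1 suggests a schedule was used during training; this sketch shows only the optimizer's starting configuration.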