Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Learning to compute Gröbner bases
Authors: Hiroshi Kera, Yuki Ishihara, Yuta Kambe, Tristan Vaccon, Kazuhiro Yokoyama
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments show that our dataset generation method is a few orders of magnitude faster than a naive approach, overcoming a crucial challenge in learning to compute Gröbner bases, and Gröbner computation is learnable in a particular class. |
| Researcher Affiliation | Collaboration | Hiroshi Kera Chiba University, Zuse Institute Berlin EMAIL; Yuki Ishihara Nihon University EMAIL; Yuta Kambe Mitsubishi Electric EMAIL; Tristan Vaccon Limoges University EMAIL; Kazuhiro Yokoyama Rikkyo University EMAIL |
| Pseudocode | Yes | The combination of the discussion in the previous sections gives an efficient dataset generation algorithm (see Alg. 1 for a pseudocode). |
| Open Source Code | Yes | The code is available at https://github.com/HiroshiKERA/transformer-groebner. |
| Open Datasets | Yes | We constructed 16 datasets D_n(k) for n ∈ {2, 3, 4, 5} and k ∈ {F_7, F_31, Q, R} and measured the runtime of the forward generation and our backward generation. |
| Dataset Splits | No | The training set has one million samples, and the test set has one thousand samples. |
| Hardware Specification | Yes | All the experiments were conducted with 48-core CPUs, 768GB RAM, and NVIDIA RTX A6000ada GPUs. |
| Software Dependencies | No | We used SageMath [82] with the libSingular backend. |
| Experiment Setup | Yes | We used a Transformer model [85] with a standard architecture: 6 encoder/decoder layers, 8 attention heads, token embedding dimension of 512 dimensions, and feed-forward networks with 2048 inner dimensions. The absolute positional embedding is learned from scratch. The dropout rate was set to 0.1. We used the AdamW optimizer [65] with (β1, β2) = (0.9, 0.999) with no weight decay. The learning rate was initially set to 10⁻⁴ and then linearly decayed over training steps. All training samples are visited in a single epoch, and the total number of epochs was set to 8. The batch size was set to 16. |
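
The training setup quoted in the table can be collected into a small configuration sketch, which also makes the implied step count explicit. This is an illustration only, not the authors' code: the names `TrainConfig` and `linear_decay_lr` are hypothetical, the initial learning rate is taken to be 1e-4, and the schedule is assumed to decay linearly to zero at the final step (the table says only that the rate is "linearly decayed over training steps").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the table (class name is hypothetical)."""
    num_layers: int = 6            # encoder layers and decoder layers
    num_heads: int = 8             # attention heads
    d_model: int = 512             # token embedding dimension
    d_ff: int = 2048               # feed-forward inner dimension
    dropout: float = 0.1
    betas: tuple = (0.9, 0.999)    # AdamW (beta1, beta2)
    weight_decay: float = 0.0      # "no weight decay"
    base_lr: float = 1e-4          # initial learning rate (assumed reading)
    epochs: int = 8
    batch_size: int = 16
    train_samples: int = 1_000_000

    @property
    def steps_per_epoch(self) -> int:
        # Ceiling division, so a final partial batch would still count as a step.
        return -(-self.train_samples // self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.epochs


def linear_decay_lr(step: int, cfg: TrainConfig) -> float:
    """Learning rate at `step`, decaying linearly from base_lr.

    The end point of zero at the last step is an assumption.
    """
    if not 0 <= step <= cfg.total_steps:
        raise ValueError("step out of range")
    return cfg.base_lr * (1.0 - step / cfg.total_steps)


if __name__ == "__main__":
    cfg = TrainConfig()
    print(cfg.steps_per_epoch, cfg.total_steps)
    print(linear_decay_lr(cfg.total_steps // 2, cfg))
```

With one million training samples and a batch size of 16, each epoch is 62,500 optimizer steps, so 8 epochs give 500,000 steps in total under these assumptions.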