Learning to Discover Efficient Mathematical Identities
Authors: Wojciech Zaremba, Karol Kurach, Rob Fergus
NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show how these approaches enable us to derive complex identities, beyond reach of brute-force search, or human derivation. All code and evaluation data can be found at https://github.com/kkurach/math_learning. |
| Researcher Affiliation | Collaboration | Wojciech Zaremba, Dept. of Computer Science, Courant Institute, New York University; Karol Kurach, Google Zurich & Dept. of Computer Science, University of Warsaw; Rob Fergus, Dept. of Computer Science, Courant Institute, New York University |
| Pseudocode | No | The paper describes procedures and models but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code. |
| Open Source Code | Yes | All code and evaluation data can be found at https://github.com/kkurach/math_learning. |
| Open Datasets | Yes | We first create a dataset of symbolic expressions, spanning the space of all valid expressions up to degree k. We then group them into clusters of equivalent expressions (using the numerical representation to check for equality), and give each cluster a discrete label 1…C. (A minimal sketch of this numerical-equivalence clustering appears after the table.) |
| Dataset Splits | No | The paper states 'Each class is split 80/20 into train/test sets.' in Section 4.2 but does not explicitly mention a validation split. |
| Hardware Specification | Yes | Running on a 3 GHz 16-core Intel Xeon. |
| Software Dependencies | No | The paper does not provide specific version numbers for any ancillary software, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | A vector a ∈ ℝ^l, where l = 30, is used to represent each input variable. The weight matrix in the softmax classifier has a much larger (×100) learning rate than the rest of the layers. We use dropout [13] as the network has a tendency to overfit and repeat exactly the same expressions for the next value of k. Thus, instead of training on exactly φ(b1) and φ(b2), we drop activations as we propagate toward the top of the tree (the same fraction for each depth), which encourages the RNN to capture more local structures. (A hedged sketch of this training setup appears after the table.) |
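
The clustering step quoted in the Open Datasets row (grouping symbolic expressions into equivalence classes by checking numerical equality, then assigning labels 1…C) can be illustrated with a short sketch. This is a minimal illustration, not the authors' released code: the toy scalar expressions, the five random evaluation points, and the rounding tolerance are all assumptions made for the example; the paper itself works with matrix/vector formulas built from A and Aᵀ and evaluates them on random matrix inputs.

```python
import numpy as np
import sympy as sp

# Hypothetical toy expression set; the paper enumerates all valid
# expressions up to degree k over matrix inputs such as A and A^T.
a, b = sp.symbols("a b")
expressions = [
    (a + b) ** 2,
    a**2 + 2 * a * b + b**2,   # equivalent to the first expression
    a**2 - b**2,
    (a - b) * (a + b),         # equivalent to the previous expression
    a * b,
]

# Numerical fingerprint: evaluate each expression on a fixed set of
# random points and round, so equivalent expressions hash identically.
rng = np.random.default_rng(0)
points = rng.uniform(-1, 1, size=(5, 2))  # 5 random (a, b) pairs (assumed)

def fingerprint(expr):
    f = sp.lambdify((a, b), expr, "numpy")
    values = np.array([f(x, y) for x, y in points])
    return tuple(np.round(values, 6))      # assumed tolerance

# Group equivalent expressions into clusters and assign labels 1..C.
clusters = {}
for expr in expressions:
    clusters.setdefault(fingerprint(expr), []).append(expr)

for label, members in enumerate(clusters.values(), start=1):
    print(f"class {label}: {members}")
```

Running this prints three classes: the two expansions of (a + b)² fall into one class, the two factorings of a² − b² into another, and a·b into a third, mirroring the label-per-cluster scheme described in the quote.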
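
Similarly, the details quoted in the Experiment Setup row (an l = 30 embedding per input variable, a softmax classifier with a roughly 100× larger learning rate, and dropout applied to activations as they propagate toward the top of the expression tree) can be sketched as follows. This is a hedged PyTorch illustration under assumed choices: the module structure, dropout rate, number of classes, base learning rate, and optimizer are not from the paper, whose actual implementation is in the linked repository.

```python
import torch
import torch.nn as nn

L = 30             # embedding size per input variable, as quoted
NUM_CLASSES = 50   # number of equivalence classes C (assumed value)

class TreeRNN(nn.Module):
    """Minimal recursive network over binary expression trees.

    A leaf is an integer variable id; an internal node is a pair
    (left_subtree, right_subtree). Dropout is applied to subtree
    activations as they propagate toward the root, with the same
    drop fraction at every depth, as in the quoted setup.
    """

    def __init__(self, num_vars, dropout=0.3):  # dropout rate assumed
        super().__init__()
        self.embed = nn.Embedding(num_vars, L)
        self.combine = nn.Sequential(nn.Linear(2 * L, L), nn.Tanh())
        self.drop = nn.Dropout(dropout)
        self.classifier = nn.Linear(L, NUM_CLASSES)

    def encode(self, tree):
        if isinstance(tree, int):                  # leaf: variable id
            return self.embed(torch.tensor(tree))
        left, right = tree
        h = self.combine(torch.cat([self.encode(left), self.encode(right)]))
        return self.drop(h)                        # drop at each depth

    def forward(self, tree):
        return self.classifier(self.encode(tree))

model = TreeRNN(num_vars=10)

# The softmax (classifier) weights get a much larger learning rate than
# the rest of the layers; the 100x factor mirrors the quoted setup.
base_lr = 1e-3  # assumed
optimizer = torch.optim.SGD([
    {"params": model.embed.parameters()},
    {"params": model.combine.parameters()},
    {"params": model.classifier.parameters(), "lr": 100 * base_lr},
], lr=base_lr)

# One toy training step on the tree ((x0, x1), x2) with class label 3.
logits = model(((0, 1), 2))
loss = nn.functional.cross_entropy(logits.unsqueeze(0), torch.tensor([3]))
loss.backward()
optimizer.step()
```

The per-parameter-group learning rates show one plain way to give the classifier layer a larger step size than the embedding and combination layers; the recursive `encode` call applies the same dropout fraction at every internal node, which is how the quoted per-depth dropping is interpreted here.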