From Perception to Programs: Regularize, Overparameterize, and Amortize

Authors: Hao Tang, Kevin Ellis

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply ROAP to two different domains (Fig. 1). Our CIFAR-MATH domain is a harder version of a classic proving ground for neural logic programming (Manhaeve et al., 2018), modified to include program synthesis. On it, we show that ROAP can synthesize arithmetic equations while at the same time learning to parse images into symbolic digits. Our 3D-Reconstruction domain involves synthesizing graphics programs that algebraically transform and combine neural geometric primitives, and can be used to decompose 3D shapes and infer missing geometry. Table 1: Experimental Results on CIFAR-MATH. Table 2: 2D results.
Researcher Affiliation | Academia | Hao Tang (1), Kevin Ellis (1); (1) Cornell University. Correspondence to: Hao Tang <haotang@cs.cornell.edu>.
Pseudocode | Yes | Figure 9: differentiable execution model for a program sketch containing L lines of code. ⟦α⟧_θ(x) = Exec_θ(α, x, L + V): execute the program and extract the output on line L + V (see the execution sketch after the table).
Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | Our CIFAR-MATH domain is a harder version of a classic proving ground for neural logic programming (Manhaeve et al., 2018), modified to include program synthesis. On it, we show that ROAP can synthesize arithmetic equations while at the same time learning to parse images into symbolic digits. Last, we switch from MNIST to CIFAR-10, and do not tell the system that there are only 10 digits, because CIFAR-10 is more visually complex than MNIST. ... and so we choose the canonical ShapeNet dataset (Chang et al., 2015).
Dataset Splits | No | For each arithmetic task we have 1e6 I/O pairs for training and 1000 I/O pairs for testing. The paper specifies training and testing sets, but does not explicitly mention a separate validation split.
Hardware Specification | No | The paper mentions using an '18-layer ResNet backbone (He et al., 2016) as the image encoder with an MLP decoder, whose parameters collectively comprise θ', but does not provide any specific hardware details such as GPU models, CPU models, or memory specifications used for training or evaluation.
Software Dependencies | No | The paper mentions using the 'Adam (Kingma & Ba, 2014) optimizer' and an '18-layer ResNet backbone (He et al., 2016)', but it does not specify version numbers for any software libraries, frameworks, or environments (e.g., PyTorch, Python, or CUDA versions).
Experiment Setup | Yes | We train models using the Adam (Kingma & Ba, 2014) optimizer with a learning rate of 3e-4 and ε = 1e-5 for 20 epochs. The program-length regularizer is not applied until halfway through training, with a coefficient of λ = 1e-4 that is multiplied into the program length before it is added to the rest of the loss. The Gumbel-Softmax temperature is set to 1 in the beginning and changed to 3 from epoch 15 to minimize the error gap from the continuous approximations of programs near the end of training (see the training-loop sketch after the table).
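
To make the quoted execution semantics concrete, here is a minimal sketch of a differentiable executor for a program sketch with L lines over V inputs. PyTorch is assumed, and the names (soft_execute, op_logits, arg_logits, ops) are illustrative rather than the paper's API; plain softmax mixtures stand in for the Gumbel-Softmax relaxation described in the setup row. Each line softly selects an operation and its arguments, and the output is read from register L + V, mirroring ⟦α⟧_θ(x) = Exec_θ(α, x, L + V).

import torch
import torch.nn.functional as F

def soft_execute(op_logits, arg_logits, inputs, ops):
    # Hypothetical differentiable executor: registers 1..V hold the inputs,
    # each of the L sketch lines appends one new register, and the program
    # output is the value of register L + V.
    registers = list(inputs)                           # V input registers
    L = op_logits.shape[0]
    for line in range(L):
        prev = torch.stack(registers)                  # values visible to this line
        # Soft argument choice: expected value over the earlier registers.
        w = F.softmax(arg_logits[line, :, :len(registers)], dim=-1)
        a, b = w @ prev                                # two soft arguments
        # Soft operation choice: mixture over candidate operations.
        candidates = torch.stack([op(a, b) for op in ops])
        registers.append(F.softmax(op_logits[line], dim=-1) @ candidates)
    return registers[-1]                               # output on line L + V

# Example: a 3-line sketch over {+, -, *} applied to two scalar inputs.
ops = [torch.add, torch.sub, torch.mul]
op_logits = torch.randn(3, len(ops), requires_grad=True)
arg_logits = torch.randn(3, 2, 5, requires_grad=True)  # 2 arguments, up to 5 registers
out = soft_execute(op_logits, arg_logits, [torch.tensor(2.0), torch.tensor(3.0)], ops)
out.backward()                                          # gradients flow to the sketch logits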
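
And here is a minimal sketch of the training schedule reported in the Experiment Setup row, again assuming PyTorch. The function name, the task_loss argument, and the convention that the model returns a (prediction, differentiable program length) pair are placeholders, not the authors' code.

import torch

def train(model, loader, task_loss, epochs=20, lam=1e-4):
    # Reported setup: Adam with lr 3e-4 and eps 1e-5 for 20 epochs; the
    # program-length penalty (coefficient 1e-4) is enabled halfway through
    # training, and the Gumbel-Softmax temperature moves from 1 to 3 at epoch 15.
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4, eps=1e-5)
    for epoch in range(epochs):
        temperature = 1.0 if epoch < 15 else 3.0
        for x, y in loader:
            pred, program_length = model(x, temperature=temperature)
            loss = task_loss(pred, y)
            if epoch >= epochs // 2:                   # length regularizer from the halfway point on
                loss = loss + lam * program_length
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()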