reproducibilityindex.ai

Circuit-Based Intrinsic Methods to Detect Overfitting

Authors: Satrajit Chatterjee, Alan Mishchenko

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimentally, CFS can separate models with different levels of overfit using only their logic circuit representations without any access to the high level structure. We take a first step towards answering this question by studying a naturally-motivated family of intrinsic methods, called Counterfactual Simulation (CFS), and evaluating their efficacy experimentally on a benchmark problem.
Researcher Affiliation	Collaboration	Satrajit Chatterjee (1), Alan Mishchenko (2). (1) Google, Mountain View, California, USA; (2) Department of EECS, University of California, Berkeley, California, USA.
Pseudocode	No	The paper describes the Counterfactual Simulation (CFS) method in prose within Section 2, but does not provide structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide an explicit statement about releasing the source code for the methodology described, nor does it include a link to a code repository.
Open Datasets	Yes	To make this concrete, consider the MNIST image classification problem (LeCun & Cortes, 2010)... S is a sample from D of 60,000 images x_i and their corresponding labels y_i (thus, 0 ≤ i < 60000) i.e., the MNIST training set. We also performed some experiments with Fashion MNIST (Xiao et al., 2017) and the results are similar.
Dataset Splits	No	The paper mentions "validation set accuracy" (e.g., "nn-real-2 is the least overfit and gets to a validation set accuracy of 97%"), but it does not provide specific details on the dataset split (e.g., exact percentages or sample counts for the validation set) or how the validation set was created.
Hardware Specification	Yes	A typical run of l-CFS in our experiments takes less than 10 minutes on a 3.7GHz Xeon CPU and less than 2GB of RAM.
Software Dependencies	Yes	Two random forests were trained using version 0.19.1 of Scikit-learn (Pedregosa et al., 2011).
Experiment Setup	Yes	In all cases, we used the ADAM optimizer with default parameters and batch size of 64. Weights and activations are represented by signed 8-bit and 16-bit fixed point numbers respectively with 6 bits reserved for the fractional part. (Weights from training are clamped to [-2.0, 2.0) before conversion to fixed point.) Each multiply-accumulate unit multiplies an 8-bit constant (the weight) with a 16-bit input (the activation) and accumulates in 24 bits with saturation.
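
The fixed-point arithmetic quoted in the Experiment Setup row can be illustrated with a short Python sketch. This is not the authors' code. The function names, the round-half-up rounding, and the choice to saturate after every accumulation step are assumptions made for illustration. The quoted text fixes only the bit widths (8-bit weights, 16-bit activations, 24-bit accumulator), the 6 fractional bits, and the clamp of weights to [-2.0, 2.0).

```python
import math

FRAC_BITS = 6  # fractional bits for weights and activations, per the quoted setup


def saturate(value: int, bits: int) -> int:
    """Clamp an integer to the signed two's-complement range of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def to_fixed(x: float, bits: int) -> int:
    """Convert a real number to signed fixed point with FRAC_BITS fractional bits.

    Round-half-up is an assumption; the quoted text does not state a rounding mode.
    """
    scaled = math.floor(x * (1 << FRAC_BITS) + 0.5)
    return saturate(scaled, bits)


def quantize_weight(w: float) -> int:
    """Signed 8-bit weight. Saturating to [-128, 127] with 6 fractional bits
    is the same as clamping the real value to [-2.0, 2.0)."""
    return to_fixed(w, 8)


def quantize_activation(a: float) -> int:
    """Signed 16-bit activation with 6 fractional bits."""
    return to_fixed(a, 16)


def mac(weights_q, activations_q, acc_bits: int = 24) -> int:
    """Multiply-accumulate with a saturating accumulator.

    Each product carries 2 * FRAC_BITS = 12 fractional bits. Saturating after
    every addition is an assumption about the hardware behavior.
    """
    acc = 0
    for w, a in zip(weights_q, activations_q):
        acc = saturate(acc + w * a, acc_bits)
    return acc


def acc_to_float(acc: int) -> float:
    """Interpret the accumulator as a real number (12 fractional bits)."""
    return acc / (1 << (2 * FRAC_BITS))
```

For example, `quantize_weight(0.5)` gives the integer 32 and `quantize_activation(1.0)` gives 64, so `acc_to_float(mac([32], [64]))` recovers 0.5. How the 24-bit accumulator is rescaled back to a 16-bit activation is not stated in the quoted excerpt, so the sketch leaves that step out.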