Predictive Coding beyond Correlations
Authors: Tommaso Salvatori, Luca Pinchetti, Amine M’Charrak, Beren Millidge, Thomas Lukasiewicz
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show how such findings can be used to improve the performance of predictive coding in image classification tasks, and conclude that such models are able to perform simple end-to-end causal inference tasks. |
| Researcher Affiliation | Collaboration | 1VERSES Research Lab, Los Angeles, CA 90016, USA 2Institute of Logic and Computation, Vienna University of Technology, Austria 3Department of Computer Science, University of Oxford, UK 4MRC Brain Network Dynamics Unit, University of Oxford, UK 5Zyphra, Palo Alto, CA, USA. Correspondence to: Tommaso Salvatori <tommaso.salvatori@verses.ai>. |
| Pseudocode | Yes | We provide the pseudocode of the training process on PC graphs in Algorithm 1. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a repository for the methodology described. |
| Open Datasets | Yes | We then show how interventional queries can be used to improve the test accuracy of PC graphs on MNIST and Fashion MNIST. and The classification experiments are performed on the MNIST, Fashion MNIST, and 2-MNIST datasets. |
| Dataset Splits | No | The paper mentions training data and test data, e.g., 'We use observational training data, X, to fit the PC model.' and 'We evaluate the learned SCM by comparing various difference metrics between true and inferred counterfactual values.' However, it does not provide specific percentages or counts for training/validation/test splits, nor does it explicitly define a validation set. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions optimizers like 'vanilla stochastic gradient descent (SGD)' and 'AdamW optimizer' but does not provide version numbers for any software libraries or frameworks used in the implementation. |
| Experiment Setup | Yes | The PC graph is trained with 3000 samples for 1000 epochs with a batch size of 128. We use the vanilla stochastic gradient descent (SGD) optimizer for the node values with a learning rate of γ = 3e-3 and T = 8 iterations for inference of node values during training and testing. For the weights, we use the AdamW optimizer with a learning rate of α = 8e-3 and a weight decay of λ_w = 1e-4. |
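
The setup quoted in the last two rows of the table (an inner relaxation of node values with vanilla SGD, followed by a weight update with AdamW) can be illustrated with a minimal PyTorch sketch. This is not the authors' Algorithm 1 or their PC-graph implementation: the chain topology, layer sizes, tanh activation, and the names `energy` and `train_step` are illustrative assumptions; only the quoted hyperparameters (γ = 3e-3, T = 8, α = 8e-3, λ_w = 1e-4, batch size 128) come from the paper.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the experiment-setup row above.
BATCH_SIZE = 128
T = 8                # inference iterations for node values
GAMMA = 3e-3         # SGD learning rate for node values
ALPHA = 8e-3         # AdamW learning rate for weights
WEIGHT_DECAY = 1e-4  # AdamW weight decay

# Hypothetical layer sizes for a small MNIST-style chain x0 -> x1 -> x2.
dims = [784, 256, 10]
weights = nn.ModuleList(nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1))
w_opt = torch.optim.AdamW(weights.parameters(), lr=ALPHA, weight_decay=WEIGHT_DECAY)

def energy(nodes):
    """Sum of squared prediction errors between consecutive value nodes."""
    e = 0.0
    for i, layer in enumerate(weights):
        pred = torch.tanh(layer(nodes[i]))
        e = e + 0.5 * ((nodes[i + 1] - pred) ** 2).sum()
    return e

def train_step(x, y_onehot):
    # Clamp the first node to the image and the last node to the label.
    nodes = [x.clone()] + [torch.zeros(x.size(0), d) for d in dims[1:]]
    nodes[-1] = y_onehot.clone()
    for n in nodes[1:-1]:
        n.requires_grad_(True)

    # Inference: relax the hidden node values with vanilla SGD for T iterations.
    node_opt = torch.optim.SGD(nodes[1:-1], lr=GAMMA)
    for _ in range(T):
        node_opt.zero_grad()
        energy(nodes).backward()
        node_opt.step()

    # Learning: one AdamW step on the weights at the relaxed node values.
    w_opt.zero_grad()
    loss = energy([n.detach() for n in nodes])
    loss.backward()
    w_opt.step()
    return loss.item()

# Toy usage on random tensors shaped like flattened MNIST digits.
x = torch.rand(BATCH_SIZE, 784)
y = torch.nn.functional.one_hot(torch.randint(0, 10, (BATCH_SIZE,)), 10).float()
print(train_step(x, y))
```

The two-optimizer split mirrors the quoted setup: each batch runs T = 8 SGD steps on the node values with the weights held fixed, then a single AdamW step on the weights at the relaxed node values.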