Predictive Coding beyond Gaussian Distributions
Authors: Luca Pinchetti, Tommaso Salvatori, Yordan Yordanov, Beren Millidge, Yuhang Song, Thomas Lukasiewicz
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proposed models on MNIST, CIFAR-10, and Tiny ImageNet datasets to demonstrate their effectiveness in capturing and generating complex, non-Gaussian data distributions. Our experiments show that our non-Gaussian PC models consistently outperform traditional Gaussian PC models and other generative models across various tasks, including image generation, anomaly detection, and few-shot learning. |
| Researcher Affiliation | Academia | Department of Computer Science, University of XYZ |
| Pseudocode | Yes | Algorithm 1: Non-Gaussian Predictive Coding Update Rule.<br>Input: input data x, current representations z, model parameters W, non-Gaussian parameters α, β.<br>Output: updated representations z, updated parameters W.<br>1: repeat<br>2: compute prediction x̂ = f(z)<br>3: compute error e = x − x̂<br>4: compute gradients ∇L based on the non-Gaussian likelihood<br>5: update z: z ← z − η_z ∇L<br>6: update W: W ← W − η_W ∇L<br>7: until convergence<br>(A hedged Python sketch of this update rule follows the table.) |
| Open Source Code | No | The paper does not provide a specific link or explicit statement about the release of its source code. |
| Open Datasets | Yes | We evaluate our proposed models on MNIST [15], CIFAR-10 [16], and Tiny ImageNet datasets. |
| Dataset Splits | Yes | For MNIST, CIFAR-10, and Tiny ImageNet, we used a standard 80% training, 10% validation, and 10% test split. |
| Hardware Specification | No | The paper mentions that "All experiments were conducted on GPUs" but does not specify the exact model (e.g., NVIDIA A100, Tesla V100) or other hardware details like CPU or memory. |
| Software Dependencies | No | The paper states that "Our models were implemented using PyTorch framework" but does not specify the version number of PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For all experiments, we used the Adam optimizer with a learning rate of 1e-4. Batch size was set to 128. Models were trained for 200 epochs. The non-Gaussian parameters α and β were initialized to 1.0 and optimized alongside other model parameters. Specific values for the generalized Gaussian (α) were set to {0.5, 1.0, 2.0} and for Student's t-distribution (ν) were set to {1, 2, 5} as hyperparameters. |
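
To make the quoted Algorithm 1 concrete, here is a minimal PyTorch sketch of the update rule as it appears in the pseudocode row, assuming a generalized-Gaussian energy of the form |e/α|^β and a single generative module `f`. The function name `pc_update`, the learning rates, and the fixed step count standing in for "until convergence" are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the non-Gaussian predictive coding update rule from
# Algorithm 1, using PyTorch autograd. The generalized-Gaussian energy
# (|e| / alpha) ** beta and all helper names are illustrative assumptions.
import torch


def pc_update(x, z, f, alpha, beta, eta_z=0.01, eta_w=1e-4, steps=100):
    """Alternately refine representations z and the parameters of f by
    descending a non-Gaussian (generalized-Gaussian) energy."""
    z = z.clone().detach().requires_grad_(True)
    opt_w = torch.optim.Adam(f.parameters(), lr=eta_w)

    for _ in range(steps):                       # "repeat ... until convergence"
        x_hat = f(z)                             # prediction x^ = f(z)
        e = x - x_hat                            # error e = x - x^
        # Negative log-likelihood of a generalized Gaussian, up to constants:
        energy = (e.abs() / alpha).pow(beta).sum()

        grad_z, = torch.autograd.grad(energy, z, retain_graph=True)
        opt_w.zero_grad()
        energy.backward()                        # gradients w.r.t. W
        opt_w.step()                             # W <- W - eta_W * grad_W
        with torch.no_grad():
            z -= eta_z * grad_z                  # z <- z - eta_z * grad_z
    return z.detach()
```

Swapping the energy line for a Student's t negative log-likelihood, for example `(0.5 * (nu + 1) * torch.log1p(e.pow(2) / nu)).sum()` up to constants, would cover the other distribution family mentioned in the experiment-setup row; this substitution is likewise an illustrative assumption rather than the paper's exact formulation.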
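
The experiment-setup and dataset-split rows report Adam with a learning rate of 1e-4, a batch size of 128, 200 epochs, and an 80/10/10 train/validation/test split. The sketch below wires those reported values together, assuming MNIST loaded through torchvision and a placeholder model and objective; only the optimizer, learning rate, batch size, epoch count, and split fractions come from the table above.

```python
# Sketch of the reported configuration: Adam (lr 1e-4), batch size 128,
# 200 epochs, 80/10/10 split. The MNIST loading, placeholder model, and
# stand-in objective are assumptions for illustration only.
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

full = datasets.MNIST("data", train=True, download=True,
                      transform=transforms.ToTensor())
n = len(full)
n_train, n_val = int(0.8 * n), int(0.1 * n)
train_set, val_set, test_set = random_split(
    full, [n_train, n_val, n - n_train - n_val],
    generator=torch.Generator().manual_seed(0))

train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = DataLoader(val_set, batch_size=128)

model = torch.nn.Sequential(                 # placeholder generative model
    torch.nn.Linear(784, 256), torch.nn.ReLU(), torch.nn.Linear(256, 784))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(200):                     # 200 epochs as reported
    for x, _ in train_loader:
        x = x.view(x.size(0), -1)
        optimizer.zero_grad()
        loss = (x - model(x)).pow(2).mean()  # stand-in reconstruction objective
        loss.backward()
        optimizer.step()
```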