Compositional Law Parsing with Latent Random Functions
Authors: Fan Shi, Bin Li, Xiangyang Xue
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that CLAP outperforms the baseline methods in multiple visual tasks such as intuitive physics, abstract visual reasoning, and scene representation. The law manipulation experiments illustrate CLAP's interpretability by modifying specific latent random functions on samples. To evaluate the model's ability of compositional law parsing on different types of data, we use three datasets in the experiments: (1) Bouncing Ball (abbreviated as BoBa) dataset (Lin et al., 2020a) to validate the ability of intuitive physics; (2) Continuous Raven's Progressive Matrix (CRPM) dataset (Shi et al., 2021) to validate the ability of abstract visual reasoning; (3) MPI3D dataset (Gondal et al., 2019) to validate the ability of scene representation. We adopt NP (Garnelo et al., 2018b), GP with the deep kernel (Wilson et al., 2016), and GQN (Eslami et al., 2018) as baselines. |
| Researcher Affiliation | Academia | Fan Shi Bin Li Xiangyang Xue Shanghai Key Laboratory of Intelligent Information Processing School of Computer Science, Fudan University fshi22@m.fudan.edu.cn {libin,xyxue}@fudan.edu.cn |
| Pseudocode | No | The paper describes the model architecture and processes using text and mathematical equations, but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code is available at https://github.com/FudanVI/generative-abstract-reasoning/tree/main/clap |
| Open Datasets | Yes | To evaluate the model's ability of compositional law parsing on different types of data, we use three datasets in the experiments: (1) Bouncing Ball (abbreviated as BoBa) dataset (Lin et al., 2020a); (2) Continuous Raven's Progressive Matrix (CRPM) dataset (Shi et al., 2021); (3) MPI3D dataset (Gondal et al., 2019). ... One can fetch MPI3D from the repository (https://github.com/rr-learning/disentanglement_dataset) under the Creative Commons Attribution 4.0 International License. |
| Dataset Splits | Yes | Referring to Table 2, we provide 10,000 videos of bouncing balls for training, 1,000 for validation and hyperparameter selection, and 2,000 to test the intuitive physics of the models. ... Table 2: The detail of BoBa, CRPM, and MPI3D. Row 1: dataset names. Row 2: splits of datasets where Test-k denotes that there are k target images in one sample. Row 3: the number of samples in each split. |
| Hardware Specification | Yes | We train our model and baselines on the server with Intel(R) Xeon(R) Gold 6133 CPUs, 24GB NVIDIA GeForce RTX 3090 GPUs, 512GB RAM, and Ubuntu 18.04 OS. |
| Software Dependencies | No | All models are implemented with the PyTorch (Paszke et al., 2019) framework. No specific version numbers for PyTorch or other relevant software dependencies are provided. |
| Experiment Setup | Yes | Table 4: The hyperparameters of CLAP-NP. ... In all datasets, we set the learning rate as 3×10^-4, batch size as 512, and σ_y = 0.1. ... CLAP-NP uses the Adam (Kingma & Ba, 2015) optimizer to update parameters. D.2 MODEL ARCHITECTURE AND HYPERPARAMETERS CLAP-NP: In this subsection, we will first describe the architecture of the encoder, decoder, concept-specific function parsers, and concept-specific target predictors in CLAP-NP. |
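The reported training setup (Adam, learning rate 3×10^-4, batch size 512, σ_y = 0.1) can be sketched as a minimal PyTorch training step. This is not the CLAP-NP architecture: the model below is a hypothetical stand-in, and the Gaussian reconstruction term scaled by σ_y is only an assumed NP-style objective shape for illustration.

```python
import torch

# Hyperparameters as reported in the paper (Table 4 / Appendix D.2).
LEARNING_RATE = 3e-4   # reported learning rate, 3x10^-4
BATCH_SIZE = 512       # reported batch size
SIGMA_Y = 0.1          # reported observation noise std sigma_y

# Hypothetical placeholder model; the actual CLAP-NP encoder/decoder
# and concept-specific parsers/predictors are far more elaborate.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

# One illustrative training step: a squared-error reconstruction term
# scaled by sigma_y, as in a Gaussian log-likelihood (assumed form).
x = torch.randn(BATCH_SIZE, 16)
target = torch.randn(BATCH_SIZE, 16)
pred = model(x)
loss = ((pred - target) ** 2).mean() / (2 * SIGMA_Y ** 2)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The sketch only fixes the optimizer and hyperparameter choices that the paper states; everything model- and loss-specific would need to come from the released code.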