Neural Language of Thought Models
Authors: Yi-Fu Wu, Minseung Lee, Sungjin Ahn
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate NLoTM on several 2D and 3D image datasets, demonstrating superior performance in downstream tasks, out-of-distribution generalization, and image generation quality compared to patch-based VQ-VAE and continuous object-centric representations. |
| Researcher Affiliation | Academia | Yi-Fu Wu1, Minseung Lee2, Sungjin Ahn2 1Rutgers University 2KAIST |
| Pseudocode | No | The paper provides architectural diagrams and mathematical formulations, but it does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | We will also release the source code upon acceptance of the paper. |
| Open Datasets | Yes | We evaluate our model on two variants of a 2D Sprites dataset (Watters et al., 2019a; Yoon et al., 2023) and three variants of the CLEVR dataset (Johnson et al., 2017), CLEVR-Easy, CLEVR-Hard, CLEVR-Tex. |
| Dataset Splits | Yes | Since all models can solve the task when evaluated on the ID dataset, we report the number of steps to reach 98% accuracy on the validation dataset. |
| Hardware Specification | Yes | Each model is trained on NVIDIA Quadro RTX 8000 GPUs with 48GB memory and we use half-precision floating-point format. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and PixelCNN but does not specify versions for core software libraries such as Python, PyTorch/TensorFlow, or other dependencies. |
| Experiment Setup | Yes | Table 11 shows the hyperparameters we used for the different datasets in our experiments with SVQ. For the dVAE and Transformer Decoder, we follow the hyperparameters, architecture, and training procedure provided in Singh et al. (2023) for CLEVR-Easy and CLEVR-Hard. |