Scalable Spatial Memory for Scene Rendering and Navigation
Authors: Wen-Cheng Chen, Chu-Song Chen, Wei-Chen Chiu, Min-Chun Hu
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results reveal the generalization ability of our proposed SMN in large-scale scene synthesis and its potential to improve the performance of spatial RL tasks. |
| Researcher Affiliation | Academia | ¹National Cheng Kung University, ²National Taiwan University, ³National Yang Ming Chiao Tung University, ⁴National Tsing Hua University |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. The methodology is described through text, equations, and figures. |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | For small scenes, we adopted the Rooms-Free-Camera (RFC) dataset proposed in GQN (Eslami et al. 2018). We design and develop a 3D platform named Orario3d based on pyrender (Matl 2019) to procedurally generate 3D mazes with different structures and textures (as shown in Fig. 5a). (An illustrative maze-rendering sketch follows the table.) |
| Dataset Splits | No | The paper describes evaluation sets (Local, Base, Large) that are sampled differently from the training data to test generalization, but it does not specify a distinct validation split, with percentages or counts, for hyperparameter tuning. For example: "For Local, we randomly sample 5 observation poses and 10 query poses in the 6×6-grid area of each 11×11-grid maze, which is similar to the sampling strategy of the training data" and "For Base, we randomly sample 10 observation poses and 10 query poses in the overall 11×11-grid maze." These are test/evaluation sets rather than validation splits. (A pose-sampling sketch follows the table.) |
| Hardware Specification | Yes | All the experiments are conducted on a PC with an Intel Core i7-9700K CPU and an NVIDIA GeForce GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper mentions pyrender (Matl 2019) and the Adam and RMSprop optimizers, but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The density of the memory blocks is set to 10 blocks/m³, and the clipping space is set to a (6 m)³ cube with the agent at the center (on average 2160 memory blocks in the clipping range). The batch size and the number of training steps are set to 32 and 1.6M, respectively. We use the Adam optimizer to train the network with a learning rate of 5e-5. We stack the 5 previous frames to construct the input state of the image encoder in the DQN model, and 1000 memory blocks are randomly sampled in the clipping range and taken as the local map of the SMN-DQN model. The discount factor is set to 0.95 and the batch size for training is set to 32. We train the RL models with the RMSprop optimizer, a learning rate of 2e-4, and 1M training steps. (These settings are collected in the configuration sketch below.) |
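The Orario3d platform quoted in the Open Datasets row is not released with the paper. As a rough illustration of the described approach, here is a minimal sketch of procedurally generating a grid maze and rendering one observation with pyrender; the `make_maze`/`build_scene` helpers, the grid layout, and the flat vertex colors standing in for textures are our assumptions, not the paper's implementation.

```python
import numpy as np
import trimesh
import pyrender

def make_maze(n=11, wall_prob=0.3, seed=0):
    """Mark random interior cells of an n-by-n grid as walls (illustrative only)."""
    rng = np.random.default_rng(seed)
    maze = rng.random((n, n)) < wall_prob
    maze[0, :] = maze[-1, :] = maze[:, 0] = maze[:, -1] = True  # closed outer boundary
    return maze

def build_scene(maze, cell=1.0):
    """Place one unit cube per wall cell; random vertex colors stand in for textures."""
    scene = pyrender.Scene(ambient_light=np.full(3, 0.3))
    rng = np.random.default_rng(1)
    for i, j in zip(*np.nonzero(maze)):
        box = trimesh.creation.box(extents=[cell, cell, cell])
        box.visual.vertex_colors = np.append(rng.integers(0, 255, size=3), 255)
        pose = np.eye(4)
        pose[:3, 3] = [i * cell, 0.5 * cell, j * cell]
        scene.add(pyrender.Mesh.from_trimesh(box), pose=pose)
    return scene

scene = build_scene(make_maze())
# Render one observation from an assumed agent pose inside the maze.
cam_pose = np.eye(4)
cam_pose[:3, 3] = [5.0, 0.5, 5.0]
scene.add(pyrender.PerspectiveCamera(yfov=np.pi / 3.0), pose=cam_pose)
scene.add(pyrender.DirectionalLight(intensity=3.0), pose=cam_pose)
renderer = pyrender.OffscreenRenderer(viewport_width=64, viewport_height=64)
color, depth = renderer.render(scene)  # RGB observation and depth map
renderer.delete()
```

Note that `OffscreenRenderer` needs a working OpenGL backend; on headless machines pyrender supports EGL or OSMesa via the `PYOPENGL_PLATFORM` environment variable.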
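The Local/Base evaluation protocols quoted in the Dataset Splits row amount to sampling observation and query poses from sub-regions of the maze grid. Below is a minimal sketch, assuming an (x, y, yaw) pose representation and a centered 6×6 sub-region; both are our assumptions, as the paper does not state the exact pose format or region placement.

```python
import numpy as np

def sample_poses(rng, n_poses, region):
    """Sample (x, y, yaw) poses uniformly inside a rectangular grid region.

    `region` is ((x_min, x_max), (y_min, y_max)) in grid cells; the pose
    format is an assumption, not the paper's exact representation.
    """
    (x0, x1), (y0, y1) = region
    xy = rng.uniform([x0, y0], [x1, y1], size=(n_poses, 2))
    yaw = rng.uniform(0.0, 2.0 * np.pi, size=(n_poses, 1))
    return np.hstack([xy, yaw])

rng = np.random.default_rng(0)
# "Local": 5 observation and 10 query poses in a central 6x6-grid area
# of an 11x11-grid maze (region placement assumed centered).
local_obs = sample_poses(rng, 5, ((2.5, 8.5), (2.5, 8.5)))
local_query = sample_poses(rng, 10, ((2.5, 8.5), (2.5, 8.5)))
# "Base": 10 observation and 10 query poses over the whole 11x11-grid maze.
base_obs = sample_poses(rng, 10, ((0, 11), (0, 11)))
base_query = sample_poses(rng, 10, ((0, 11), (0, 11)))
```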
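For convenience, the hyperparameters quoted in the Experiment Setup row can be gathered into one place. The sketch below uses Python dataclasses; the class and field names are our own labeling, and the final assert checks that the stated density of 10 blocks/m³ over a (6 m)³ cube indeed gives the quoted average of 2160 memory blocks (10 × 6³ = 2160).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SMNConfig:
    """Rendering-model hyperparameters quoted from the paper; names are ours."""
    block_density: float = 10.0   # memory blocks per m^3
    clip_extent_m: float = 6.0    # clipping space is a (6 m)^3 cube around the agent
    batch_size: int = 32
    train_steps: int = 1_600_000  # 1.6M training steps
    lr_adam: float = 5e-5         # Adam learning rate

@dataclass(frozen=True)
class SMNDQNConfig:
    """RL-side hyperparameters quoted from the paper; names are ours."""
    frame_stack: int = 5          # stack 5 previous frames as the DQN input state
    sampled_blocks: int = 1000    # memory blocks sampled as the local map
    discount: float = 0.95
    batch_size: int = 32
    lr_rmsprop: float = 2e-4      # RMSprop learning rate
    train_steps: int = 1_000_000  # 1M training steps

cfg = SMNConfig()
# Sanity check on the "on average 2160 memory blocks" figure:
assert cfg.block_density * cfg.clip_extent_m ** 3 == 2160.0
```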