Locality Sensitive Sparse Encoding for Learning World Models Online
Authors: Zichen Liu, Chao Du, Wee Sun Lee, Min Lin
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the representation power of our encoding and verify that it allows efficient online learning under data covariate shift. We also show, in the Dyna MBRL setting, that our world models learned online using a single pass of trajectory data either surpass or match the performance of deep world models trained with replay and other continual learning methods. See also the sections "Empirical Results on Supervised Learning" and "Empirical Results on Reinforcement Learning". |
| Researcher Affiliation | Collaboration | Sea AI Lab National University of Singapore |
| Pseudocode | Yes | Algorithm 1 Online model learning and Algorithm 2 Sparse online model learning and Algorithm 3 Dyna MBRL with Losse-FTL |
| Open Source Code | Yes | We provide Jax-based Python codes in Listing 1. |
| Open Datasets | Yes | We test different encoding methods on an image denoising task, where the inputs are MNIST (Deng, 2012) images with Gaussian noise, and the outputs are clean images. and For continuous control tasks from Gym Mujoco (Todorov et al., 2012) |
| Dataset Splits | Yes | We add pixel-wise Gaussian noise with µ = 0, σ = 0.3 to normalized images from the MNIST dataset (Deng, 2012), and create train and test splits with a ratio of 9:1. Before the training loop starts, the data is split into train and holdout sets with a ratio of 4:1. The loss on the holdout set is measured after each epoch. We stop training if the loss does not improve over 5 consecutive epochs. (A hedged data-preparation sketch follows the table.) |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU models, or memory) used for experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions 'Jax-based Python codes' in Appendix E and imports JAX in Listing 1, but does not provide specific version numbers for Python, JAX, or any other software dependencies. |
| Experiment Setup | Yes | The linear layer is optimized using Adam (Kingma & Ba, 2015) with a learning rate of 0.0001. The Adam optimizer is used with the best learning rate swept from {5×10⁻⁶, 1×10⁻⁵, 5×10⁻⁵, ..., 1×10⁻²}. For Losse-FTL, we use 2-d binning with 10 bins for each grid and use 10 grids to construct the final feature. We use κ = 30, ρ = 2, and λ = 10 for Losse-FTL, and 3-layer MLPs with 32 hidden units for neural networks. The DQN parameters are updated using real data with an interval of 4 interactions. In MBRL, 16 planning steps are conducted after each real data update. Both real data updates and planning updates use a mini-batch of 32. (Hedged sketches of the binning encoding and the Dyna update schedule follow the table.) |
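
The denoising data setup quoted in the Dataset Splits row can be reproduced roughly as below. This is a minimal sketch assuming MNIST images are already loaded and normalized to [0, 1]; the function name and signature are placeholders, not the authors' released code.

```python
# Hedged sketch of the image-denoising data setup: pixel-wise Gaussian
# noise (mu = 0, sigma = 0.3) added to normalized MNIST images, then a
# 9:1 train/test split. `make_denoising_splits` is a placeholder name.
import jax
import jax.numpy as jnp

def make_denoising_splits(images: jnp.ndarray, key: jax.Array,
                          sigma: float = 0.3, train_ratio: float = 0.9):
    """images: (N, 28*28) array already normalized to [0, 1]."""
    noise_key, perm_key = jax.random.split(key)
    noisy = images + sigma * jax.random.normal(noise_key, images.shape)
    perm = jax.random.permutation(perm_key, images.shape[0])
    n_train = int(train_ratio * images.shape[0])
    train_idx, test_idx = perm[:n_train], perm[n_train:]
    # Inputs are noisy images, targets are the clean originals.
    return (noisy[train_idx], images[train_idx]), (noisy[test_idx], images[test_idx])
```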
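
The Losse-FTL feature quoted in the Experiment Setup row combines 10 grids of 2-d binning with 10 bins per axis. The sketch below shows one plausible reading of that construction, with randomly chosen dimension pairs and random grid offsets as assumptions for illustration; the paper's exact encoding (and the roles of κ, ρ, and λ) may differ.

```python
# Minimal sketch of a locality-sensitive sparse encoding built from 2-d
# binning, using the quoted hyperparameters (10 grids, 10 bins per axis).
# Random dimension pairs and random offsets are assumptions, not the
# paper's exact construction.
import jax
import jax.numpy as jnp

N_GRIDS, N_BINS = 10, 10  # quoted settings

def init_grids(key: jax.Array, input_dim: int):
    pair_key, offset_key = jax.random.split(key)
    # Each grid looks at a random pair of input dimensions.
    dims = jax.random.randint(pair_key, (N_GRIDS, 2), 0, input_dim)
    # Random offsets shift the bin boundaries of each grid.
    offsets = jax.random.uniform(offset_key, (N_GRIDS, 2)) / N_BINS
    return dims, offsets

def encode(x: jnp.ndarray, dims: jnp.ndarray, offsets: jnp.ndarray):
    """x: (input_dim,) with entries scaled to [0, 1]; returns a feature
    of length N_GRIDS * N_BINS**2 with one active bin per grid."""
    xy = x[dims] + offsets                                 # (N_GRIDS, 2)
    idx = jnp.clip(jnp.floor(xy * N_BINS), 0, N_BINS - 1).astype(jnp.int32)
    flat = idx[:, 0] * N_BINS + idx[:, 1]                  # bin index within each grid
    onehot = jax.nn.one_hot(flat, N_BINS * N_BINS)         # (N_GRIDS, N_BINS**2)
    return onehot.reshape(-1)                              # exactly N_GRIDS nonzeros
```

Under this reading, each grid activates exactly one bin, so the concatenated feature has at most 10 nonzero entries out of 1000, and nearby inputs share most of their active bins, which is what makes the encoding sparse and locality sensitive.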
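
The Dyna schedule in the Experiment Setup row (real DQN updates every 4 interactions, 16 planning updates per real update, mini-batches of 32) can be summarized as the loop below. All object names (`env`, `agent`, `world_model`, `replay_buffer`) and their methods are placeholders for illustration, not the paper's API or algorithm listing.

```python
# Hedged sketch of a Dyna-style update schedule: a real-data agent update
# every 4 environment steps, each followed by 16 planning updates on
# model-generated data, all with mini-batches of 32.
REAL_UPDATE_INTERVAL = 4
PLANNING_STEPS = 16
BATCH_SIZE = 32

def dyna_loop(env, agent, world_model, replay_buffer, total_steps: int):
    obs = env.reset()
    for t in range(total_steps):
        action = agent.act(obs)
        next_obs, reward, done, _ = env.step(action)
        replay_buffer.add(obs, action, reward, next_obs, done)
        # The world model is updated online from the incoming transition
        # (single pass, no replay), as described in the paper's abstract.
        world_model.update_online(obs, action, reward, next_obs)
        if (t + 1) % REAL_UPDATE_INTERVAL == 0:
            agent.update(replay_buffer.sample(BATCH_SIZE))      # real-data update
            for _ in range(PLANNING_STEPS):                     # planning updates
                agent.update(world_model.rollout(replay_buffer.sample(BATCH_SIZE)))
        obs = env.reset() if done else next_obs
```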