Learning Sparse Representations in Reinforcement Learning with Sparse Coding
Authors: Lei Le, Raksha Kumaraswamy, Martha White
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that it is key to use a supervised objective, rather than the more straightforward unsupervised sparse coding approach. We compare the learned representations to a canonical fixed sparse representation, called tile-coding, demonstrating that the sparse coding representation outperforms a wide variety of tile-coding representations. ... We conducted experiments in three benchmark RL domains: Mountain Car, Puddle World, and Acrobot. |
| Researcher Affiliation | Academia | Dept. of Computer Science Indiana University Bloomington, IN, USA leile@indiana.edu |
| Pseudocode | No | The paper describes the algorithm steps in paragraph text and equations but does not include a formally labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | We conducted experiments in three benchmark RL domains Mountain Car, Puddle World and Acrobot [Sutton, 1996]. |
| Dataset Splits | Yes | For learning the SCoPE representations, regularization parameters were chosen using 5-fold cross-validation on 5000 training samples, with βφ = 0.1 fixed to give a reasonable level of sparsity. |
| Hardware Specification | No | The paper does not specify any details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions general software components like "Python" but does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | The regularization weights βB are chosen from {1e-5, . . . , 1e-1, 0}, based on lowest cumulative error. For convenience, βw is fixed to be the same as βB. For learning the SCoPE representations, regularization parameters were chosen using 5-fold cross-validation on 5000 training samples, with βφ = 0.1 fixed to give a reasonable level of sparsity. ... The dimension k = 100 is set to be smaller than for tile coding |
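The hyperparameter selection quoted above (5-fold cross-validation on 5000 training samples, choosing a regularization weight from a small grid by lowest cumulative error) can be sketched as follows. This is a minimal illustration, not the paper's method: a closed-form ridge regression stands in for the SCoPE supervised sparse-coding objective, and the function names, the grid values, and the synthetic data are all assumptions.

```python
# Sketch: select a regularization weight by 5-fold cross-validation,
# keeping the value with the lowest mean held-out error.
# Ridge regression is a stand-in objective, NOT the paper's SCoPE loss.
import numpy as np

def kfold_cv_error(X, y, beta, k=5):
    """Mean squared held-out error of ridge regression over k folds."""
    idx = np.arange(X.shape[0])
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        Xtr, ytr = X[train], y[train]
        d = Xtr.shape[1]
        # Closed-form ridge solution: (X'X + beta*I)^{-1} X'y
        w = np.linalg.solve(Xtr.T @ Xtr + beta * np.eye(d), Xtr.T @ ytr)
        errs.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return float(np.mean(errs))

def select_beta(X, y, betas=(1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 0.0)):
    """Pick the regularization weight with lowest cross-validated error."""
    return min(betas, key=lambda b: kfold_cv_error(X, y, b))

# Synthetic data sized like the paper's CV set (5000 training samples).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=5000)
best_beta = select_beta(X, y)
```

The same loop applies unchanged to any grid of βB values; fixing βw = βB, as the paper does, simply means the selected value is reused for both weights.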