Learning what you can do before doing anything
Authors: Oleh Rybkin, Karl Pertsch, Konstantinos G. Derpanis, Kostas Daniilidis, Andrew Jaegle
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show the applicability of our method to synthetic settings and its potential to capture action spaces in complex, realistic visual settings. When used in a semi-supervised setting, our learned representations perform comparably to existing fully supervised methods on tasks such as action-conditioned video prediction and planning in the learned action space, while requiring orders of magnitude fewer action labels. |
| Researcher Affiliation | Collaboration | ¹University of Pennsylvania, ²University of Southern California, ³Ryerson University, ⁴Samsung AI Centre Toronto |
| Pseudocode | Yes | Algorithm 1: Planning in the learned action space (a hedged sketch of such a planning loop appears after this table) |
| Open Source Code | No | The paper provides a project website link (https://daniilidis-group.github.io/learned_action_spaces/), but the link is described only as hosting "Additional generated videos", and the paper makes no clear, affirmative statement that the source code for its method is released or available there. |
| Open Datasets | Yes | We conduct experiments on a simple simulated reacher dataset and the real-world Berkeley AI Research (BAIR) robot pushing dataset from Ebert et al. (2017). |
| Dataset Splits | No | The paper specifies training and test sets but does not explicitly define a validation set or its size for either dataset. |
| Hardware Specification | No | The paper states only that all experiments were conducted on a single high-end NVIDIA GPU; this description lacks a specific GPU model or any other hardware details. |
| Software Dependencies | No | The paper mentions specific software components such as the Adam optimizer and names the baseline models ("Oh et al. (2015) for the reacher dataset and the more complex Finn & Levine (2017) for the BAIR dataset"), but it does not provide version numbers for any software libraries, dependencies, or tools used in the experiments. |
| Experiment Setup | Yes | For all experiments, we condition our model on five images and roll out ten future images. We use images with a resolution of 64×64 pixels. The dimension of the image representation is dim(g(x)) = 128, and the dimensions of the learned representation are dim(z) = dim(ν) = 10. ... The MLP_infer has two hidden layers with 256 and 128 units, respectively. The MLP_comp, MLP_lat, and MLP_act networks each have two hidden layers with 32 units. ... The number of latent samples z used to produce a trajectory representation ν is C = 4. For all datasets, β_z = 10⁻², β_ν = 10⁻⁸. We use the leaky ReLU activation function in the g, f, and MLP networks. We optimize the objective function using the Adam optimizer with parameters β₁ = 0.9, β₂ = 0.999 and a learning rate of 2×10⁻⁴. ... The parameters used for our visual servoing experiments are listed in Tab. 5: servoing timesteps (T) = 5; servoing horizon (K) = 5; # servoing sequences (M) = 10; # refit sequences (M′) = 3; # refit iterations (N) = 4. (A hedged configuration sketch based on these values follows the planning sketch below.) |
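
The Pseudocode row above references Algorithm 1, planning in the learned action space. As a rough illustration of how such a planner could be wired up, here is a minimal CEM-style sketch in NumPy. The predictor `predict_final_frame`, the cost `cost_to_goal`, the toy dynamics, and the Gaussian sampling scheme are hypothetical stand-ins, not the authors' implementation; only the loop structure and the servoing hyperparameters (T, K, M, M′, N) follow the values reported in the table.

```python
import numpy as np

# Hypothetical stand-ins for the paper's learned components: a predictor that
# rolls a sequence of latent "actions" forward from the current observation,
# and a cost comparing the predicted outcome to a goal. Placeholders only.
def predict_final_frame(obs, z_seq):
    # Toy linear "dynamics": each latent action shifts the state. The real
    # model is a learned video predictor; this is only a placeholder.
    return obs + z_seq.sum(axis=0)

def cost_to_goal(pred, goal):
    return float(np.sum((pred - goal) ** 2))

def plan_latent_actions(obs, goal, dim_z=10, K=5, M=10, M_refit=3, N=4, rng=None):
    """CEM-style planning over a K-step sequence of latent actions.

    Uses the servoing hyperparameters from the table above (horizon K=5,
    M=10 sampled sequences, M'=3 refit sequences, N=4 refit iterations);
    the Gaussian sampling distribution is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    mean = np.zeros((K, dim_z))
    std = np.ones((K, dim_z))
    for _ in range(N):
        # Sample M candidate latent-action sequences around the current mean.
        candidates = mean + std * rng.standard_normal((M, K, dim_z))
        costs = np.array([cost_to_goal(predict_final_frame(obs, c), goal)
                          for c in candidates])
        # Keep the M' best sequences and refit the Gaussian to them.
        elites = candidates[np.argsort(costs)[:M_refit]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean  # planned latent-action sequence z_{1:K}

# Closed-loop servoing for T = 5 timesteps: replan, apply the first latent
# action (here just fed to the toy dynamics), observe, repeat.
obs, goal = np.zeros(10), np.ones(10)
for t in range(5):
    z_seq = plan_latent_actions(obs, goal)
    obs = predict_final_frame(obs, z_seq[:1])  # stand-in for acting in the env
print("final cost:", cost_to_goal(obs, goal))
```

In the semi-supervised setting described in the paper, the planned latent actions would additionally be decoded to true robot actions (e.g. via the MLP_act network); that step is omitted here.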
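
To make the reported hyperparameters concrete, the following PyTorch sketch instantiates MLPs with the stated hidden widths, leaky-ReLU activations, and the stated Adam settings. Only the hidden-layer widths (256/128 and 32/32), the representation dimensions (dim(g(x)) = 128, dim(z) = dim(ν) = 10), the activation, and the optimizer parameters (lr = 2×10⁻⁴, β₁ = 0.9, β₂ = 0.999) come from the table; the input/output dimensions, the role assigned to each network, and the action dimensionality are assumptions for illustration.

```python
import torch
import torch.nn as nn

def mlp(in_dim, hidden, out_dim):
    """Small MLP with leaky-ReLU activations and the reported hidden widths."""
    layers, prev = [], in_dim
    for h in hidden:
        layers += [nn.Linear(prev, h), nn.LeakyReLU()]
        prev = h
    layers.append(nn.Linear(prev, out_dim))
    return nn.Sequential(*layers)

DIM_G, DIM_Z, DIM_NU = 128, 10, 10  # image-representation and latent dims from the table

# Input/output dimensions below are assumptions (e.g. MLP_infer consuming a
# pair of image representations and outputting Gaussian parameters for z).
mlp_infer = mlp(2 * DIM_G, [256, 128], 2 * DIM_Z)   # mean and log-variance of z
mlp_comp  = mlp(DIM_NU + DIM_Z, [32, 32], DIM_NU)   # composes z samples into nu
mlp_lat   = mlp(DIM_NU, [32, 32], DIM_Z)
mlp_act   = mlp(DIM_Z, [32, 32], 4)                 # 4 = assumed true action dim

params = (list(mlp_infer.parameters()) + list(mlp_comp.parameters())
          + list(mlp_lat.parameters()) + list(mlp_act.parameters()))
optimizer = torch.optim.Adam(params, lr=2e-4, betas=(0.9, 0.999))
```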