Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models
Authors: Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Rajiv Didolkar, Dipendra Misra, Dylan J Foster, Lekan P Molu, Rajan Chari, Akshay Krishnamurthy, John Langford
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the discovery of the control-endogenous latent state in three domains: localizing a robot arm with distractions (e.g., changing lighting conditions and background), exploring a maze alongside other agents, and navigating in the Matterport house simulator. |
| Researcher Affiliation | -1 | "Anonymous authors. Paper under double-blind review." The paper does not provide clear institutional affiliations or email domains for the authors, as it is under double-blind review. |
| Pseudocode | Yes | Algorithm 1 AC-State Algorithm for Latent State Discovery Using A Uniform Random Policy |
| Open Source Code | No | The text does not contain an explicit statement by the authors releasing their code, nor does it provide a direct link to a code repository for the methodology described in this paper. While a third-party library 'vector-quantize-pytorch' is referenced, this is not the authors' own implementation code. |
| Open Datasets | Yes | We evaluated AC-State on the matterport simulator introduced in Chang et al. (2017). |
| Dataset Splits | No | The paper mentions collecting "14,000 samples" for the robot arm, "3,000 training samples" for the maze, and a "20,000 sample dataset" for Matterport, but does not specify any explicit training, validation, or test splits for these datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as exact GPU/CPU models, processor types, or memory amounts. |
| Software Dependencies | No | The paper mentions various models and optimizers, such as MLP-Mixer (Tolstikhin et al., 2021), the Adam optimizer (Kingma & Ba, 2014), and VQ-VAE (van den Oord et al., 2017), but does not provide specific version numbers for software libraries or programming environments like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The model is trained end-to-end with the AC-State objective with maximum horizon of K = 5 for 20 epochs using the Adam optimizer with a learning rate of 1e-4. We use a 6-layer transformer with 256 dimensions in the embedding. We set the FFN dimension D to 512. We use 4 heads in the ViT. |
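The "Experiment Setup" row above pins down concrete hyperparameters: a 6-layer transformer encoder with 256-dimensional embeddings, FFN dimension 512, 4 attention heads, trained for 20 epochs with Adam at learning rate 1e-4 under the AC-State objective with maximum horizon K = 5. The sketch below instantiates that configuration in PyTorch. The action count, the patch-token input shape, the mean-pooled latent, and the concatenation-based multi-step inverse head are illustrative assumptions, not the authors' released code (the report notes no code was released).

```python
import torch
import torch.nn as nn

# Hyperparameters stated in the paper's experiment setup.
EMBED_DIM, FFN_DIM, N_LAYERS, N_HEADS = 256, 512, 6, 4
MAX_HORIZON_K = 5          # AC-State maximum horizon K = 5
N_ACTIONS = 4              # assumed; the action space size is not quoted here

# 6-layer transformer encoder, 256-dim embeddings, FFN dim 512, 4 heads.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=EMBED_DIM, nhead=N_HEADS, dim_feedforward=FFN_DIM,
    batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=N_LAYERS)

# Assumed multi-step inverse head: predict the first action taken between
# observations x_t and x_{t+k} (k <= K) from their pooled encodings.
inverse_head = nn.Linear(2 * EMBED_DIM, N_ACTIONS)

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(inverse_head.parameters()), lr=1e-4)

# One illustrative training step on random stand-in "patch token" inputs.
x_t = torch.randn(8, 16, EMBED_DIM)    # (batch, tokens, embed_dim)
x_tk = torch.randn(8, 16, EMBED_DIM)   # observation k steps later, k <= K
z_t = encoder(x_t).mean(dim=1)         # mean-pooled latent state (assumed)
z_tk = encoder(x_tk).mean(dim=1)
logits = inverse_head(torch.cat([z_t, z_tk], dim=-1))
loss = nn.functional.cross_entropy(
    logits, torch.randint(0, N_ACTIONS, (8,)))
loss.backward()
optimizer.step()
print(logits.shape)  # torch.Size([8, 4])
```

In practice this step would loop over the collected dataset (e.g., the 14,000 robot-arm samples mentioned in the "Dataset Splits" row) for the stated 20 epochs, sampling horizons k uniformly from 1..K.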