Learning from Trajectories via Subgoal Discovery
Authors: Sujoy Paul, Jeroen Vanbaar, Amit Roy-Chowdhury
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on three goal-oriented tasks on MuJoCo [15] with sparse terminal-only reward, which state-of-the-art RL, IL or their combinations are not able to solve. |
| Researcher Affiliation | Collaboration | 1 University of California, Riverside; 2 Mitsubishi Electric Research Laboratories (MERL) |
| Pseudocode | Yes | Algorithm 1 Learning Sub-Goal Prediction |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We perform experiments on three challenging environments as shown in Fig. 2. First is Ball-in-Maze Game (BiMGame) introduced in [43]... The second environment is AntTarget which involves the Ant [44]... The third environment, AntMaze, uses the same Ant, but in a U-shaped maze used in [35]. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | Details about the network architectures we use for πθ, πφ, and fψ(s) can be found in the supplementary material. |