Smooth Imitation Learning for Online Sequence Prediction
Authors: Hoang Le, Andrew Kang, Yisong Yue, Peter Carr
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results demonstrate significant performance gains over previous approaches. 6. Experiments: Automated Camera Planning. We evaluate SIMILE in a case study of automated camera planning for sports broadcasting (Chen & Carr, 2015; Chen et al., 2016). Summary of Results: Using our smooth policy class leads to dramatically smoother trajectories than not regularizing using H. Using our adaptive learning rate leads to much faster convergence compared to conservative learning rates from SEARN (Daumé III et al., 2009). Using smooth feedback ensures stable learning of smooth policies at each iteration. Deterministic policy interpolation performs better than the stochastic interpolation used in SEARN. |
| Researcher Affiliation | Collaboration | Hoang M. Le HMLE@CALTECH.EDU Andrew Kang AKANG@CALTECH.EDU Yisong Yue YYUE@CALTECH.EDU California Institute of Technology, Pasadena, CA, USA Peter Carr PETER.CARR@DISNEYRESEARCH.COM Disney Research, Pittsburgh, PA, USA |
| Pseudocode | Yes | Algorithm 1 SIMILE (Smooth IMItation LEarning); a hedged sketch of the main loop follows this table. |
| Open Source Code | Yes | Access data at http://www.disneyresearch.com/publication/smooth-imitation-learning/ and code at http://github.com/hoangminhle/SIMILE. |
| Open Datasets | Yes | Access data at http://www.disneyresearch.com/publication/smooth-imitation-learning/ and code at http://github.com/hoangminhle/SIMILE. |
| Dataset Splits | No | The paper mentions generating a new dataset D_n = {(s_t, â_t)} for training and refers to 'training data', but does not specify explicit train/validation/test dataset splits, percentages, or sample counts for reproducibility. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as CPU or GPU models, memory, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions 'regression tree ensembles F' and references related work like 'Decision forests', but does not list specific software components with their version numbers (e.g., Python 3.x, TensorFlow x.x, PyTorch x.x) necessary for replication. |
| Experiment Setup | No | While the paper describes an 'adaptive learning rate', it does not provide concrete hyperparameter values such as initial learning rate, batch size, number of epochs, or optimizer settings, nor a dedicated section on experimental setup details. |
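For readers assessing the pseudocode claim above, the following is a minimal sketch of the SIMILE loop (Algorithm 1) as described in the quoted summary: roll out the current policy, form smoothed feedback targets, fit a new supervised learner, and deterministically interpolate it with the previous policy using an adaptive weight. It assumes a 1-D action (e.g., a camera pan angle), uses a least-squares regressor standing in for the paper's regression-tree ensembles, fixes the feedback weight `sigma`, and guesses a plausible form for the adaptive rate; it also omits the recurrent smoothing state and the smooth regularizer H. It is not the authors' implementation (that is at the GitHub link above).

```python
import numpy as np

def fit_regressor(X, y):
    """Least-squares stand-in for the paper's regression-tree ensemble."""
    w, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)
    return lambda S: np.c_[S, np.ones(len(S))] @ w

def simile(contexts, expert_actions, n_iters=10, sigma=0.5):
    """contexts: (T, d) per-frame features; expert_actions: (T,) targets."""
    policy = lambda S: np.zeros(len(S))          # initial (trivially smooth) policy
    for _ in range(n_iters):
        rollout = policy(contexts)               # roll out current policy
        # Smooth feedback: pull targets only part-way toward the expert,
        # so each supervised problem stays close to the current rollout.
        targets = sigma * expert_actions + (1.0 - sigma) * rollout
        new_policy = fit_regressor(contexts, targets)   # D_n = {(s_t, a_hat_t)}
        # Adaptive interpolation weight: trust the new learner in proportion
        # to its error reduction (an assumed form of the paper's adaptive rate).
        err_old = np.mean((rollout - expert_actions) ** 2)
        err_new = np.mean((new_policy(contexts) - expert_actions) ** 2)
        beta = err_old / (err_old + err_new + 1e-12)
        old_policy = policy
        # Deterministic interpolation of policies (not SEARN's stochastic mixing).
        policy = lambda S, b=beta, p_new=new_policy, p_old=old_policy: (
            b * p_new(S) + (1.0 - b) * p_old(S))
    return policy

# Toy usage: imitate a noisy sinusoidal pan-angle trace.
T = np.linspace(0, 10, 500)
X = np.c_[np.sin(T), np.cos(T)]
y = np.sin(T) + 0.1 * np.random.randn(500)
pi = simile(X, y)
print("final MSE:", np.mean((pi(X) - y) ** 2))
```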