Smooth Imitation Learning for Online Sequence Prediction

Authors: Hoang Le, Andrew Kang, Yisong Yue, Peter Carr

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical results demonstrate significant performance gains over previous approaches. (Section 6, Experiments: Automated Camera Planning.) We evaluate SIMILE in a case study of automated camera planning for sport broadcasting (Chen & Carr, 2015; Chen et al., 2016). Summary of results: using our smooth policy class leads to dramatically smoother trajectories than not regularizing using H; using our adaptive learning rate leads to much faster convergence compared to the conservative learning rates from SEARN (Daumé III et al., 2009); using smooth feedback ensures stable learning of smooth policies at each iteration; and deterministic policy interpolation performs better than the stochastic interpolation used in SEARN (a sketch contrasting the two interpolation schemes appears after the table).
Researcher Affiliation | Collaboration | Hoang M. Le (HMLE@CALTECH.EDU), Andrew Kang (AKANG@CALTECH.EDU), Yisong Yue (YYUE@CALTECH.EDU), California Institute of Technology, Pasadena, CA, USA; Peter Carr (PETER.CARR@DISNEYRESEARCH.COM), Disney Research, Pittsburgh, PA, USA.
Pseudocode | Yes | Algorithm 1: SIMILE (Smooth IMItation LEarning). (A hedged reconstruction of this loop appears after the table.)
Open Source Code | Yes | Access data at http://www.disneyresearch.com/publication/smooth-imitation-learning/ and code at http://github.com/hoangminhle/SIMILE.
Open Datasets | Yes | Access data at http://www.disneyresearch.com/publication/smooth-imitation-learning/ and code at http://github.com/hoangminhle/SIMILE.
Dataset Splits | No | The paper mentions training on a newly generated dataset D_n = {(s_t, â_t)} and refers to 'training data', but does not specify explicit train/validation/test dataset splits, percentages, or sample counts for reproducibility.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run experiments, such as CPU or GPU models, memory, or cloud instance specifications.
Software Dependencies | No | The paper mentions 'regression tree ensembles F' and references related work such as decision forests, but does not list specific software components with version numbers (e.g., Python 3.x, TensorFlow x.x, PyTorch x.x) necessary for replication.
Experiment Setup | No | While the paper describes an adaptive learning rate, it does not provide concrete hyperparameter values such as the initial learning rate, batch size, number of epochs, or optimizer settings, nor a dedicated section detailing the experimental setup.
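
The results row above credits SIMILE's deterministic policy interpolation over the stochastic interpolation used in SEARN. Below is a minimal sketch of the distinction, assuming policies are functions from a state vector to a continuous action; the names old_policy, new_policy, and beta are illustrative and not taken from the paper's code.

```python
import numpy as np

def deterministic_interpolation(old_policy, new_policy, beta):
    """SIMILE-style update (as described in the paper's summary): the
    interpolated policy's *action* is the convex combination
    beta * new + (1 - beta) * old, so rollouts stay deterministic."""
    def policy(state):
        return beta * new_policy(state) + (1 - beta) * old_policy(state)
    return policy

def stochastic_interpolation(old_policy, new_policy, beta, rng=np.random):
    """SEARN-style mixing: each step follows the new policy with
    probability beta and the old one otherwise, randomizing rollouts."""
    def policy(state):
        return (new_policy if rng.random() < beta else old_policy)(state)
    return policy
```

Because the deterministic variant averages actions rather than randomly switching between policies, consecutive predictions cannot jump between the two policies' outputs, which is consistent with the smoother trajectories reported in the paper.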
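The pseudocode row names Algorithm 1 but the table does not reproduce it. The following is a hedged reconstruction of the outer loop under simplifying assumptions: one-dimensional actions, states formed by appending the previous predicted action to the context features, scikit-learn gradient-boosted trees standing in for the paper's regression tree ensembles F, and a simple relative-error rule for the adaptive learning rate beta. The authors' actual implementation is at http://github.com/hoangminhle/SIMILE and may differ in all of these choices.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor  # stand-in for tree ensembles F

def rollout(policy, X):
    """Sequentially predict actions, feeding the previous action back into the state."""
    actions, prev = [], 0.0  # assumption: the initial previous action is 0.0
    for x in X:
        prev = policy(np.append(x, prev))  # state = (context features, previous action)
        actions.append(prev)
    return np.asarray(actions)

def simile(X, A_star, N=10, sigma=0.5):
    """Hedged sketch of SIMILE: X is a (T, d) context array, A_star the (T,) expert actions."""
    prev_expert = np.r_[A_star[0], A_star[:-1]][:, None]
    model = GradientBoostingRegressor().fit(np.hstack([X, prev_expert]), A_star)
    policy = lambda s, m=model: float(m.predict(s[None, :])[0])

    for _ in range(N):
        A = rollout(policy, X)                      # roll out the current policy
        targets = sigma * A_star + (1 - sigma) * A  # smooth feedback: pull targets toward the expert
        prev_pred = np.r_[A[0], A[:-1]][:, None]
        new_model = GradientBoostingRegressor().fit(np.hstack([X, prev_pred]), targets)
        new_policy = lambda s, m=new_model: float(m.predict(s[None, :])[0])

        # Adaptive learning rate (assumed form): weight each policy by relative error.
        err_old = np.mean((A - A_star) ** 2)
        err_new = np.mean((rollout(new_policy, X) - A_star) ** 2)
        beta = err_old / (err_old + err_new + 1e-12)

        # Deterministic interpolation of the two policies' actions.
        policy = lambda s, p=policy, q=new_policy, b=beta: b * q(s) + (1 - b) * p(s)
    return policy
```

This sketch is only meant to show how smooth feedback, the adaptive beta, and deterministic interpolation fit together in one training loop; it is not the authors' implementation.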