Long Horizon Temperature Scaling
Authors: Andy Shih, Dorsa Sadigh, Stefano Ermon
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment with LHTS on image diffusion models and character/language autoregressive models, demonstrating advantages over myopic temperature scaling in likelihood and sample quality, and showing a 10% improvement in accuracy on a multiple-choice analogy task. |
| Researcher Affiliation | Academia | Department of Computer Science, Stanford University. |
| Pseudocode | Yes | Algorithm 1: LHTS Finetuning |
| Open Source Code | Yes | Our code is available at https://github.com/AndyShih12/LongHorizonTemperatureScaling. |
| Open Datasets | Yes | CIFAR-10 (Krizhevsky et al., 2009), the Text8 dataset (Mahoney, 2011), and finetuning on the OpenWebText corpus (Gokaslan & Cohen, 2019). |
| Dataset Splits | No | The information is not sufficient. The paper mentions using specific datasets but does not provide explicit details about training, validation, and test splits (e.g., percentages, sample counts, or explicit references to predefined splits). |
| Hardware Specification | No | The information is not sufficient. The paper details model architectures (e.g., DDPM, Transformer, GPT-2) and training parameters, but it does not specify the hardware used for running experiments (e.g., specific GPU or CPU models, memory, or cloud instances). |
| Software Dependencies | No | The information is not sufficient. The paper does not provide specific version numbers for software dependencies or libraries used (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | A. Experimental Settings: Diffusion Model: Learning Rate: 2e-4, Batch Size: 128, EMA decay: 0.9999, Grad Clip: 1, Steps: 50000, Warmup Steps: 5000, LHTS Clip: 0.5. Character Model: Learning Rate: 5e-4, Batch Size: 512, Weight Decay: 0.001, Grad Clip: 0.25, Epochs: 200, LHTS Clip: 3, LHTS Suffix Horizon: 25. Language Model: Learning Rate: 1e-4, Batch Size: 512, Weight Decay: 0.01, Grad Clip: 0.25, Steps: 1000, LHTS KL beta: 0.05, LHTS Clip: 3, LHTS Suffix Horizon: 8. |
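The Pseudocode and Experiment Setup rows above reference Algorithm 1 (LHTS Finetuning) and its hyperparameters (temperature, LHTS Clip, LHTS Suffix Horizon) without reproducing the algorithm itself. Below is a minimal PyTorch sketch of one plausible reading of long-horizon temperature scaling finetuning: sequences sampled from the frozen base model are reweighted by clipped importance weights proportional to p(x)^(1/τ − 1), so that a weighted maximum-likelihood update pushes the finetuned model toward the temperature-scaled joint distribution p(x)^(1/τ). The function names (`lhts_weights`, `lhts_loss`), the exact clipping and normalization scheme, and the toy tensors are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of LHTS-style finetuning (assumptions, not the paper's Algorithm 1).
import torch
import torch.nn.functional as F


def lhts_weights(base_logprobs: torch.Tensor, tau: float, clip: float) -> torch.Tensor:
    """Per-sequence importance weights for targeting p_base(x)^(1/tau).

    base_logprobs: (batch,) total log-likelihood of each sampled sequence
                   under the frozen base model.
    """
    log_w = (1.0 / tau - 1.0) * base_logprobs
    log_w = log_w - log_w.mean()           # self-normalize so weights are O(1)
    w = log_w.exp().clamp(max=clip)        # cap large weights (assumed role of "LHTS Clip")
    return w / w.sum()                     # renormalize over the batch


def lhts_loss(finetune_logits: torch.Tensor, targets: torch.Tensor,
              base_logprobs: torch.Tensor, tau: float = 0.8, clip: float = 3.0) -> torch.Tensor:
    """Weighted negative log-likelihood of base-model samples under the finetuned model.

    finetune_logits: (batch, seq_len, vocab) logits from the model being finetuned.
    targets:         (batch, seq_len) token ids sampled from the frozen base model.
    """
    # Per-sequence NLL under the finetuned model.
    nll = F.cross_entropy(finetune_logits.transpose(1, 2), targets, reduction="none").sum(dim=1)
    w = lhts_weights(base_logprobs, tau, clip)
    return (w * nll).sum()


# Toy usage with random tensors standing in for real samples and logits.
if __name__ == "__main__":
    B, L, V = 4, 16, 50
    targets = torch.randint(0, V, (B, L))
    finetune_logits = torch.randn(B, L, V, requires_grad=True)
    base_logprobs = -torch.rand(B) * L      # fake sequence log-likelihoods from the base model
    loss = lhts_loss(finetune_logits, targets, base_logprobs, tau=0.8, clip=3.0)
    loss.backward()
    print(float(loss))
```

In a full training loop this loss would replace the standard maximum-likelihood objective, with samples regenerated from the frozen base model each step; how often to resample, and whether weights are applied per sequence or per suffix window (the "LHTS Suffix Horizon" hyperparameter), are details this sketch does not attempt to settle.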