Hierarchical Autoregressive Modeling for Neural Video Compression

Authors: Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive evaluations on large-scale video data show improved rate-distortion performance over both state-of-the-art neural and conventional video compression methods.
Researcher Affiliation | Academia | Department of Computer Science, UC Irvine; Computation & Neural Systems, California Institute of Technology
Pseudocode | Yes | Pseudocode is available in Appendix A.3: Algorithm 1, an efficient algorithm to build a scale-space 3D tensor (an illustrative sketch of this kind of construction follows the table).
Open Source Code | No | We release YouTube-NT in the form of customizable scripts to facilitate future compression research. (Footnote 1: https://github.com/privateyoung/Youtube-NT.) This link is for the dataset generation scripts, not the main model's source code.
Open Datasets | Yes | Vimeo-90k (Xue et al., 2019) consists of 90,000 clips... YouTube-NT is our new dataset: we collected 8,000 nature videos and movie/video-game trailers from youtube.com and processed them into 300k high-resolution (720p) clips, which we refer to as YouTube-NT. We release YouTube-NT in the form of customizable scripts to facilitate future compression research (a hypothetical example of such a processing step follows the table).
Dataset Splits | No | The paper does not explicitly mention a validation split (e.g., an 80/10/10 split or specific counts for a validation set).
Hardware Specification | Yes | Training time is about four days on an NVIDIA Titan RTX.
Software Dependencies | No | The paper mentions the 'Adam optimizer (Kingma & Ba, 2015)' and 'ffmpeg' commands but does not provide specific version numbers for software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages.
Experiment Setup | Yes | All models are trained on three consecutive frames with a batch size of 8; the frames are randomly selected from each clip and then randomly cropped to 256x256. Models are trained with an MSE loss, following a procedure similar to Agustsson et al. (2020) (see Appendix A.2 for details). The Adam optimizer (Kingma & Ba, 2015) is used for 1,050,000 steps; the initial learning rate of 1e-4 is decayed to 1e-5 after 900,000 steps, and the crop size is increased to 384x384 for the last 50,000 steps (a configuration sketch follows the table).
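
The pseudocode row refers to the paper's Algorithm 1 for building a scale-space 3D tensor; the exact procedure is given in Appendix A.3 of the paper. Below is a minimal, illustrative sketch of the general idea only (not the authors' algorithm), assuming PyTorch and a repeated small-Gaussian-blur construction; the kernel size, sigma, and number of scales are arbitrary choices.

```python
# Illustrative sketch (not the paper's Algorithm 1): build a scale-space 3D
# tensor by repeatedly blurring a frame with a small Gaussian and stacking the
# results along a new scale dimension. Kernel size, sigma, and num_scales are
# assumptions for demonstration.
import torch
import torch.nn.functional as F

def gaussian_kernel2d(kernel_size: int = 5, sigma: float = 1.0) -> torch.Tensor:
    """Return a normalized 2D Gaussian kernel of shape (1, 1, k, k)."""
    coords = torch.arange(kernel_size, dtype=torch.float32) - (kernel_size - 1) / 2
    g1d = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    g1d = g1d / g1d.sum()
    g2d = torch.outer(g1d, g1d)
    return g2d.view(1, 1, kernel_size, kernel_size)

def scale_space_volume(frame: torch.Tensor, num_scales: int = 5) -> torch.Tensor:
    """frame: (B, C, H, W) -> (B, C, num_scales, H, W).

    Each successive slice is a further-blurred copy of the previous one,
    exploiting the fact that composing Gaussian blurs yields a wider Gaussian.
    """
    b, c, h, w = frame.shape
    kernel = gaussian_kernel2d().to(frame.device).repeat(c, 1, 1, 1)
    slices = [frame]
    current = frame
    for _ in range(num_scales - 1):
        # Depthwise convolution blurs each channel independently.
        current = F.conv2d(current, kernel, padding=2, groups=c)
        slices.append(current)
    return torch.stack(slices, dim=2)

# Usage: a dummy 720p RGB frame.
volume = scale_space_volume(torch.rand(1, 3, 720, 1280))
print(volume.shape)  # torch.Size([1, 3, 5, 720, 1280])
```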
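
The open-datasets row states that YouTube-NT was built by downloading videos and processing them into 720p clips with customizable scripts. The released scripts define the actual pipeline; the snippet below is only a hypothetical illustration of one such processing step, calling ffmpeg through Python's subprocess module, with the paths, clip length, and filter settings assumed rather than taken from the paper.

```python
# Hypothetical sketch of a dataset-generation step: cut a downloaded video
# into a 720p clip with ffmpeg. Paths, clip duration, and encoder defaults
# are assumptions, not details from the YouTube-NT scripts.
import subprocess

def extract_clip(src: str, dst: str, start_s: float, dur_s: float) -> None:
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-ss", str(start_s),    # seek to the clip start
            "-i", src,
            "-t", str(dur_s),       # clip duration in seconds
            "-vf", "scale=-2:720",  # rescale to 720p, preserving aspect ratio
            "-an",                  # drop the audio stream
            dst,
        ],
        check=True,
    )

# Usage: cut a 4-second 720p clip starting at t=30s.
# extract_clip("raw/video_0001.mp4", "clips/video_0001_000.mp4", 30.0, 4.0)
```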
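
The experiment-setup row describes a concrete optimization schedule. The paper does not name a deep-learning framework (see the software-dependencies row), so the sketch below assumes PyTorch; the model and the data-loading function are placeholders, and only the batch size, step counts, learning-rate decay, and crop-size switch follow the reported setup.

```python
# Minimal sketch of the reported training schedule. PyTorch is assumed (the
# framework is not specified in the paper); `model` and `get_batch` are
# placeholders, and the loss shown is only the MSE reconstruction term.
import torch
import torch.nn.functional as F

TOTAL_STEPS = 1_050_000
LR_DECAY_STEP = 900_000       # lr: 1e-4 -> 1e-5 after this step
CROP_SWITCH_STEP = 1_000_000  # last 50,000 steps use 384x384 crops
BATCH_SIZE = 8

model = torch.nn.Conv2d(3, 3, 3, padding=1)  # placeholder for the video codec
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def get_batch(crop_size: int) -> torch.Tensor:
    """Placeholder: three consecutive frames per clip, randomly cropped."""
    return torch.rand(BATCH_SIZE, 3, 3, crop_size, crop_size)  # (B, T, C, H, W)

for step in range(TOTAL_STEPS):  # in practice this loop runs for days
    if step == LR_DECAY_STEP:
        for group in optimizer.param_groups:
            group["lr"] = 1e-5
    crop = 256 if step < CROP_SWITCH_STEP else 384
    frames = get_batch(crop)
    recon = model(frames[:, -1])  # placeholder forward pass on the last frame
    loss = F.mse_loss(recon, frames[:, -1])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```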