On the Choice of Perception Loss Function for Learned Video Compression
Authors: Sadaf Salehkalaibar, Truong Buu Phan, Jun Chen, Wei Yu, Ashish Khisti
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using information theoretic analysis and deep-learning based experiments, we demonstrate that the choice of PLF can have a significant effect on the reconstruction, especially at low-bit rates. ... We validate our results using (one-shot) information-theoretic analysis, detailed study of the rate-distortion-perception tradeoff of the Gauss-Markov source model as well as deep-learning based experiments on moving MNIST and KTH datasets. |
| Researcher Affiliation | Academia | Sadaf Salehkalaibar, ECE Department, University of Toronto (sadafs@ece.utoronto.ca); Buu Phan*, ECE Department, University of Toronto (truong.phan@mail.utoronto.ca); Jun Chen, ECE Department, McMaster University (chenjun@mcmaster.ca); Wei Yu, ECE Department, University of Toronto (weiyu@ece.utoronto.ca); Ashish Khisti, ECE Department, University of Toronto (akhisti@ece.utoronto.ca) |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | No code release accompanies the paper; it states only: "Code will be available at https://github.com/truongbuu/URDP_flow." |
| Open Datasets | Yes | We validate our results using ... deep-learning based experiments on moving MNIST and KTH datasets. ... Moving MNIST dataset [29] (with 1 digit) using Wasserstein GAN [30] ... Additional results on the KTH dataset [31] are available in Appendix J.3. |
| Dataset Splits | No | The paper mentions that the "training set contains 60000 images" but does not provide specific train/validation/test splits or a clear splitting methodology. |
| Hardware Specification | Yes | Training takes 2 days per model on a single NVIDIA P100 GPU. |
| Software Dependencies | No | The paper mentions software components such as "Wasserstein GAN", "scale-space flow model", "conditional module", and the "WGAN-GP framework", but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We use a batch size of 64, RMSProp optimizer with a learning rate of 5×10⁻⁵, and train each model for 360 epochs, where the training set contains 60000 images. ... Under the WGAN-GP framework [30], we use a gradient penalty of 10 and update the encoders/decoders every 5 iterations. The parameters λ controlling the tradeoff are given in Table 7. |
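
The Experiment Setup row describes a standard WGAN-GP training recipe. As a concrete illustration, below is a minimal PyTorch sketch wiring the reported hyperparameters (batch size 64, RMSProp at 5×10⁻⁵, gradient penalty 10, encoder/decoder updated once per 5 critic iterations, interpreting "update the encoders/decoders every 5 iterations" in the usual WGAN-GP sense) into a training loop. The `Codec` and `Critic` modules and the random input frames are hypothetical placeholders, not the paper's scale-space-flow compression model, and the rate/distortion terms weighted by λ are omitted.

```python
import torch
from torch import nn

# Hyperparameters quoted from the paper's experiment setup; everything
# else in this sketch (models, data) is a stand-in for illustration.
BATCH_SIZE, LR, EPOCHS = 64, 5e-5, 360
GP_WEIGHT, N_CRITIC = 10.0, 5  # gradient penalty 10; codec updated every 5 critic steps

class Codec(nn.Module):
    """Placeholder encoder/decoder; the paper uses a scale-space-flow model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 1, 3, padding=1))
    def forward(self, x):
        return self.net(x)

class Critic(nn.Module):
    """Placeholder WGAN critic scoring reconstructions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
    def forward(self, x):
        return self.net(x)

def gradient_penalty(critic, real, fake):
    """WGAN-GP term: penalize deviation of the critic's gradient norm
    from 1 on random interpolates between real and fake samples."""
    alpha = torch.rand(real.size(0), 1, 1, 1)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(interp).sum(), interp,
                                create_graph=True)[0]
    return ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

codec, critic = Codec(), Critic()
opt_g = torch.optim.RMSprop(codec.parameters(), lr=LR)
opt_d = torch.optim.RMSprop(critic.parameters(), lr=LR)

for step in range(1, 11):                     # short demo loop, not 360 epochs
    real = torch.rand(BATCH_SIZE, 1, 64, 64)  # stand-in for Moving MNIST frames
    fake = codec(real)

    # Critic step: widen the real/fake score gap, regularized by the penalty.
    opt_d.zero_grad()
    d_loss = (critic(fake.detach()).mean() - critic(real).mean()
              + GP_WEIGHT * gradient_penalty(critic, real, fake.detach()))
    d_loss.backward()
    opt_d.step()

    # Encoder/decoder step only once every N_CRITIC iterations.
    if step % N_CRITIC == 0:
        opt_g.zero_grad()
        g_loss = -critic(codec(real)).mean()  # paper adds λ-weighted rate/distortion terms
        g_loss.backward()
        opt_g.step()
```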