End-to-End RGB-D Image Compression via Exploiting Channel-Modality Redundancy
Authors: Huiming Zheng, Wei Gao
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results demonstrate our method outperforms existing image compression methods on two RGB-D image datasets. |
| Researcher Affiliation | Academia | School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, Shenzhen, China; Peng Cheng Laboratory, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions 'CAAI-MindSpore Open Fund, developed on OpenI Community', but does not explicitly state that the source code for their proposed method is released, nor does it provide a direct link to it. |
| Open Datasets | Yes | The SUN-RGBD dataset (Song, Lichtenberg, and Xiao 2015) is a widely used computer vision research dataset for indoor scene understanding and depth perception tasks. The NYU-Depth V2 dataset (Chodosh, Wang, and Lucey 2019) comprises video sequences capturing diverse indoor scenes recorded by the RGB and depth cameras of Microsoft Kinect. |
| Dataset Splits | Yes | [SUN-RGBD] For training, 8,000 image pairs were randomly selected, while 1,000 image pairs were chosen for validation, and an additional 1,000 image pairs were reserved for testing. [NYU-Depth V2] We divide the entire dataset into three parts: 1,159 image pairs for training, 145 image pairs for validation, and 145 image pairs for testing. |
| Hardware Specification | Yes | The training stage takes about ten days on a Tesla V100. |
| Software Dependencies | No | The paper mentions a 'CUDA-enabled PyTorch implementation' but does not provide specific version numbers for PyTorch, CUDA, or other software dependencies. |
| Experiment Setup | Yes | We set different values for the hyperparameter λ to control the bit rate. The Adam optimizer (Kingma and Ba 2014) is adopted in the training process. We initialize the learning rate to 1e-4. It gradually decreases as the model is updated during training and eventually falls to 1e-5. The batch size is set to 4. We train each model for about 1,000 epochs. The input training data is cropped to 256×256, convenient for model inference. (A hedged sketch of this setup follows the table.) |
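
Since the paper does not release code, the reported setup can only be approximated. Below is a minimal PyTorch sketch of the training configuration quoted above (Adam, learning rate decayed from 1e-4 to 1e-5, batch size 4, roughly 1,000 epochs, 256×256 inputs, and a rate-distortion weight λ). The `RandomRGBDPairs` dataset, `ToyCodec` model, `rate_distortion_loss` form, and the exponential decay schedule are placeholder assumptions, not the authors' implementation.

```python
# Hedged sketch of the reported training configuration; model, data, and
# loss are placeholders standing in for the authors' unreleased code.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, Dataset


class RandomRGBDPairs(Dataset):
    """Placeholder dataset yielding random 256x256 RGB-D tensors (3 RGB + 1 depth channel).
    The paper's random 8,000/1,000/1,000 split would be drawn from real image pairs instead."""

    def __init__(self, length=8000, crop=256):
        self.length, self.crop = length, crop

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        return torch.rand(4, self.crop, self.crop)


class ToyCodec(nn.Module):
    """Tiny stand-in codec: a conv autoencoder that also reports a placeholder bpp term."""

    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(4, 8, kernel_size=3, stride=2, padding=1)
        self.dec = nn.ConvTranspose2d(8, 4, kernel_size=4, stride=2, padding=1)

    def forward(self, x):
        y = self.enc(x)
        x_hat = self.dec(y)
        bpp = y.abs().mean()  # placeholder "rate" estimate, not a real entropy model
        return x_hat, bpp


def rate_distortion_loss(x_hat, x, bpp, lam):
    """Common rate-distortion objective L = bpp + lambda * MSE (exact form is assumed)."""
    return bpp + lam * nn.functional.mse_loss(x_hat, x)


def train(model, lam=0.01, epochs=1000, batch_size=4):
    loader = DataLoader(RandomRGBDPairs(), batch_size=batch_size, shuffle=True)
    optimizer = optim.Adam(model.parameters(), lr=1e-4)
    # Decay the learning rate from 1e-4 toward 1e-5 over training; the paper does
    # not specify the schedule, so exponential decay is assumed here.
    gamma = (1e-5 / 1e-4) ** (1.0 / epochs)
    scheduler = optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

    for epoch in range(epochs):
        for x in loader:
            x_hat, bpp = model(x)
            loss = rate_distortion_loss(x_hat, x, bpp, lam)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
```

A quick smoke test would be `train(ToyCodec(), lam=0.01, epochs=1)`; the specific λ values used to reach different bit rates are not given in the quoted text.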