MambaSCI: Efficient Mamba-UNet for Quad-Bayer Patterned Video Snapshot Compressive Imaging
Authors: Zhenghao Pan, Haijin Zeng, Jiezhang Cao, Yongyong Chen, Kai Zhang, Yong Xu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that MambaSCI surpasses state-of-the-art methods with lower computational and memory costs. In this section, we evaluate MambaSCI against SOTA video reconstruction methods on multiple simulation datasets using PSNR and SSIM metrics, and visual comparisons (a minimal PSNR sketch follows the table). |
| Researcher Affiliation | Academia | Harbin Institute of Technology (Shenzhen); Ghent University; Harvard University; Nanjing University |
| Pseudocode | Yes | PyTorch-style pseudo-code for the core modules is provided in the supplementary materials. In Algorithm 1, we show the PyTorch-style pseudo-code for constructing a Residual-Mamba-Block (a hedged re-creation follows the table). |
| Open Source Code | Yes | Code is at https://github.com/PAN083/MambaSCI. |
| Open Datasets | Yes | Following STFormer and EfficientSCI, we use DAVIS2017 [52] with resolution 480×894 (480p) as the model training dataset. |
| Dataset Splits | No | The paper describes training epochs and learning rates but does not explicitly mention train/validation/test dataset splits or provide details on a validation set. |
| Hardware Specification | Yes | We use the PyTorch framework, training on 4 NVIDIA RTX 4090 GPUs, and use random flipping, scaling, and cropping on DAVIS2017 for data augmentation. |
| Software Dependencies | No | The paper mentions using the 'PyTorch framework' but does not specify its version number or versions of any other key software libraries or dependencies. |
| Experiment Setup | Yes | We use random flipping, scaling, and cropping on DAVIS2017 for data augmentation. We use randomly generated masks as training input to enhance model robustness and optimize the model using the Adam [53] optimizer. Since MambaSCI is flexible in input size, we first train for 100 epochs at a learning rate of 0.0005 on data with a spatial size of 128×128. Then, we train for 50 epochs at a learning rate of 0.0001, followed by fine-tuning on 256×256 data at a learning rate of 0.00001 for an additional 50 epochs (a schedule sketch follows the table). |
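
The Pseudocode row references a PyTorch-style Algorithm 1 for the Residual-Mamba-Block, which appears in the paper's supplementary material. The sketch below is not the authors' code: it is a minimal pre-norm residual wrapper around a Mamba layer, assuming the `mamba_ssm` package, with the class name `ResidualMambaBlock` and all hyperparameter defaults chosen for illustration.

```python
# A minimal sketch, assuming the `mamba_ssm` package (pip install mamba-ssm);
# the paper's actual Residual-Mamba-Block may differ in normalization and
# branch structure. Note that the Mamba selective-scan kernel requires CUDA.
import torch
import torch.nn as nn
from mamba_ssm import Mamba


class ResidualMambaBlock(nn.Module):
    """Pre-norm Mamba layer with a residual (skip) connection."""

    def __init__(self, dim: int, d_state: int = 16, d_conv: int = 4, expand: int = 2):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mamba = Mamba(d_model=dim, d_state=d_state, d_conv=d_conv, expand=expand)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence_length, dim); spatial-temporal tokens are
        # assumed to be flattened into a sequence upstream of this block.
        return x + self.mamba(self.norm(x))
```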
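The Experiment Setup row describes a three-stage training schedule. Below is a minimal sketch of that schedule; only the optimizer choice, epoch counts, learning rates, and crop sizes come from the quoted text, while the model (`nn.Conv3d` as a stand-in for the full MambaSCI network), data loading, and loss are placeholders.

```python
# A hedged sketch of the quoted three-stage schedule: Adam, 100 epochs at
# lr 5e-4 on 128x128 crops, 50 epochs at 1e-4, then 50 fine-tuning epochs
# at 1e-5 on 256x256 crops.
import torch
import torch.nn as nn

model = nn.Conv3d(8, 8, kernel_size=3, padding=1)  # placeholder network

# (epochs, learning rate, square crop size), as quoted from the paper.
stages = [(100, 5e-4, 128), (50, 1e-4, 128), (50, 1e-5, 256)]

for epochs, lr, crop in stages:
    # A fresh Adam optimizer per stage is one plausible reading of the text.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        # Iterate DAVIS2017 crops of size (crop, crop) with randomly
        # generated masks; the loader and loss are omitted here.
        ...
```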
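The Research Type row quotes an evaluation by PSNR and SSIM. For reference, a minimal PSNR implementation is sketched below, assuming arrays scaled to a known peak value; SSIM is commonly computed with `skimage.metrics.structural_similarity`, though the paper does not state which implementation it used.

```python
# A minimal PSNR sketch; `peak` is the maximum possible pixel value
# (1.0 for [0, 1]-normalized images, 255.0 for 8-bit images).
import numpy as np


def psnr(reference: np.ndarray, reconstruction: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shaped arrays."""
    mse = np.mean((reference.astype(np.float64) - reconstruction.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```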