Rethinking Visual Reconstruction: Experience-Based Content Completion Guided by Visual Cues
Authors: Jiaxuan Chen, Yu Qi, Gang Pan
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments were carried out on a benchmark dataset and compared against existing approaches. |
| Researcher Affiliation | Academia | (1) State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou, China; (2) College of Computer Science and Technology, Zhejiang University, Hangzhou, China; (3) MOE Frontier Science Center for Brain Science and Brain-Machine Integration, Zhejiang University, Hangzhou, China. |
| Pseudocode | No | The paper describes the proposed framework and its components using text and equations (e.g., Eq. 1-13) but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We experimented with a popular publicly available fMRI dataset, which is called Generic Object Decoding (GOD) dataset (Horikawa & Kamitani, 2017). |
| Dataset Splits | No | The paper states: 'For each subject, training set consists 1200 fMRI-image pairs, and the testing made up of 50 fMRI recordings with corresponding images.' It does not explicitly define a separate validation dataset split. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions 'Adam solver (Kingma & Ba, 2014)' but does not specify version numbers for any software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | The parameter settings of VQ-fMRI for all experiments are summarized as follows. Encoders of VQ-VAE: 2 convolutional layers (stride 2, kernel 4×4, padding 1), followed by two residual blocks; Decoders of VQ-VAE: two residual blocks, followed by 3 transposed convolutions (stride 2, kernel 4×4, padding 1); Codebooks: Z_L ∈ R^{8×32} (image y ∈ R^{64×64×3}) and Z ∈ R^{8×128} (image y ∈ R^{128×128×3}). The image classifier, inpainting, and SR modules are implemented using a UNet with 2 downsampling and 2 upsampling layers (stride 2, kernel 4×4, padding 1). The Adam solver (Kingma & Ba, 2014) is employed to optimize the parameters with a learning rate of 2e-4. |
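
To make the quoted setup concrete, below is a minimal PyTorch sketch of the VQ-VAE encoder/decoder and optimizer as described in that row. Only the layer counts, strides, kernel sizes, padding, and learning rate come from the paper's stated configuration; the channel width, the residual-block internals, and the names `ResBlock`, `Encoder`, and `Decoder` are assumptions for illustration, and the vector-quantization (codebook lookup) step itself is omitted.

```python
# Minimal sketch of the VQ-fMRI VQ-VAE backbone per the quoted setup.
# Assumptions: channel width (128) and residual-block design are NOT from
# the paper; only layer counts, strides, kernels, padding, and the Adam
# learning rate (2e-4) follow the stated configuration.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block; the internal layout here is an assumption."""
    def __init__(self, ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(ch, ch, kernel_size=1),
        )

    def forward(self, x):
        return x + self.body(x)

class Encoder(nn.Module):
    """2 convolutions (stride 2, kernel 4x4, padding 1), then two residual blocks."""
    def __init__(self, in_ch: int = 3, ch: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(ch, ch, kernel_size=4, stride=2, padding=1),
            ResBlock(ch),
            ResBlock(ch),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Two residual blocks, then 3 transposed convolutions (stride 2, kernel 4x4,
    padding 1). Note the asymmetry (2 downsamplings vs. 3 upsamplings) follows the
    quoted description, possibly reflecting the two image scales (64x64, 128x128)."""
    def __init__(self, out_ch: int = 3, ch: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            ResBlock(ch),
            ResBlock(ch),
            nn.ConvTranspose2d(ch, ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(ch, ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(ch, out_ch, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, z):
        return self.net(z)

# Adam with a learning rate of 2e-4, as stated in the paper.
enc, dec = Encoder(), Decoder()
optimizer = torch.optim.Adam(
    list(enc.parameters()) + list(dec.parameters()), lr=2e-4
)

# Quick shape check: a 64x64 input yields a 16x16 latent map after the
# two stride-2 convolutions.
z = enc(torch.randn(1, 3, 64, 64))
print(z.shape)  # torch.Size([1, 128, 16, 16])
```

The UNet used for the classifier, inpainting, and SR modules would follow the same convention (2 downsampling and 2 upsampling layers with stride 2, kernel 4×4, padding 1); it is omitted here since the paper does not specify its skip-connection or channel details.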