FunkNN: Neural Interpolation for Functional Generation
Authors: AmirEhsan Khorashadizadeh, Anadi Chaman, Valentin Debarnot, Ivan Dokmanić
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally show that FunkNN reliably learns to resolve images to resolutions much higher than those seen during training. In fact, it performs comparably to state-of-the-art continuous super-resolution networks (Chen et al., 2021) despite having only a fraction of the latter's trainable parameters. Unlike traditional learning-based methods, our approach can also super-resolve images that belong to image distributions different from those seen while training. This is a benefit of patch-based processing which reduces the chance of overfitting on global image features. In addition, we show that our overall generative model framework can produce high quality image samples at any resolution. With the continuous, differentiable map between spatial coordinates and image intensities, we can access spatial image derivatives at arbitrary coordinates and use them to solve inverse problems. |
| Researcher Affiliation | Academia | Department of Mathematics and Computer Science, University of Basel Coordinated Science Laboratory, University of Illinois at Urbana-Champaign |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our implementation is available at https://github.com/swing-research/FunkNN. |
| Open Datasets | Yes | We work on the CelebA-HQ (Karras et al., 2017) dataset with resolution 128×128, and we train the model using three training strategies explained in section 3.1.1, single, continuous and factor. We train the AE model over 40000 images of the LoDoPaB-CT (Leuschner et al., 2021) and 30000 images of the CelebA-HQ (Karras et al., 2017) datasets in resolution 128×128. |
| Dataset Splits | No | The paper mentions training on a certain number of images and using 'test samples' but does not specify the explicit percentages or sample counts for training, validation, or test splits needed for reproducibility. For example, it states '40000 images of the LoDoPaB-CT' for training but no split percentages. |
| Hardware Specification | Yes | It is worth mentioning that on the same Nvidia A100 GPU, the average time to process 100 images of size 128×128 is 1.97 seconds for LIIF, 0.81 seconds for LIIF with 140k parameters and 0.7 seconds for FunkNN. |
| Software Dependencies | No | The paper mentions PyTorch and TensorFlow as frameworks, and Adam as an optimizer, but it does not specify version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | The low-resolution images are composed of d×d = 128×128 pixels and the high-resolution target images are of size n×n = 256×256. We use z = 0 as the initialization as suggested by (Whang et al., 2021) and optimize for 2500 iterations using the Adam optimizer over Problem (3) with λ = 0 and on Problem (4) with λ = 0 and λ₂ = 10⁻². Then, we run 1000 iterations of stochastic gradient descent to optimize the weights of the auto-encoder of the generative model. All the models are trained using the Adam optimizer (Kingma & Ba, 2014) with learning rate 10⁻⁴. |
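The latent optimization quoted above (z initialized at zero, 2500 Adam iterations, learning rate 10⁻⁴) can be sketched in plain NumPy. This is a minimal illustration only: the quadratic loss, forward operator `A`, and observation `y` below are toy stand-ins, not the paper's actual Problem (3), which involves the trained FunkNN generator.

```python
import numpy as np

def adam_optimize(grad_fn, z0, lr=1e-4, iters=2500,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Plain-NumPy Adam loop mirroring the reported setup:
    z = 0 initialization, 2500 iterations, learning rate 1e-4."""
    z = z0.copy()
    m = np.zeros_like(z)  # first-moment (mean of gradients) estimate
    v = np.zeros_like(z)  # second-moment (uncentered variance) estimate
    for t in range(1, iters + 1):
        g = grad_fn(z)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)  # bias correction
        v_hat = v / (1 - beta2**t)
        z -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return z

# Toy stand-in for the data-fidelity term with λ = 0: minimize ||Az - y||^2.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
y = np.array([1.0, 1.0])
grad = lambda z: 2.0 * A.T @ (A @ z - y)

z_star = adam_optimize(grad, np.zeros(2))  # z = 0 init, as in the paper
```

Note that with this small learning rate the per-step displacement is roughly bounded by `lr`, which is why the paper pairs it with thousands of iterations; the residual decreases steadily rather than in a few large jumps.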