FunkNN: Neural Interpolation for Functional Generation

Authors: AmirEhsan Khorashadizadeh, Anadi Chaman, Valentin Debarnot, Ivan Dokmanić

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally show that FunkNN reliably learns to resolve images to resolutions much higher than those seen during training. In fact, it performs comparably to state-of-the-art continuous super-resolution networks (Chen et al., 2021) despite having only a fraction of the latter's trainable parameters. Unlike traditional learning-based methods, our approach can also super-resolve images that belong to image distributions different from those seen while training. This is a benefit of patch-based processing, which reduces the chance of overfitting on global image features. In addition, we show that our overall generative model framework can produce high-quality image samples at any resolution. With the continuous, differentiable map between spatial coordinates and image intensities, we can access spatial image derivatives at arbitrary coordinates and use them to solve inverse problems. (See the autograd sketch after the table.)
Researcher Affiliation | Academia | Department of Mathematics and Computer Science, University of Basel; Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Our implementation is available at https://github.com/swing-research/FunkNN.
Open Datasets | Yes | We work on the CelebA-HQ (Karras et al., 2017) dataset with resolution 128×128, and we train the model using three training strategies explained in Section 3.1.1: single, continuous and factor. We train the AE model over 40000 images of the LoDoPaB-CT (Leuschner et al., 2021) and 30000 images of the CelebA-HQ (Karras et al., 2017) datasets in resolution 128×128.
Dataset Splits | No | The paper mentions training on a certain number of images and using 'test samples' but does not specify the explicit percentages or sample counts for training, validation, or test splits needed for reproducibility. For example, it states '40000 images of the LoDoPaB-CT' for training but gives no split percentages.
Hardware Specification | Yes | It is worth mentioning that on the same Nvidia A100 GPU, the average time to process 100 images of size 128×128 is 1.97 seconds for LIIF, 0.81 seconds for LIIF with 140k parameters and 0.7 seconds for FunkNN.
Software Dependencies | No | The paper mentions PyTorch and TensorFlow as frameworks, and Adam as an optimizer, but it does not specify version numbers for these software components or any other libraries.
Experiment Setup | Yes | The low-resolution images are composed of d × d = 128 × 128 pixels and the high-resolution target images are of size n × n = 256 × 256. We use z = 0 as the initialization as suggested by (Whang et al., 2021) and optimize for 2500 iterations using the Adam optimizer over Problem (3) with λ = 0 and over Problem (4) with λ = 0 and λ2 = 10⁻². Then, we run 1000 iterations of stochastic gradient descent to optimize the weights of the auto-encoder of the generative model. All the models are trained using the Adam optimizer (Kingma & Ba, 2014) with learning rate 10⁻⁴. (See the optimization sketch after the table.)
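
The Research Type row above quotes the paper's claim that a continuous, differentiable map from spatial coordinates to image intensities gives access to spatial derivatives at arbitrary points. Below is a minimal sketch of that idea, assuming a generic coordinate MLP; the ContinuousImage class is a hypothetical stand-in, not the FunkNN architecture.

```python
import torch
import torch.nn as nn

class ContinuousImage(nn.Module):
    """Hypothetical coordinate network: maps 2D coordinates to intensities."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, coords):          # coords: (N, 2) in [0, 1)^2
        return self.net(coords)         # intensities: (N, 1)

model = ContinuousImage()
coords = torch.rand(100, 2, requires_grad=True)   # arbitrary query coordinates
intensity = model(coords)

# Spatial derivative d(intensity)/d(coords) at every query point, computed
# by differentiating through the network rather than by finite differences.
(grads,) = torch.autograd.grad(intensity.sum(), coords)
print(grads.shape)                      # torch.Size([100, 2])
```

Because the derivatives come from autograd, they are exact for the learned map and available at any coordinate, which is what makes such a map usable inside gradient-based inverse-problem solvers.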
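The Experiment Setup row describes latent optimization for inverse problems: z initialized to zero (Whang et al., 2021), 2500 Adam iterations at learning rate 10⁻⁴, and a regularization weight λ2 = 10⁻². Below is a minimal sketch of that loop; the toy linear generator and L2 data-fidelity term are stand-ins for the paper's actual Problems (3) and (4).

```python
import torch

torch.manual_seed(0)
latent_dim, image_dim = 64, 256

# Hypothetical stand-ins for the paper's generative model and measurements.
G = torch.randn(image_dim, latent_dim)            # toy linear "generator"
y = torch.randn(image_dim)                        # toy measurements

z = torch.zeros(latent_dim, requires_grad=True)   # z = 0 initialization
opt = torch.optim.Adam([z], lr=1e-4)              # Adam, learning rate 10^-4
lam2 = 1e-2                                       # λ2 = 10^-2, as reported

for _ in range(2500):                             # 2500 iterations, as reported
    opt.zero_grad()
    x_hat = G @ z                                 # reconstruction G(z)
    loss = ((x_hat - y) ** 2).mean() + lam2 * (z ** 2).mean()  # fidelity + regularizer
    loss.backward()
    opt.step()
```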