Norm-guided latent space exploration for text-to-image generation

Authors: Dvir Samuel, Rami Ben-Ari, Nir Darshan, Haggai Maron, Gal Chechik

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate NAO extensively. First, we directly assess the quality of images generated by our methods, showing higher quality and better semantic content. Second, we use our seed space interpolation and centroid finding methods in two tasks: (1) Generating images of rare concepts, and (2) Augmenting semantic data for few-shot classification and long-tail learning. For these tasks, our experiments indicate that seed initialization with our prior-guided approach improves SoTA performance and at the same time has a significantly shorter running time (up to ×10 faster) compared to other approaches.
Researcher Affiliation | Collaboration | Dvir Samuel (1,2), Rami Ben-Ari (2), Nir Darshan (2), Haggai Maron (3,4), Gal Chechik (1,4); (1) Bar-Ilan University, Ramat-Gan, Israel; (2) OriginAI, Tel-Aviv, Israel; (3) Technion, Haifa, Israel; (4) NVIDIA Research, Tel-Aviv, Israel
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code or a link to a code repository for the methodology described.
Open Datasets | Yes | We evaluated NAO on three common few-shot classification benchmarks: (1) CUB200 [55]: A fine-grained dataset comprising 11,788 images of 200 bird species. (2) miniImageNet [54]: A modified version of the standard ImageNet dataset [14]. It contains a total of 100 classes... (3) CIFAR-FS [8]: Created from CIFAR-100 [26] by using the same sampling criteria as miniImageNet. We further evaluated NAO on the long-tailed recognition task using the ImageNet-LT [32] benchmark.
Dataset Splits | Yes | CUB200 [55]: The classes are divided into three sets, with 100 for meta-training, and 50 each for meta-validation and meta-testing. miniImageNet [54]: It contains a total of 100 classes, with 64 classes used for meta-training, 16 classes for meta-validation, and 20 classes for meta-testing. CIFAR-FS [8]: Has 64 classes for meta-training, 16 classes for meta-validation, and 20 classes for meta-testing; each class containing 600 images.
Hardware Specification | Yes | We also report mean FID score between the generated images and the real images, mean centroid initialization time T̂_Init, and mean SeedSelect optimization time until convergence T_Opt on a single NVIDIA A100 GPU.
Software Dependencies | No | We implemented a simple optimization algorithm that optimizes the discretized problems using PyTorch and the Adam optimizer. However, the paper does not specify version numbers for PyTorch or any other software dependency.
Experiment Setup | Yes | We implemented a simple optimization algorithm that optimizes the discretized problems using PyTorch and the Adam optimizer. To speed up convergence we initialize the optimization variables: the centroid is initialized with the Euclidean centroid, and path variables are initialized as the values of the linear path between the points and the centroid. We implement the constraints C(x) = |x_i - x_{i-1}| - δ ≤ 0 using a soft penalty term in the optimization, of the form α·ReLU(C(x)), where α is a hyper-parameter. For interpolation methods that require path optimization, we used paths with 10 sampled points. We generated 1,000 additional samples for each novel class using SeedSelect... We used a ResNet-12 model for performing N-way classification and trained it using cross-entropy loss on both real and synthetic data.
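
A hedged illustration of the setup quoted in the last row: the sketch below optimizes discretized paths from a set of seed points to a shared centroid with PyTorch and Adam, initializing the centroid at the Euclidean centroid and the path points on the straight lines to it, and enforcing the step-size constraint C(x) = |x_i - x_{i-1}| - δ ≤ 0 through the soft penalty α·ReLU(C(x)). The function name optimize_paths, the hyper-parameter values, and the prior term (which merely pulls path points toward the Gaussian shell of radius sqrt(d)) are assumptions standing in for the paper's actual norm-guided objective, not the authors' implementation.

import torch

def optimize_paths(endpoints, n_inner=8, delta=0.1, alpha=10.0,
                   steps=500, lr=1e-2):
    """endpoints: (k, d) tensor of seed points to connect to a shared centroid."""
    k, d = endpoints.shape

    # Centroid variable, initialized at the Euclidean centroid (as quoted).
    centroid = endpoints.mean(dim=0).clone().requires_grad_(True)

    # Inner path points, initialized on the straight line from each endpoint
    # to the centroid; the endpoints themselves stay fixed.
    t = torch.linspace(0.0, 1.0, n_inner + 2)[1:-1].view(1, -1, 1)
    inner = (endpoints.unsqueeze(1) * (1 - t)
             + centroid.detach().view(1, 1, d) * t)
    inner = inner.clone().requires_grad_(True)

    opt = torch.optim.Adam([centroid, inner], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Full discretized paths: fixed endpoint -> inner points -> centroid.
        paths = torch.cat([endpoints.unsqueeze(1), inner,
                           centroid.view(1, 1, d).expand(k, 1, d)], dim=1)

        # Hypothetical prior term: keep path points near the Gaussian shell
        # of radius sqrt(d). A stand-in, NOT the paper's norm-guided objective.
        prior = ((paths.norm(dim=-1) - d ** 0.5) ** 2).mean()

        # Soft penalty alpha * ReLU(C(x)) for C(x) = |x_i - x_{i-1}| - delta <= 0.
        step_norms = (paths[:, 1:] - paths[:, :-1]).norm(dim=-1)
        penalty = alpha * torch.relu(step_norms - delta).sum()

        (prior + penalty).backward()
        opt.step()
    return centroid.detach(), inner.detach()

With 10 sampled points per path as in the quote, n_inner would be 8 (10 points counting the fixed endpoint and the centroid). The ReLU penalty is zero while every discretization step stays below δ and grows linearly once the constraint is violated, which is the soft-constraint behavior the quoted α·ReLU(C(x)) term describes.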