Subject-driven Text-to-Image Generation via Apprenticeship Learning
Authors: Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Ruiz, Xuhui Jia, Ming-Wei Chang, William W. Cohen
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a comprehensive set of automatic and human evaluations to show the capability of our model on generating highly faithful and creative images on DreamBench and DreamBench-v2. |
| Researcher Affiliation | Industry | Google DeepMind, Google Research {wenhuchen,hexiang,mingweichang,wcohen}@google.com |
| Pseudocode | Yes (see the sketch after the table) | Algorithm 1 Apprenticeship Learning from a Large Crowd of Specialized Expert Models |
| Open Source Code | No | To facilitate the reproducibility of our model performance, we release the SuTI model API as a Google Cloud Vertex AI model service, under the production name Instant Tuning. Generally available at https://cloud.google.com/vertex-ai/docs/generative-ai/image/fine-tune-model. This provides access to the model via an API, but not to its source code in a repository. |
| Open Datasets | Yes | We construct the seed dataset using the WebLI [10, 20] dataset. |
| Dataset Splits | No | The paper constructs a training dataset 'G' from expert models but does not specify train/validation splits for this dataset during training of the apprentice model. |
| Hardware Specification | Yes | We tune each model on a single TPU core (32 GB)... The apprentice training is performed on 128 Cloud TPU v4 chips. |
| Software Dependencies | No | The paper mentions optimizers (Adafactor) and models (CLIP ViT-L14, Imagen checkpoint) but does not provide specific version numbers for software libraries or environments such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes (see the training-step sketch after the table) | We tune each model on a single TPU core (32 GB) for 500 steps using Adafactor optimizer with a learning rate of 1e-5... We train the model for a total of 150K steps. We use an Adafactor optimizer with a learning rate of 1e-4. We use 3 demonstrations during training... We use a lower classifier-free guidance weight of 15 with DDPM [14] sampling strategy. |
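To make the Pseudocode row concrete, below is a minimal Python sketch of the loop in Algorithm 1: fine-tune one specialized expert per subject, sample images from each expert, keep only samples that pass a CLIP faithfulness filter, and distill the result into a single in-context apprentice. Every helper here (`finetune_expert`, `sample_images`, `clip_score`, `build_apprentice_dataset`, `train_apprentice`) is a hypothetical stub, not the paper's released API; the hyperparameter values mirror the quotes in the table.

```python
import random
from dataclasses import dataclass

# All names below are hypothetical stubs; the paper's code is not released.

@dataclass
class SubjectCluster:
    demonstrations: list  # image-text pairs depicting one subject
    prompts: list         # creative target prompts for that subject

def finetune_expert(checkpoint, demonstrations, steps=500, learning_rate=1e-5):
    """Stub for per-subject fine-tuning of the base text-to-image model."""
    return ("expert", checkpoint, tuple(demonstrations))

def sample_images(expert, prompt, n=4):
    """Stub for drawing n candidate images from a tuned expert."""
    return [f"image<{prompt}>#{i}" for i in range(n)]

def clip_score(image, prompt):
    """Stub for a CLIP text-image faithfulness score in [0, 1]."""
    return random.random()

def build_apprentice_dataset(clusters, base_checkpoint, threshold=0.3):
    """Algorithm 1, step 1: distill many experts into one training set G."""
    dataset = []
    for cluster in clusters:
        expert = finetune_expert(base_checkpoint, cluster.demonstrations)
        for prompt in cluster.prompts:
            for image in sample_images(expert, prompt):
                if clip_score(image, prompt) >= threshold:  # keep faithful samples
                    dataset.append((cluster.demonstrations, prompt, image))
    return dataset

def train_apprentice(dataset, steps=150_000, num_demos=3):
    """Algorithm 1, step 2: train one in-context apprentice on G."""
    for _ in range(steps):
        demonstrations, prompt, target = random.choice(dataset)
        # The apprentice conditions on a few demonstrations plus the prompt,
        # learning to match the expert output without per-subject tuning.
        _ = (demonstrations[:num_demos], prompt, target)  # gradient step goes here
    return "apprentice"
```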
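The Experiment Setup row quotes concrete optimizer details (Adafactor, learning rate 1e-5 for expert tuning, 1e-4 for the 150K-step apprentice run) without naming a framework. The sketch below shows how those optimizers might be instantiated in a JAX/optax stack; optax is an illustrative assumption, since the paper does not specify its libraries, and a toy quadratic loss stands in for the actual diffusion objective.

```python
import jax
import jax.numpy as jnp
import optax

# Assumed optax-based setup mirroring the quoted hyperparameters.
expert_optimizer = optax.adafactor(learning_rate=1e-5)      # per-subject expert tuning
apprentice_optimizer = optax.adafactor(learning_rate=1e-4)  # 150K-step apprentice run

# Toy parameters standing in for the diffusion model's weights.
params = {"w": jnp.zeros((4, 4))}
opt_state = apprentice_optimizer.init(params)

def loss_fn(params, batch):
    # Placeholder quadratic loss; the real objective is the denoising loss.
    return jnp.mean((params["w"] - batch) ** 2)

@jax.jit
def train_step(params, opt_state, batch):
    grads = jax.grad(loss_fn)(params, batch)
    updates, opt_state = apprentice_optimizer.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state

# One optimization step; the quoted setup would iterate this for 150K steps.
params, opt_state = train_step(params, opt_state, jnp.ones((4, 4)))
```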