SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow

Authors: Chaoyang Wang, Xiangtai Li, Lu Qi, Henghui Ding, Yunhai Tong, Ming-Hsuan Yang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our SemFlow achieves competitive results on semantic segmentation and semantic image synthesis tasks.
Researcher Affiliation | Academia | (1) School of Intelligence Science and Technology, Peking University; (2) UC Merced; (3) Institute of Big Data, Fudan University
Pseudocode | No | The paper describes its methodology using text and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | We are not able to provide the code at submission time. We will definitely release the code in the future.
Open Datasets | Yes | We study SemFlow using three popular datasets: COCO-Stuff [37], CelebAMask-HQ [30], and Cityscapes [12].
Dataset Splits | No | The paper mentions using the COCO-Stuff, CelebAMask-HQ, and Cityscapes datasets and how they are resized/cropped, but does not explicitly state the train/validation/test splits or percentages used.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed machine specifications) used for running its experiments.
Software Dependencies | No | The paper mentions using the Stable Diffusion UNet and pre-trained SD 1.5 weights, but does not provide specific software dependency versions for libraries or frameworks such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | We train CelebAMask-HQ and Cityscapes with a batch size of 256 with the AdamW optimizer for 80K and 8K steps, respectively. The initial learning rates are set as 2×10⁻⁵ and 5×10⁻⁵. A linear learning rate scheduler is adopted. For the COCO-Stuff dataset, we use a constant learning rate of 1×10⁻⁵ with a batch size of 128 for 320K steps.
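The experiment-setup row above names a linear learning-rate scheduler but gives no further details. A minimal sketch of one common interpretation, assuming (the paper does not say) that the rate decays linearly from the initial value to zero over the full run with no warmup:

```python
def linear_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Linearly decay the learning rate from base_lr at step 0 to 0 at total_steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

# Example with the reported CelebAMask-HQ settings: initial LR 2e-5, 80K steps.
print(linear_lr(0, 80_000, 2e-5))       # 2e-05 at the start
print(linear_lr(40_000, 80_000, 2e-5))  # 1e-05 at the midpoint
print(linear_lr(80_000, 80_000, 2e-5))  # 0.0 at the end
```

In PyTorch this behavior would typically be wired up via `torch.optim.lr_scheduler.LambdaLR` around an `AdamW` optimizer, but the paper does not specify the exact scheduler implementation used.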