Interactive Object Placement with Reinforcement Learning

Authors: Shengping Zhang, Quanling Meng, Qinglin Liu, Liqiang Nie, Bineng Zhong, Xiaopeng Fan, Rongrong Ji

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the OPA dataset demonstrate that the proposed method achieves state-of-the-art performance in terms of plausibility and diversity.
Researcher Affiliation | Academia | (1) School of Computer Science and Technology, Harbin Institute of Technology, Weihai, China; (2) School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China; (3) Guangxi Key Laboratory of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin, China; (4) School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China; (5) Department of Artificial Intelligence, School of Informatics, Xiamen University, Xiamen, China.
Pseudocode | Yes | Algorithm 1: Training procedure of IOPRE.
Open Source Code | No | The paper does not include a statement about releasing the source code or a link to a repository for the described methodology.
Open Datasets | Yes | We conduct all experiments on the Object Placement Assessment (OPA) dataset (Liu et al., 2021a).
Dataset Splits | No | The composite images are divided into 62,074 training images and 11,396 test images. The paper explicitly provides train and test splits, but no separate validation split.
Hardware Specification | No | No specific hardware (GPU or CPU models, etc.) used for running the experiments is mentioned in the paper.
Software Dependencies | No | The paper mentions software components such as Swin Transformer and AdamW, but no version numbers for any software dependencies are provided.
Experiment Setup | Yes | We set the learning rate as 2e-4 and use AdamW (Loshchilov & Hutter, 2017) to train IOPRE with batch size 64 for 15 epochs. For the assessment reward, we set the hyper-parameter λs as 0.01. To train IOPRE, we set the discount factor γ as 0.99, the weight of the entropy loss β as 0.08, and the maximum number of steps tmax as 20. In the training phase, t0 is obtained by random initialization. In the test phase, the maximum number of steps is set as 100. All images are resized to 256×256 before being fed into all networks.
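As a quick illustration of how the reported settings might be wired together, the sketch below collects the hyper-parameters from the experiment-setup row into a config object, builds an AdamW optimizer with the stated learning rate, and computes discounted returns with γ = 0.99. It is a minimal sketch assuming a PyTorch policy network; the names IOPREConfig, build_optimizer, and discounted_returns are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List

import torch
from torch import nn


@dataclass
class IOPREConfig:
    """Hyper-parameters reported in the experiment-setup row above (illustrative container)."""
    lr: float = 2e-4          # AdamW learning rate
    batch_size: int = 64
    epochs: int = 15
    lambda_s: float = 0.01    # assessment-reward weight (lambda_s)
    gamma: float = 0.99       # discount factor
    beta: float = 0.08        # entropy-loss weight
    t_max_train: int = 20     # maximum placement steps during training
    t_max_test: int = 100     # maximum placement steps during testing
    image_size: int = 256     # images resized to 256x256


def discounted_returns(rewards: List[float], gamma: float) -> torch.Tensor:
    """Standard discounted return G_t = sum_k gamma^k * r_{t+k} for one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return torch.tensor(list(reversed(returns)))


def build_optimizer(policy: nn.Module, cfg: IOPREConfig) -> torch.optim.Optimizer:
    """AdamW with the reported learning rate (weight decay left at the default)."""
    return torch.optim.AdamW(policy.parameters(), lr=cfg.lr)


if __name__ == "__main__":
    cfg = IOPREConfig()
    # Example: returns for a hypothetical 3-step placement episode with gamma = 0.99.
    print(discounted_returns([0.0, 0.0, 1.0], cfg.gamma))
```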