IFR-Explore: Learning Inter-object Functional Relationships in 3D Indoor Scenes
Authors: Qi Li, Kaichun Mo, Yanchao Yang, Hang Zhao, Leonidas Guibas
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments that prove the effectiveness of our proposed method. Results show that our model successfully learns priors and fast-interactive-adaptation strategies for exploring inter-object functional relationships in complex 3D scenes. Several ablation studies further validate the usefulness of each proposed module. |
| Researcher Affiliation | Academia | Qi Li^1,*, Kaichun Mo^2,*, Yanchao Yang^2, Hang Zhao^1, Leonidas Guibas^2. ^1 Tsinghua University, ^2 Stanford University. {liqi17thu, zhaohang0124}@gmail.com, {kaichun, yanchaoy, guibas}@cs.stanford.edu |
| Pseudocode | No | The paper describes the technical approach and network architectures using prose and diagrams, but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | No | Reproducibility Statement. We support open-sourced research and will make sure that our results are reproducible. Thus, we promise to release the code, data, and pre-trained models publicly to the community upon paper acceptance. |
| Open Datasets | Yes | We create a new hybrid dataset based on AI2THOR (Kolve et al., 2017) and PartNet-Mobility (Mo et al., 2019b; Xiang et al., 2020) to support our study. |
| Dataset Splits | No | We split the dataset into non-overlapping 800 training scenes and 400 test scenes. The paper explicitly mentions training and test sets but does not specify a separate validation split for model tuning. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions several software components, such as Proximal Policy Optimization (PPO), PointNet++, Graph Convolutional Networks (GCN), and the OpenAI Gym API, but it does not specify version numbers for any of these dependencies. |
| Experiment Setup | Yes | We use α = 2, β = 1 and γ = 1 in our experiments. We stop the test-time fast-interactive-adaptation procedure when all predictions are certain enough, i.e. min(r^(t)_{i,j}, 1 − r^(t)_{i,j}) < γ for all i, j (Eq. 3), with a threshold γ = 0.05. We formulate the task as a reinforcement learning (RL) problem and learn an exploration policy that collects data for supervising the prior and posterior networks. We use the standard binary cross entropy for all loss terms. The RL policy, implemented under the Proximal Policy Optimization (PPO) (Schulman et al., 2017) framework, takes the current functional scene graph representation as the state input, state^(t) = (S, R^(t)_S), at each timestep t and picks one object as the action output to interact with, action^(t) = O_i ∈ S. A minimal sketch of the quoted stopping criterion is given after this table. |
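
The stopping rule quoted in the Experiment Setup row is simple enough to illustrate directly. The sketch below is not the authors' code (none has been released); it only restates the min(r, 1 − r) < γ certainty check over all object pairs, using the γ = 0.05 threshold reported in the paper. The function and variable names (`all_predictions_certain`, `posterior_predictions`, `interact_and_update`) are hypothetical placeholders.

```python
import numpy as np

# Threshold reported in the paper for the test-time stopping criterion.
GAMMA = 0.05


def all_predictions_certain(r: np.ndarray, gamma: float = GAMMA) -> bool:
    """Check the paper's stopping criterion on an (N, N) matrix of
    predicted relationship probabilities r^(t)_{i,j} in [0, 1].

    Returns True when every prediction is close to either 0 or 1,
    i.e. min(r, 1 - r) < gamma for all object pairs (i, j), so the
    fast-interactive-adaptation loop can terminate.
    """
    uncertainty = np.minimum(r, 1.0 - r)  # distance to the nearest hard label
    return bool(np.all(uncertainty < gamma))


# Hypothetical usage inside the interactive exploration loop:
# while not all_predictions_certain(posterior_predictions):
#     obj = policy.select_object(scene_graph)   # PPO policy picks one object O_i
#     scene_graph, posterior_predictions = interact_and_update(obj)
```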