reproducibilityindex.ai

Multi-Plane Program Induction with 3D Box Priors

Authors: Yikai Li, Jiayuan Mao, Xiuming Zhang, Bill Freeman, Josh Tenenbaum, Noah Snavely, Jiajun Wu

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments show that BPI can efficiently and accurately infer the structure and camera parameters for both indoor and outdoor scenes.
Researcher Affiliation	Collaboration	Yikai Li (1,2), Jiayuan Mao (1), Xiuming Zhang (1), William T. Freeman (1,3), Joshua B. Tenenbaum (1), Noah Snavely (3), Jiajun Wu (4). Affiliations: (1) MIT CSAIL, (2) Shanghai Jiao Tong University, (3) Google Research, (4) Stanford University.
Pseudocode	No	The paper includes a table describing a Domain-Specific Language (DSL) for box programs, but it does not contain pseudocode or a clearly labeled algorithm block describing the BPI methodology itself.
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described.
Open Datasets	No	We collect two datasets from web image search engines for our experiments, a 44-image Corridor Boxes dataset and a 42-image Building Boxes dataset. These correspond to the inner view and the outer view of boxes, respectively. For both datasets, we manually annotate the plane segmentations by specifying edges of the boxes. For corridor images, we also create a mask for the far plane. For building images, we supplement the subject segmentation (i.e., the building of interest) to the dataset annotation.
Dataset Splits	No	The paper does not explicitly provide training, validation, or test dataset splits.
Hardware Specification	No	The paper does not explicitly describe the hardware used to run its experiments with specific details such as GPU or CPU models.
Software Dependencies	No	The paper mentions software tools like NeurVPS and L-CNN, but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup	Yes	Fixing the camera at the world origin, pointing in the +z direction, we then compute the 3D position and surface normal of each plane. As shown in Fig. 1, because the distance between camera and the corridor is coupled with the focal length of the camera, here we use a fixed focal length of f = 35mm. Following common practice, we also fix other camera intrinsic properties: optical center to (0, 0), skew factor to 0, and pixel aspect ratio to 1. Next, we filter out wireframe segments whose length is smaller than a threshold $\delta_1$ or whose extension does not cross a neighbourhood centered at vp with radius $\delta_2$. We add another term to this similarity function: $\mathrm{sim}(p, q) = \mathrm{sim}_{\mathrm{pixel}} + \mathrm{sim}_{\mathrm{reg}} = \mathrm{sim}_{\mathrm{pixel}} - \lambda_{\mathrm{reg}} \lVert \mathrm{wraparound}(\mathrm{smap}[p] - \mathrm{smap}[q]) \rVert_2^2$, where $\lambda_{\mathrm{reg}}$ is a hyperparameter that controls the weight of the regularity enforcement.
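
The setup quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' code (the page reports that none is available), and every function name here is hypothetical. It assumes that the focal length is already expressed in the same units as the image coordinates (converting 35mm would need a sensor size the excerpt does not state), that a segment's "extension" is the infinite line through its endpoints, that smap values are periodic with period 1, and that the regularity term is subtracted from the pixel similarity, which is one reading of the extracted formula.

```python
import math
from typing import List, Sequence


def intrinsics(focal: float) -> List[List[float]]:
    """Camera matrix with optical center (0, 0), zero skew, aspect ratio 1."""
    return [[focal, 0.0, 0.0],
            [0.0, focal, 0.0],
            [0.0, 0.0, 1.0]]


def keep_segment(a: Sequence[float], b: Sequence[float], vp: Sequence[float],
                 delta1: float, delta2: float) -> bool:
    """Keep a wireframe segment (a, b) only if it is at least delta1 long and
    the line through it passes within delta2 of the vanishing point vp."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0 or length < delta1:
        return False
    ex, ey = vp[0] - a[0], vp[1] - a[1]
    # Perpendicular distance from vp to the line through a and b.
    dist = abs(dx * ey - dy * ex) / length
    return dist <= delta2


def wraparound(diff: float, period: float = 1.0) -> float:
    """Map a difference into [-period/2, period/2) (assumed periodicity)."""
    return (diff + 0.5 * period) % period - 0.5 * period


def similarity(sim_pixel: float, smap_p: Sequence[float],
               smap_q: Sequence[float], lambda_reg: float,
               period: float = 1.0) -> float:
    """Pixel similarity minus lambda_reg times the squared wrapped distance
    between the smap entries of p and q (assumed sign convention)."""
    sq = sum(wraparound(u - v, period) ** 2 for u, v in zip(smap_p, smap_q))
    return sim_pixel - lambda_reg * sq
```

For example, under these assumptions `keep_segment((0, 0), (10, 0), (20, 1), delta1=5, delta2=2)` keeps the segment, because it is long enough and its extension passes within one unit of the vanishing point. Separately, two smap entries at 0.95 and 0.05 count as only 0.1 apart after wrapping, so their similarity is barely penalized.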