Unsupervised Discovery of Parts, Structure, and Dynamics
Authors: Zhenjia Xu*, Zhijian Liu*, Chen Sun, Kevin Murphy, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on multiple real and synthetic datasets demonstrate that our PSD model works well on all three tasks: segmenting object parts, building their hierarchical structure, and capturing their motion distributions. |
| Researcher Affiliation | Collaboration | Zhenjia Xu (MIT CSAIL, Shanghai Jiao Tong University); Zhijian Liu (MIT CSAIL); Chen Sun (Google Research); Kevin Murphy (Google Research); William T. Freeman (MIT CSAIL, Google Research); Joshua B. Tenenbaum (MIT CSAIL); Jiajun Wu (MIT CSAIL) |
| Pseudocode | Yes | Algorithm 1 Training PSD and Algorithm 2 Evaluating PSD provide structured pseudocode blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | For each dataset, we rendered totally 100,000 pairs for training and 10,000 for testing, with random visual appearance (i.e., sizes, positions, and colors). As for the digits dataset, we use six types of hand-written digits from MNIST (LeCun et al., 1998). |
| Dataset Splits | No | The paper explicitly states the 'training' and 'testing' dataset sizes but does not report a separate 'validation' split, with percentages or counts, needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU/CPU models or memory. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2017)' and 'an off-the-shelf package (Liu, 2009)' for optical flow, but it does not provide specific version numbers for these or any other key software dependencies required for reproducibility. |
| Experiment Setup | Yes | Optimization is carried out using ADAM (Kingma & Ba, 2015) with β1 = 0.9 and β2 = 0.999. We use a fixed learning rate of 10⁻³ and mini-batch size of 32. Our motion encoder takes the flow field M̂ between two consecutive frames as input, with resolution of 128 × 128. It applies seven convolutional layers with number of channels {16, 16, 32, 32, 64, 64, 64}, kernel sizes 5 × 5, and stride sizes 2 × 2. |
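The quoted setup fully specifies the motion encoder's convolutional stack and optimizer. As a minimal PyTorch sketch of that configuration (the class name, padding, and ReLU activations are assumptions not stated in the paper; the paper's flow input has 2 channels per pixel):

```python
import torch
import torch.nn as nn

class MotionEncoder(nn.Module):
    """Sketch of the described motion encoder: seven conv layers with
    channels {16, 16, 32, 32, 64, 64, 64}, 5x5 kernels, stride 2.
    Padding and activation choices here are assumptions."""
    def __init__(self, in_channels=2):  # optical flow field: (dx, dy) per pixel
        super().__init__()
        channels = [16, 16, 32, 32, 64, 64, 64]
        layers, prev = [], in_channels
        for c in channels:
            layers.append(nn.Conv2d(prev, c, kernel_size=5, stride=2, padding=2))
            layers.append(nn.ReLU(inplace=True))
            prev = c
        self.net = nn.Sequential(*layers)

    def forward(self, flow):
        return self.net(flow)

encoder = MotionEncoder()
# Mini-batch of 32 flow fields at the paper's 128 x 128 resolution.
flow = torch.randn(32, 2, 128, 128)
feat = encoder(flow)
print(feat.shape)  # seven stride-2 layers halve 128 down to 1: (32, 64, 1, 1)

# Optimizer exactly as reported: ADAM, lr = 1e-3, beta1 = 0.9, beta2 = 0.999.
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3, betas=(0.9, 0.999))
```

With stride-2, padding-2, kernel-5 convolutions, each layer halves the spatial resolution, so seven layers map 128 × 128 input down to a 1 × 1 feature map with 64 channels.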