reproducibilityindex.ai

Learning from the Tangram to Solve Mini Visual Tasks

Authors: Yizhou Zhao, Liang Qiu, Pan Lu, Feng Shi, Tian Han, Song-Chun Zhu
Pages: 3490-3498

AAAI 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that our proposed method generates intelligent solutions for aesthetic tasks such as folding clothes and evaluating room layouts.
Researcher Affiliation	Academia	Yizhou Zhao (1), Liang Qiu (1), Pan Lu (1), Feng Shi (1), Tian Han (2), Song-Chun Zhu (1); (1) UCLA Center for Vision, Cognition, Learning, and Autonomy; (2) Stevens Institute of Technology; yizhouzhao@g.ucla.edu
Pseudocode	No	The paper describes methods and formulas but does not contain structured pseudocode or algorithm blocks.
Open Source Code	No	The Tangram dataset is available at https://github.com/yizhouzhao/Tangram. The paper explicitly states the link is for the dataset, not the code for the methodology.
Open Datasets	Yes	We introduce the Tangram, a new dataset consisting of more than 10,000 snapshots... The Tangram dataset is available at https://github.com/yizhouzhao/Tangram. We also use Omniglot (Lake, Salakhutdinov, and Tenenbaum 2019), Multi-digit MNIST (Chen et al. 2018), Icons-50 (Hendrycks and Dietterich 2018), Flowers-17 and Flowers-102 (Nilsback and Zisserman 2008).
Dataset Splits	Yes	For each dataset, 80% of the samples are used for training and the remaining 20% for testing.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running its experiments.
Software Dependencies	No	The paper mentions software like 'Unity game engine' and 'GloVe embedding' but does not provide specific version numbers for these or any other key software dependencies.
Experiment Setup	Yes	To train the functions fθ and gφ, we use a simple convolutional neural network with only four 3×3 convolutional layers. Each image is resized into 28×28. We apply the 50-dimension GloVe embedding... and we assign 80% of the weight on CCL and 20% on PML. For folding clothes, the size of the image I representing the state s is 28×28 and there are ten vertical and ten horizontal folding axes evenly distributed in the image. For icon classification, the inputs of the network are binary images of the size 224×224.
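
The numeric settings quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal Python sketch. It is not the authors' code, which the table reports as unreleased. The function names, the shuffling and seed, and the axis spacing convention are assumptions made for illustration only. The sketch uses "CCL" and "PML" as opaque labels for the two loss terms, since the quoted text does not expand them.

```python
import random


def split_80_20(samples, seed=0):
    """Split samples into 80% train / 20% test.

    Assumption: the split is a seeded random shuffle; the quoted text
    only gives the 80/20 proportions.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)
    cut = len(items) * 4 // 5  # integer arithmetic avoids float rounding
    return items[:cut], items[cut:]


def combined_loss(ccl, pml, w_ccl=0.8, w_pml=0.2):
    """Weighted sum of the two loss terms: 80% on CCL, 20% on PML."""
    return w_ccl * ccl + w_pml * pml


def folding_axes(size=28, count=10):
    """Positions of `count` evenly spaced folding axes in a `size`-pixel side.

    Assumption: axes lie strictly inside the image with equal gaps,
    including the gaps to both borders. The same positions would serve
    for the vertical and the horizontal axes.
    """
    step = size / (count + 1)
    return [step * (i + 1) for i in range(count)]


if __name__ == "__main__":
    train, test = split_80_20(range(10))
    print(len(train), len(test))
    print(combined_loss(1.0, 2.0))
    print(folding_axes())
```

Other reasonable readings exist, for example axes placed on pixel boundaries or a fixed split taken from the dataset files, so these values should be checked against the paper before reuse.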