AutoEncoder by Forest

Authors: Ji Feng, Zhi-Hua Zhou

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that, compared with DNN-based auto-encoders, eForest is able to obtain lower reconstruction error with fast training speed, while the model itself is reusable and damage-tolerable. ... Experiments, Image Reconstruction: We evaluate the performance of eForest in both the supervised and unsupervised settings.
Researcher Affiliation | Academia | Ji Feng, Zhi-Hua Zhou. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China. {fengj, zhouzh}@lamda.nju.edu.cn
Pseudocode | Yes | Algorithm 1: Forward Encoding ... Algorithm 2: Calculate MCR ... Algorithm 3: Backward Decoding
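The paper gives these three routines only as pseudocode. Below is a minimal sketch of the same idea, assuming a fitted scikit-learn forest and using its tree internals (forest.apply, tree_.children_left/right, feature, threshold); the function names and the midpoint-based reconstruction rule are illustrative assumptions, not the authors' implementation.

```python
# Sketch of eForest's forward encoding, MCR computation, and backward decoding
# on top of scikit-learn tree internals. Illustrative only; not the authors' code.
import numpy as np

def encode(forest, X):
    """Algorithm 1 (Forward Encoding): represent each x by its leaf index in every tree."""
    return forest.apply(X)  # shape (n_samples, n_trees)

def _path_rule(tree, leaf):
    """Per-attribute lower/upper bounds along the root-to-leaf path of one tree."""
    left, right = tree.children_left, tree.children_right
    parent = {child: (node, child == left[node])
              for node in range(tree.node_count)
              for child in (left[node], right[node]) if child != -1}
    lo, hi = {}, {}
    node = leaf
    while node in parent:
        node, went_left = parent[node]
        f, t = tree.feature[node], tree.threshold[node]
        if went_left:  # sklearn routes x[f] <= threshold to the left child
            hi[f] = min(hi.get(f, np.inf), t)
        else:
            lo[f] = max(lo.get(f, -np.inf), t)
    return lo, hi

def decode(forest, codes, n_features, default=0.5):
    """Algorithms 2-3: intersect all per-tree path rules into the Maximal-Compatible
    Rule, then reconstruct each attribute, here as the midpoint of its interval."""
    X_rec = np.full((len(codes), n_features), default, dtype=float)
    for i, leaves in enumerate(codes):
        lo = np.full(n_features, -np.inf)
        hi = np.full(n_features, np.inf)
        for est, leaf in zip(forest.estimators_, leaves):
            l, h = _path_rule(est.tree_, int(leaf))
            for f, v in l.items():
                lo[f] = max(lo[f], v)
            for f, v in h.items():
                hi[f] = min(hi[f], v)
        bounded = np.isfinite(lo) & np.isfinite(hi)
        X_rec[i, bounded] = (lo[bounded] + hi[bounded]) / 2.0
    return X_rec
```

With a fitted forest (for example, one of the configurations sketched after the Experiment Setup row below), decode(forest, encode(forest, X), X.shape[1]) yields the rule-based reconstruction.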
Open Source Code | No | The paper refers to existing implementations and documentation (e.g., Keras) for comparison models but does not provide any link or explicit statement about releasing the source code for the proposed eForest model.
Open Datasets | Yes | We use the MNIST dataset (LeCun et al. 1998), which consists of 60,000 grayscale 28×28 images for training and 10,000 for testing. We also use the CIFAR-10 dataset (Krizhevsky 2009), a more complex dataset consisting of 50,000 colored 32×32 images for training and 10,000 colored images for testing. ... Concretely, we used the IMDB dataset (Maas et al. 2011), which contains 25,000 documents for training and 25,000 documents for testing.
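All three datasets are public. One plausible way to load them, assuming the Keras dataset helpers (the paper compares against Keras models but does not state how the data were obtained), is:

```python
# Hypothetical data loading for the three datasets named above.
from tensorflow.keras.datasets import mnist, cifar10, imdb

(x_tr, y_tr), (x_te, y_te) = mnist.load_data()      # 60,000 train / 10,000 test, 28x28 grayscale
(c_tr, yc_tr), (c_te, yc_te) = cifar10.load_data()  # 50,000 train / 10,000 test, 32x32 color
(d_tr, yd_tr), (d_te, yd_te) = imdb.load_data()     # 25,000 train / 25,000 test documents

# Flatten images so each instance is one attribute vector for the forest;
# IMDB items are variable-length word-index sequences and would need
# vectorization (not specified in the paper) before being fed to a forest.
x_tr = x_tr.reshape(len(x_tr), -1) / 255.0
c_tr = c_tr.reshape(len(c_tr), -1) / 255.0
```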
Dataset Splits | No | The paper specifies training and testing set sizes for the MNIST, CIFAR-10, and IMDB datasets, but does not explicitly mention or quantify a separate validation set split.
Hardware Specification | Yes | We implement eForest on a single Intel KNL-7250 (3 TFLOPS peak), and achieved a 67.7× speedup for training 1,000 trees in an unsupervised setting, compared with a serial implementation. For a comparison, we trained the corresponding MLPs, CNN-AEs and SWWAEs with the same configurations as in the previous sections on one Titan-X GPU (6 TFLOPS peak).
Software Dependencies | No | The paper mentions the Keras documentation for the CNN-AE implementation but does not provide specific version numbers for Keras, Python, or any other software libraries used.
Experiment Setup | Yes | Concretely, for the supervised eForest, each non-terminal node randomly selects √d attributes in the input space and picks the best possible split for information gain; for the unsupervised eForest, each non-terminal node randomly picks one attribute and makes a random split. In our experiments we simply grow the trees to pure leaves, or terminate when there are only two instances in a node. We evaluate eForest containing 500 trees or 1,000 trees. ... For a vanilla CNN-AE ... ReLUs are used for activation and log-loss is used as the training objective. During training, dropout is set to 0.25 per layer.
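As one concrete reading of the quoted setup, the two forest variants could be configured with scikit-learn as sketched below. The estimator choices, criterion="entropy" as a stand-in for information gain, and min_samples_split=3 as an interpretation of "terminate when there are only two instances in a node" are assumptions, since the paper does not name its implementation; the CNN-AE baseline is described only at the level of ReLU activations, a log-loss objective, and dropout of 0.25 per layer.

```python
# Hypothetical forest configurations matching the quoted setup (not the authors' code).
from sklearn.ensemble import RandomForestClassifier, RandomTreesEmbedding

# Supervised eForest: each split considers sqrt(d) random attributes and takes the
# best information-gain split; trees grow until leaves are pure or a node holds
# only two instances (interpreted here as min_samples_split=3).
supervised = RandomForestClassifier(
    n_estimators=1000, criterion="entropy", max_features="sqrt",
    min_samples_split=3, n_jobs=-1, random_state=0)

# Unsupervised eForest: completely-random trees, i.e. each split picks one random
# attribute and a random threshold.
unsupervised = RandomTreesEmbedding(
    n_estimators=1000, max_depth=None, min_samples_split=3,
    n_jobs=-1, random_state=0)

# Usage sketch: fit, then read off the 1,000-dimensional leaf-index encodings.
# supervised.fit(x_tr, y_tr); unsupervised.fit(x_tr)
# codes = supervised.apply(x_te)
```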