AutoEncoder by Forest
Authors: Ji Feng, Zhi-Hua Zhou
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that, compared with DNN-based auto-encoders, eForest is able to obtain lower reconstruction error with fast training speed, while the model itself is reusable and damage-tolerable. ... (Experiments, Image Reconstruction) We evaluate the performance of eForest in both supervised and unsupervised settings. |
| Researcher Affiliation | Academia | Ji Feng, Zhi-Hua Zhou; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China; {fengj, zhouzh}@lamda.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1: Forward Encoding ... Algorithm 2: Calculate MCR ... Algorithm 3: Backward Decoding |
| Open Source Code | No | The paper refers to existing implementations and documentation (e.g., Keras) for comparison models but does not provide any link or explicit statement about releasing the source code for their proposed eForest model. |
| Open Datasets | Yes | We use the MNIST dataset (LeCun et al. 1998), which consists of 60,000 gray-scale 28×28 images for training and 10,000 for testing. We also use the CIFAR-10 dataset (Krizhevsky 2009), a more complex dataset consisting of 50,000 colored 32×32 images for training and 10,000 colored images for testing. ... Concretely, we used the IMDB dataset (Maas et al. 2011), which contains 25,000 documents for training and 25,000 documents for testing. |
| Dataset Splits | No | The paper specifies training and testing set sizes for MNIST, CIFAR-10, and IMDB datasets, but does not explicitly mention or quantify a separate validation set split. |
| Hardware Specification | Yes | We implement eForest on a single Intel KNL-7250 (3 TFLOPS peak), and achieved a 67.7× speedup for training 1,000 trees in an unsupervised setting, compared with a serial implementation. For a comparison, we trained the corresponding MLPs, CNN-AEs, and SWWAEs with the same configurations as in the previous sections on one Titan-X GPU (6 TFLOPS peak). |
| Software Dependencies | No | The paper mentions Keras documentation for the CNN-AE implementation but does not provide specific version numbers for Keras, Python, or any other software libraries used. |
| Experiment Setup | Yes | Concretely, for supervised eForest, each non-terminal node randomly selects √d attributes in the input space and picks the best possible split for information gain; for unsupervised eForest, each non-terminal node randomly picks one attribute and makes a random split. In our experiments we simply grow the trees to pure leaves, or terminate when there are only two instances in a node. We evaluate eForest containing 500 trees or 1,000 trees... For a vanilla CNN-AE... ReLUs are used for activation and log-loss is used as the training objective. During training, dropout is set to 0.25 per layer. |
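
The forward-encoding and MCR-based decoding steps quoted in the Pseudocode row can be illustrated with a minimal sketch. This is not the authors' code (none is released); it assumes scikit-learn trees as stand-ins, the helper names `forward_encode`, `leaf_rule`, and `decode_via_mcr` are ours, and the rule for choosing a point inside the Maximal Compatible Rule (interval midpoints, finite side for open intervals) is a simplification of whatever the paper actually does.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier


def forward_encode(forest, X):
    """Algorithm 1 (sketch): represent each instance by the leaf index
    it reaches in every tree of the forest."""
    return forest.apply(X)  # shape: (n_samples, n_trees)


def leaf_rule(tree, leaf_id, n_features):
    """Recover the axis-aligned region (path rule) defining one leaf."""
    t = tree.tree_
    lo = np.full(n_features, -np.inf)
    hi = np.full(n_features, np.inf)
    # Map each node to its parent, then walk from the leaf back to the root,
    # tightening the per-feature bounds along the decision path.
    parent = {}
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left != -1:                       # internal node
            parent[left] = (node, True)      # left child: feature <= threshold
            parent[right] = (node, False)    # right child: feature > threshold
    node = leaf_id
    while node in parent:
        p, went_left = parent[node]
        f, thr = t.feature[p], t.threshold[p]
        if went_left:
            hi[f] = min(hi[f], thr)
        else:
            lo[f] = max(lo[f], thr)
        node = p
    return lo, hi


def decode_via_mcr(forest, code, n_features):
    """Algorithms 2-3 (sketch): intersect the per-tree leaf rules into the
    Maximal Compatible Rule, then return one representative point."""
    lo = np.full(n_features, -np.inf)
    hi = np.full(n_features, np.inf)
    for tree, leaf_id in zip(forest.estimators_, code):
        t_lo, t_hi = leaf_rule(tree, leaf_id, n_features)
        lo, hi = np.maximum(lo, t_lo), np.minimum(hi, t_hi)
    # Open-ended intervals fall back to their finite side, or 0 if unbounded
    # on both sides (the paper's handling of unbounded rules may differ).
    lo_f = np.where(np.isfinite(lo), lo, np.where(np.isfinite(hi), hi, 0.0))
    hi_f = np.where(np.isfinite(hi), hi, lo_f)
    return (lo_f + hi_f) / 2.0


X, y = load_digits(return_X_y=True)          # small stand-in, not the paper's MNIST
forest = RandomForestClassifier(n_estimators=100, n_jobs=-1).fit(X, y)
codes = forward_encode(forest, X)                      # encode all instances
x_hat = decode_via_mcr(forest, codes[0], X.shape[1])   # decode one instance
```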
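
For the forest configurations described in the Experiment Setup row, a rough scikit-learn approximation is sketched below. The estimator choices and the hyperparameter mapping are our assumptions, not a released configuration: `RandomForestClassifier` stands in for the supervised encoder and `RandomTreesEmbedding` for the completely-random unsupervised one.

```python
from sklearn.ensemble import RandomForestClassifier, RandomTreesEmbedding

# Supervised eForest-style encoder: at each non-terminal node, consider a
# random subset of sqrt(d) attributes and take the best split by information
# gain; stop splitting once a node is pure or holds only two instances.
supervised_forest = RandomForestClassifier(
    n_estimators=1000,        # the paper evaluates 500- and 1,000-tree forests
    criterion="entropy",      # information gain
    max_features="sqrt",      # sqrt(d) candidate attributes per split
    min_samples_split=3,      # do not split two-instance nodes
    n_jobs=-1,
)

# Unsupervised eForest-style encoder: completely-random trees, where each
# non-terminal node picks one random attribute and a random split point.
unsupervised_forest = RandomTreesEmbedding(
    n_estimators=1000,
    max_depth=None,           # grow trees until the stopping rule above
    min_samples_split=3,
    n_jobs=-1,
)

# In either case the encoding of X is forest.apply(X):
# one leaf index per tree per instance.
```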