Metric-Based Auto-Instructor for Learning Mixed Data Representation

Authors: Songlei Jian, Liang Hu, Longbing Cao, Kai Lu

AAAI 2018

Reproducibility assessment (Variable — Result — LLM Response)
Research Type — Experimental
LLM Response: "By feeding the learned representation into both partition-based and density-based clustering methods, our experiments on eight UCI datasets show highly significant learning performance improvement and much more distinguishable visualization outcomes over the baseline methods."
Researcher Affiliation — Academia
LLM Response: Songlei Jian, Liang Hu, Longbing Cao, Kai Lu — College of Computer, National University of Defense Technology, China; Advanced Analytics Institute, University of Technology Sydney, Australia. Emails: jiansonglei@163.com, rainmilk@gmail.com, longbing.cao@uts.edu.au, kailu@nudt.edu.cn
Pseudocode — Yes
LLM Response: "Algorithm 1: A SGD-based learning algorithm for MAI"
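The paper's Algorithm 1 is not reproduced in this report. As a rough illustration of the kind of mini-batch SGD training loop it describes, here is a generic sketch; the function names and the toy objective below are hypothetical, not the paper's code:

```python
import numpy as np

def minibatch_sgd(X, grad_fn, params, lr=0.05, batch_size=200, max_iters=30, seed=0):
    """Generic mini-batch SGD loop (a sketch, not the paper's Algorithm 1).

    grad_fn(batch, params) must return a gradient with the same shape as params.
    Defaults mirror the paper's reported batch size (200) and iteration cap (30).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    for _ in range(max_iters):
        order = rng.permutation(n)  # reshuffle the data each epoch
        for start in range(0, n, batch_size):
            batch = X[order[start:start + batch_size]]
            params = params - lr * grad_fn(batch, params)
    return params

# Toy usage: recover the data mean by minimizing mean squared error.
X = np.random.default_rng(1).normal(loc=3.0, size=(1000, 1))
grad = lambda batch, p: 2.0 * (p - batch.mean())  # d/dp of mean((p - x)^2)
theta = minibatch_sgd(X, grad, params=np.zeros(1))
```

The toy gradient drives `theta` toward the sample mean (about 3.0), which is enough to check that the loop's bookkeeping is correct.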
Open Source Code — Yes
LLM Response: "The MATLAB implementation of Algorithm 1 is available at https://github.com/jiansonglei/MAI"
Open Datasets — Yes
LLM Response: "We use eight real-world UCI datasets (Lichman 2013) from different domains for the experiments: Echocardiogram (Echo), Hepatitis, Auto MPG Dataset (MPG), Statlog Heart (Heart), Australian Credit Approval (ACA), Credit Approval (CRX), Contraceptive Method Choice (CMC), and Census Income (Income)."
Dataset Splits — No
LLM Response: "The AMI results of k-means on each dataset are the average over 20 validations of clustering with distinct starting points due to the instability of k-means clustering." This refers to multiple runs of clustering, not a dedicated validation dataset split used for hyperparameter tuning.
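The averaging protocol described above (repeat k-means with distinct starting points, then average the score) can be sketched as follows. This is a minimal numpy-only illustration: the tiny k-means and the purity score standing in for AMI are simplifications of my own, not the paper's evaluation code.

```python
import numpy as np

def kmeans(X, k, seed, iters=50):
    """Minimal Lloyd's k-means with random initial centers (illustration only)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center, then recompute centers
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def purity(labels, y):
    """Fraction of points falling in the majority true class of their cluster."""
    return sum(np.bincount(y[labels == c]).max() for c in np.unique(labels)) / len(y)

# Two well-separated toy blobs; average the score over 20 random restarts,
# mirroring the 20-run averaging the paper reports for AMI.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
y = np.repeat([0, 1], 50)
avg_score = np.mean([purity(kmeans(X, 2, seed=s), y) for s in range(20)])
```

Averaging over restarts matters precisely because k-means is sensitive to initialization: a single run can land in a poor local optimum, while the mean over 20 seeds is a more stable summary.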
Hardware Specification — No
LLM Response: "Considering the computational power of GPUs on matrix operation, we use a GPU-based adaptive stochastic gradient descent (SGD) optimizer over mini-batches to speed up the training, thus MAI can be applied on large data. We use Adam in our implementation." The paper only mentions "GPU" generally, without specifying any particular GPU model, CPU, or other hardware details.
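The Adam optimizer the paper says it uses follows the standard update of Kingma & Ba (2015). A generic single-step sketch (not the paper's MATLAB/GPU code; the toy objective is hypothetical):

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba 2015) with bias correction."""
    m = b1 * m + (1 - b1) * grad           # exponential moving average of the gradient
    v = b2 * v + (1 - b2) * grad ** 2      # exponential moving average of its square
    m_hat = m / (1 - b1 ** t)              # correct the bias toward zero at small t
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5.
x = np.array([5.0]); m = np.zeros(1); v = np.zeros(1)
for t in range(1, 3001):
    x, m, v = adam_step(x, 2.0 * x, m, v, t, lr=0.05)
```

The per-parameter scaling by `sqrt(v_hat)` is what makes Adam "adaptive," which is presumably why the paper favors it for GPU mini-batch training.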
Software Dependencies — No
LLM Response: "We use Adam in our implementation." The paper names the Adam optimizer but does not provide version numbers for any software dependencies.
Experiment Setup — Yes
LLM Response: "The length of the representation layer in MAI and the hidden layer of Autoencoder is set to 200. The maximum number of iterations is 30 and the batchsize is 200 in MAI. The dimensionalities of M1 and M2 are set to 60 in MAI."
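The reported settings can be collected into a configuration sketch. The values are those quoted above; the key names and dict layout are my own, not taken from the paper's code:

```python
# Hyperparameters quoted from the paper; key names are hypothetical.
MAI_CONFIG = {
    "representation_dim": 200,            # representation layer in MAI (and Autoencoder hidden layer)
    "max_iterations": 30,                 # maximum number of training iterations
    "batch_size": 200,                    # mini-batch size
    "metric_dims": {"M1": 60, "M2": 60},  # dimensionalities of the two metric spaces M1 and M2
}
```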