Deep Collective Inference

Authors: John Moore, Jennifer Neville

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We compare to several alternative methods on seven network datasets. DCI achieves up to a 12% reduction in error compared to the best alternative and a 25% reduction in error on average over all methods, for all label proportions." and, under "Empirical Evaluation: Data": "We employ seven datasets in our evaluations."
Researcher Affiliation | Academia | "John Moore, Jennifer Neville. Departments of Computer Science and Statistics, Purdue University, West Lafayette, IN. Email: moore269, neville@purdue.edu"
Pseudocode | Yes | Algorithm 1: RNNTrain(Xt, Xv, Y, maxItr, performSwap); Algorithm 2: SwapAugMethod(XL, YL); Algorithm 3: DCI_Apply(G, F, Y, VU, VLT, VLV1, VLV2, maxItr, tol, performSwap); Algorithm 4: DCI(G, F, Y, VU, val1%, val2%, maxItr, tol). (A structural sketch of how these signatures compose follows the table.)
Open Source Code | No | The paper does not explicitly state that its source code is open source or provide a link to it.
Open Datasets | Yes | "The Facebook dataset is a snapshot of the Purdue University Facebook network (Pfeiffer III, Neville, and Bennett 2015). The Internet Movie Database (IMDB) is a movie dataset... (Pfeiffer III, Neville, and Bennett 2015). Amazon DVD 20000 is a subset of the Amazon copurchase data gathered by (Leskovec, Adamic, and Huberman 2007). The Patents citation network (Pfeiffer III, Neville, and Bennett 2015)..."
Dataset Splits | Yes | For DCI, 12% and 3% of the nodes of the whole dataset are used for VLV1 and VLV2, respectively, which accounts for a total of 15% for validation. Thus, when the training proportion is 0.2, DCI uses 5% for training, 15% for validation, and 80% for testing. (A sketch of this split arithmetic follows the table.)
Hardware Specification | No | The paper mentions: 'All evaluations are performed using three computer clusters with 20 Xeon cores each and memory ranging from 64gb-256gb ram.' However, it does not specify exact CPU or GPU models.
Software Dependencies | No | The paper states: 'For our implementation, we use Theano under the library known as Blocks (van Merriënboer et al. 2015).' It does not provide specific version numbers for these software components.
Experiment Setup | Yes | We apply Batched Gradient Descent with batch size = 100. The maximum number of epochs for any network is 200. Early stopping checks whether performance on the validation set has improved in the last 10 epochs; if not, training stops early and the model that performed best on the validation set is chosen. DCI was run for 100 collective iterations with the same early stopping criterion. We used w = 10 (number of hidden nodes). (A training-loop sketch reflecting these settings follows the table.)
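
The four algorithm names quoted in the Pseudocode row imply a particular calling structure: DCI splits the labelled nodes into a training set and two validation sets, RNNTrain fits the node classifier (optionally applying SwapAugMethod to the labelled data), and DCI_Apply runs the collective-inference iterations. The sketch below only illustrates that structure under those assumptions; every function body is a toy placeholder (a linear scorer and a label-propagation-style update), guessed for illustration, and is not the paper's RNN-based method.

    import random

    def swap_aug_method(X_L, Y_L):
        # Placeholder for Algorithm 2 (SwapAugMethod); the paper's swap-based
        # augmentation is not reproduced here.
        return X_L, Y_L

    def rnn_train(X_t, X_v, Y, max_itr=200, perform_swap=False):
        # Placeholder for Algorithm 1 (RNNTrain). Returns a toy linear scorer;
        # the paper trains an RNN (Theano/Blocks) with early stopping on X_v.
        if perform_swap:
            X_t, Y = swap_aug_method(X_t, Y)
        w = [random.uniform(-1, 1) for _ in range(len(X_t[0]))]
        return lambda x: sum(wi * xi for wi, xi in zip(w, x))

    def dci_apply(G, F, Y, V_U, V_LT, V_LV1, V_LV2,
                  max_itr=100, tol=1e-4, perform_swap=False):
        # Placeholder for Algorithm 3 (DCI_Apply): train once, then iterate a
        # generic collective update until predictions stop changing (tol) or
        # max_itr (100 collective iterations in the quoted setup) is reached.
        # V_LV2 is carried through but unused in this toy version.
        model = rnn_train([F[v] for v in V_LT], [F[v] for v in V_LV1],
                          [Y[v] for v in V_LT], perform_swap=perform_swap)
        preds = {v: 0.0 for v in V_U}
        for _ in range(max_itr):
            new_preds = {}
            for v in V_U:
                # Blend the node's own score with its neighbours' current
                # predictions: a label-propagation-style stand-in for the
                # paper's use of neighbour information.
                nbrs = [preds.get(u, Y.get(u, 0.0)) for u in G[v]]
                rel = sum(nbrs) / len(nbrs) if nbrs else 0.0
                new_preds[v] = 0.5 * model(F[v]) + 0.5 * rel
            delta = max(abs(new_preds[v] - preds[v]) for v in V_U)
            preds = new_preds
            if delta < tol:
                break
        return preds

    def dci(G, F, Y, V_U, val1=0.12, val2=0.03, max_itr=100, tol=1e-4):
        # Placeholder for Algorithm 4 (DCI): carve the labelled nodes into a
        # training set and two validation sets (VLV1, VLV2), then run DCI_Apply.
        V_L = [v for v in G if v not in V_U]
        n1, n2 = int(val1 * len(G)), int(val2 * len(G))
        V_LV1, V_LV2 = V_L[:n1], V_L[n1:n1 + n2]
        V_LT = V_L[n1 + n2:]
        return dci_apply(G, F, Y, V_U, V_LT, V_LV1, V_LV2, max_itr, tol)

    # Tiny toy graph: four nodes in a path, node "d" unlabeled.
    G = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    F = {v: [1.0, float(i)] for i, v in enumerate(G)}
    Y = {"a": 1.0, "b": 0.0, "c": 1.0}
    print(dci(G, F, Y, V_U=["d"]))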
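
The quoted split description amounts to simple arithmetic over the node set: the two validation sets take 12% and 3% of all nodes, and whatever remains of the labelled proportion is used for training. A minimal sketch, assuming nodes can be shuffled as a flat list (the helper name split_nodes, the shuffling, and the seed are illustrative, not from the paper):

    import random

    def split_nodes(nodes, labeled_prop=0.20, val1_prop=0.12, val2_prop=0.03, seed=0):
        # Partition nodes into train / VLV1 / VLV2 / test. All proportions are
        # fractions of the whole dataset, as in the quoted description; the
        # labelled proportion covers training plus both validation sets.
        rng = random.Random(seed)
        nodes = list(nodes)
        rng.shuffle(nodes)
        n = len(nodes)
        n_val1 = int(round(val1_prop * n))
        n_val2 = int(round(val2_prop * n))
        n_train = int(round(labeled_prop * n)) - n_val1 - n_val2   # 20% - 12% - 3% = 5%
        assert n_train >= 0, "labelled proportion must cover both validation sets"
        train = nodes[:n_train]
        val1 = nodes[n_train:n_train + n_val1]
        val2 = nodes[n_train + n_val1:n_train + n_val1 + n_val2]
        test = nodes[n_train + n_val1 + n_val2:]                    # remaining 80%
        return train, val1, val2, test

    # Example: 1000 nodes -> 50 train, 120 VLV1, 30 VLV2, 800 test.
    tr, v1, v2, te = split_nodes(range(1000))
    print(len(tr), len(v1), len(v2), len(te))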
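
The early-stopping regime in the Experiment Setup row (batch size 100, at most 200 epochs, stop after 10 epochs without validation improvement, keep the model that did best on validation, w = 10 hidden units) is illustrated below on synthetic data with a toy one-hidden-layer NumPy network. Only the loop structure reflects the quoted setup; the data, model, and learning rate are placeholders, since the paper's actual model is an RNN trained with Theano/Blocks.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)

    # Synthetic stand-in data; the paper trains on seven network datasets.
    X_tr, y_tr = rng.normal(size=(1000, 20)), rng.integers(0, 2, 1000).astype(float)
    X_va, y_va = rng.normal(size=(300, 20)), rng.integers(0, 2, 300).astype(float)

    # One hidden layer with w = 10 units, matching the quoted setup (the real
    # model is an RNN; this toy MLP only hosts the training-loop skeleton).
    W1 = rng.normal(scale=0.1, size=(20, 10)); b1 = np.zeros(10)
    W2 = rng.normal(scale=0.1, size=10);       b2 = 0.0

    def forward(X):
        h = np.tanh(X @ W1 + b1)
        return h, sigmoid(h @ W2 + b2)

    def val_loss():
        _, p = forward(X_va)
        return -np.mean(y_va * np.log(p + 1e-9) + (1 - y_va) * np.log(1 - p + 1e-9))

    batch_size, max_epochs, patience, lr = 100, 200, 10, 0.1
    best_loss, best_params = np.inf, None
    stale = 0                                        # epochs since last improvement

    for epoch in range(max_epochs):
        order = rng.permutation(len(X_tr))
        for start in range(0, len(X_tr), batch_size):    # batched gradient descent
            idx = order[start:start + batch_size]
            X, y = X_tr[idx], y_tr[idx]
            h, p = forward(X)
            dz2 = (p - y) / len(y)                       # grad of BCE w.r.t. output logit
            gW2, gb2 = h.T @ dz2, dz2.sum()
            dz1 = np.outer(dz2, W2) * (1.0 - h ** 2)     # back through tanh layer
            gW1, gb1 = X.T @ dz1, dz1.sum(axis=0)
            W1 -= lr * gW1; b1 -= lr * gb1
            W2 -= lr * gW2; b2 -= lr * gb2
        loss = val_loss()
        if loss < best_loss:                             # validation improved
            best_loss = loss
            best_params = (W1.copy(), b1.copy(), W2.copy(), b2)
            stale = 0
        else:
            stale += 1
        if stale >= patience:                            # 10 epochs without improvement
            break

    W1, b1, W2, b2 = best_params                         # keep the best validation model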