Deep Collective Inference
Authors: John Moore, Jennifer Neville
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'We compare to several alternative methods on seven network datasets. DCI achieves up to a 12% reduction in error compared to the best alternative and a 25% reduction in error on average over all methods, for all label proportions.' Also, from the Empirical Evaluation section: 'We employ seven datasets in our evaluations.' |
| Researcher Affiliation | Academia | John Moore, Jennifer Neville. Departments of Computer Science and Statistics, Purdue University, West Lafayette, IN. Email: {moore269, neville}@purdue.edu |
| Pseudocode | Yes | Algorithm 1 RNNTrain(Xt, Xv, Y, maxItr, performSwap); Algorithm 2 SwapAugMethod(XL, YL); Algorithm 3 DCIApply(G, F, Y, VU, VLT, VLV1, VLV2, maxItr, tol, performSwap); Algorithm 4 DCI(G, F, Y, VU, val1%, val2%, maxItr, tol). (A hedged sketch of the collective-inference loop appears after the table.) |
| Open Source Code | No | The paper does not explicitly state that its source code is open-source or provide a link to it. |
| Open Datasets | Yes | The Facebook dataset is a snapshot of the Purdue University Facebook network (Pfeiffer III, Neville, and Bennett 2015). The Internet Movie Database (IMDB) is a movie dataset... (Pfeiffer III, Neville, and Bennett 2015). Amazon DVD 20000 is a subset of the Amazon copurchase data gathered by (Leskovec, Adamic, and Huberman 2007). The Patents citation network (Pfeiffer III, Neville, and Bennett 2015)... |
| Dataset Splits | Yes | For DCI, 12% and 3% of the nodes in the whole dataset are used for VLV1 and VLV2, respectively, for a total of 15% reserved for validation. Thus, when the training proportion is 0.2, DCI uses 5% for training, 15% for validation, and 80% for testing. (See the split sketch after the table.) |
| Hardware Specification | No | The paper mentions: 'All evaluations are performed using three computer clusters with 20 Xeon cores each and memory ranging from 64gb-256gb ram.' However, it does not specify exact CPU or GPU models. |
| Software Dependencies | No | The paper states: 'For our implementation, we use Theano under the library known as Blocks (van Merriënboer et al. 2015).' It does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We apply batched gradient descent with batch size = 100. The maximum number of epochs for any network is 200. Early stopping is used: if performance on the validation set does not improve over the last 10 epochs, training stops early and the model performing best on the validation set is chosen. DCI was run for 100 collective iterations with the same early-stopping criterion. We used w = 10 hidden nodes. (A sketch of this early-stopping loop appears after the table.) |
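
The rows above quote the paper's algorithm signatures, splits, and training settings; the sketches below translate them into minimal, runnable Python. None of this is the authors' code (which is not released), and every function, variable name, and initialization choice is an assumption made for illustration.

First, a plausible reading of the outer collective loop implied by Algorithm 3 (DCIApply): predictions for unlabeled nodes are repeatedly recomputed from their neighbors' current labels or predictions until they stabilize or an iteration cap is reached. The paper's model is an RNN over neighbor sequences; here it is abstracted into a hypothetical `model.predict` call, and the initialization and convergence test are guesses.

```python
def collective_apply(model, graph, labels, unlabeled, max_itr=100, tol=1e-4):
    """Plausible outer loop for collective inference: repeatedly re-predict
    unlabeled nodes from their neighbors' current labels or predictions,
    stopping when predictions stabilize or max_itr is reached."""
    # Initialize unlabeled predictions with a neutral value (assumption; the
    # paper may initialize differently, e.g. from a content-only model).
    preds = {v: 0.5 for v in unlabeled}
    for _ in range(max_itr):
        new_preds = {}
        for v in unlabeled:
            # Current value for each neighbor: true label if known, else the
            # latest prediction.
            neighbor_vals = [labels.get(u, preds.get(u, 0.5)) for u in graph[v]]
            # The paper feeds neighbor sequences to an RNN; abstracted here
            # into a hypothetical model.predict call.
            new_preds[v] = model.predict(v, neighbor_vals)
        # Convergence check: largest change in any prediction below tol.
        delta = max((abs(new_preds[v] - preds[v]) for v in unlabeled), default=0.0)
        preds = new_preds
        if delta < tol:
            break
    return preds
```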
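Next, the split arithmetic from the Dataset Splits row: a 12% + 3% validation carve-out taken from the labeled portion, with the remaining labeled nodes used for training and the unlabeled nodes for testing. The function below is hypothetical and only reproduces the stated percentages.

```python
import random

def dci_split(node_ids, label_proportion, val1_frac=0.12, val2_frac=0.03, seed=0):
    """Illustrative node split: VLV1 and VLV2 are carved out of the labeled
    portion, the remaining labeled nodes train, and unlabeled nodes test.
    Assumes label_proportion >= val1_frac + val2_frac (e.g. 0.2 >= 0.15)."""
    rng = random.Random(seed)
    ids = list(node_ids)
    rng.shuffle(ids)
    n = len(ids)
    n_labeled = int(label_proportion * n)
    n_val1 = int(val1_frac * n)
    n_val2 = int(val2_frac * n)
    labeled, test = ids[:n_labeled], ids[n_labeled:]   # e.g. 20% labeled / 80% test
    val1 = labeled[:n_val1]                            # VLV1: 12% of all nodes
    val2 = labeled[n_val1:n_val1 + n_val2]             # VLV2: 3% of all nodes
    train = labeled[n_val1 + n_val2:]                  # remaining 5% for training
    return train, val1, val2, test

train, val1, val2, test = dci_split(range(10000), label_proportion=0.2)
print(len(train), len(val1), len(val2), len(test))  # 500 1200 300 8000
```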
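Finally, the training settings from the Experiment Setup row (batch size 100, at most 200 epochs, early stopping with patience 10) correspond to a standard mini-batch loop. `make_batches`, `train_step`, and `validate` are hypothetical placeholders, not the paper's Theano/Blocks implementation; the loop only illustrates the stated stopping rule.

```python
import copy

def train_with_early_stopping(model, make_batches, val_data,
                              max_epochs=200, patience=10):
    """Generic mini-batch training with the early-stopping rule described
    above: stop if validation performance has not improved for `patience`
    epochs, and return the model that scored best on the validation set."""
    best_score, best_model, epochs_since_improvement = float("-inf"), None, 0
    for epoch in range(max_epochs):
        for x_batch, y_batch in make_batches():    # yields mini-batches of size 100 (assumption)
            model.train_step(x_batch, y_batch)     # hypothetical gradient update
        score = model.validate(*val_data)          # hypothetical metric; higher is better
        if score > best_score:
            best_score = score
            best_model = copy.deepcopy(model)      # keep a snapshot of the best model
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
            if epochs_since_improvement >= patience:
                break                              # stop after 10 epochs without improvement
    return best_model
```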