Incremental Topological Ordering and Cycle Detection with Predictions
Authors: Samuel McCauley, Benjamin Moseley, Aidin Niaparast, Shikha Singh
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Moreover, we demonstrate empirically that predictions, learned from a very small training dataset, are sufficient to provide significant speed-ups on real datasets. Our experiments show that using predictions significantly speeds up performance over baseline solutions on real temporal data. This paper is also the first empirical evaluation of using predictions on dynamic graph data structures. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Williams College, Williamstown, MA 01267 USA; ²Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA 15213 USA. Correspondence to: Samuel McCauley <sam@cs.williams.edu>, Benjamin Moseley <moseleyb@andrew.cmu.edu>, Aidin Niaparast <aniapara@andrew.cmu.edu>, Shikha Singh <shikha@cs.williams.edu>. |
| Pseudocode | No | The paper describes algorithms in detailed prose but does not contain structured pseudocode blocks or figures labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Our implementation and datasets can be found at https://github.com/AidinNiaparast/Learned-Topological-Order. |
| Open Datasets | Yes | We use real directed temporal networks from the SNAP Large Network Dataset Collection (Leskovec & Krevl, 2014). To obtain the final DAG G, we randomly permute the vertices and only keep the edges that go from smaller to larger positions (this ensures G is acyclic). (See the construction sketch after the table.) |
| Dataset Splits | No | The paper specifies training and test data splits, but does not explicitly mention a separate validation dataset or methodology for its use. |
| Hardware Specification | Yes | We use Python 3.10 on a machine with 11th Gen Intel Core i7 CPU 2.80GHz, 32GB of RAM, 128GB NVMe KIOXIA disk drive, and 64-bit Windows 10 Enterprise to run our experiments. |
| Software Dependencies | No | The paper states 'We use Python 3.10' but does not specify any other software libraries, frameworks, or specific versions of those components that would be necessary for full reproducibility. |
| Experiment Setup | Yes | The last 50% of the data in increasing order of the timestamps is used as the test data in all of the experiments in Table 2. The training data for LDFS is a contiguous subsequence of the data that comes right before the test data. For testing robustness to prediction error, we add noise to the predictions. Finally, we add normal noise with mean 0 and standard deviation SD(noise) = C · SD(predictions) (for some constant C) independently to all of the predictions to obtain our noisy predictions. We repeat the experiment 10 times, each time regenerating the noisy predictions. We use a random permutation of the nodes for the initial levels of the nodes in the DFS II algorithm. For all the experiments on this algorithm, we regenerate this permutation 5 times and report the average of these runs. (A sketch of this split-and-noise setup appears after the table.) |
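
The DAG construction quoted in the "Open Datasets" row is compact enough to sketch. The snippet below is a minimal illustration, not the authors' released code: the function name `build_dag`, the `seed` parameter, and the `(u, v, timestamp)` edge format are assumptions, but the logic follows the quoted procedure of randomly permuting the vertices and keeping only edges that go from a smaller to a larger position.

```python
import random

def build_dag(temporal_edges, seed=0):
    """Hypothetical sketch of the quoted DAG construction.

    temporal_edges: iterable of (u, v, timestamp) tuples, as in the SNAP
    temporal networks. A random permutation of the vertices is drawn, and
    only edges going from a smaller to a larger position are kept, so the
    permutation itself is a valid topological order and the result is acyclic.
    """
    temporal_edges = list(temporal_edges)
    rng = random.Random(seed)

    # Random permutation of all vertices appearing in the edge list.
    vertices = sorted({u for u, _, _ in temporal_edges} | {v for _, v, _ in temporal_edges})
    rng.shuffle(vertices)
    pos = {v: i for i, v in enumerate(vertices)}

    # Keep only "forward" edges; such edges can never close a cycle.
    dag_edges = [(u, v, t) for u, v, t in temporal_edges if pos[u] < pos[v]]

    # Incremental algorithms then receive the surviving edges in timestamp order.
    dag_edges.sort(key=lambda edge: edge[2])
    return dag_edges
```

Because every retained edge points from a smaller to a larger permutation position, the permutation is a topological order of the resulting graph, which is why acyclicity is guaranteed.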
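
The split-and-noise protocol in the "Experiment Setup" row can likewise be sketched. This is a hypothetical reconstruction under stated assumptions: the helper name `split_and_perturb`, the use of NumPy, and returning the pre-test data as a pool (from which the contiguous LDFS training window immediately preceding the test data is taken) are not from the paper.

```python
import numpy as np

def split_and_perturb(edges_by_time, predictions, C, rng=None):
    """Hypothetical sketch of the quoted experimental setup.

    edges_by_time: edge list already sorted by timestamp. The last 50% is
    the test sequence; the data before it is the pool from which the
    contiguous LDFS training window is drawn. Each prediction is perturbed
    independently with mean-0 normal noise of standard deviation
    C * SD(predictions), matching the robustness experiments.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Last 50% of the timestamp-ordered data is the test data.
    mid = len(edges_by_time) // 2
    train_pool, test_edges = edges_by_time[:mid], edges_by_time[mid:]

    # Noisy predictions for the robustness experiments.
    preds = np.asarray(predictions, dtype=float)
    noise = rng.normal(loc=0.0, scale=C * preds.std(), size=preds.shape)
    noisy_predictions = preds + noise

    return train_pool, test_edges, noisy_predictions
```

In the reported setup the noisy predictions are regenerated 10 times per setting and results are averaged, which a caller could reproduce by invoking the helper in a loop with fresh random draws.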