Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

Random Search Neural Networks for Efficient and Expressive Graph Learning

Authors: Michael Ito, Danai Koutra, Jenna Wiens

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, RSNNs consistently outperform RWNNs on molecular and protein benchmarks, achieving comparable or superior performance with up to 16× fewer sampled sequences. Our work bridges theoretical and practical advances in random walk based approaches, offering an efficient and expressive framework for learning on sparse graphs.
Researcher Affiliation	Academia	Michael Ito University of Michigan EMAIL Danai Koutra University of Michigan EMAIL Jenna Wiens University of Michigan EMAIL
Pseudocode	Yes	Algorithm 1: Uniform Random Walk with Positional Encodings. Input: Graph G = (V, E), walk length ℓ, window size s. Output: Random walk W = (w_0, ..., w_ℓ), encodings id^s_W, adj^s_W. Algorithm 2: Random Depth-First Search with Adjacency Encodings. Input: Graph G = (V, E), window size s. Output: Search sequence W = (w_0, ..., w_ℓ), adjacency encoding adj^s_W.
Open Source Code	Yes	Code can be found at: https://github.com/MLD3/Random Search NNs
Open Datasets	Yes	Specifically, we evaluate on four small-scale molecular graph classification datasets from MoleculeNet [32]: CLINTOX, SIDER, TOX21, and BBBP. ... We also include four protein graph classification datasets from ProteinShake [33]: EC Subclass, EC Mechanism, SC Class, and SC Family. ... To assess scalability, we evaluate on large-scale molecular benchmarks with hundreds of thousands of graphs from Open Graph Benchmark [34]: PCBA-1030, PCBA-1458, and PCBA-4467.
Dataset Splits	Yes	We report median (min, max) performance over five random splits (60/20/20), which is more robust than mean and standard deviation for small sample sizes.
Hardware Specification	Yes	All models are trained on a machine equipped with 8 NVIDIA GeForce GTX 1080 Ti GPUs; if a model does not converge within 24 hours, we omit it from evaluation.
Software Dependencies	No	The paper mentions types of sequence models (GRU, LSTM, Transformer) and an optimizer (Adam) but does not provide specific version numbers for any software dependencies like Python, PyTorch, or CUDA.
Experiment Setup	Yes	We perform a grid search over the following hyperparameters for all RWNN and RSNN models: Number of layers: {1, 2, 3}; Learning rate: {0.05, 0.01, 0.005, 0.001}; Batch size: {32, 64, 128}; Hidden dimension: {32, 64, 128}; Global pooling: {mean, sum, max}; Sequence model: {GRU, LSTM, Transformer}; Number of samples m: {1, 4, 8, 16}. We fix the window size s = 8 for both CRAWL and RSNN models. All models are optimized using the Adam optimizer [42].
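
To make the Experiment Setup and Dataset Splits rows concrete, the following is a minimal sketch of how the quoted search space and evaluation protocol could be written down. It is not the authors' code. The dictionary keys, the function names (`iter_configs`, `random_split`, `summarize`), and the seeding scheme are all hypothetical, and the quoted text does not say how the grid is traversed or how the splits are drawn.

```python
import itertools
import random
import statistics

# Search space copied from the Experiment Setup row; key names are hypothetical.
GRID = {
    "num_layers": [1, 2, 3],
    "learning_rate": [0.05, 0.01, 0.005, 0.001],
    "batch_size": [32, 64, 128],
    "hidden_dim": [32, 64, 128],
    "global_pooling": ["mean", "sum", "max"],
    "sequence_model": ["GRU", "LSTM", "Transformer"],
    "num_samples": [1, 4, 8, 16],
}
# Settings the quoted text describes as fixed rather than searched.
FIXED = {"window_size": 8, "optimizer": "Adam"}


def iter_configs(grid=GRID, fixed=FIXED):
    """Yield one dict per point of the full Cartesian grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield {**dict(zip(keys, values)), **fixed}


def random_split(n, seed, fractions=(0.6, 0.2, 0.2)):
    """Shuffle indices 0..n-1 and cut them into train/val/test parts.

    Assumption: a plain seeded shuffle; the paper may split differently.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def summarize(scores):
    """Return (median, min, max), the summary format quoted in the table."""
    return statistics.median(scores), min(scores), max(scores)
```

Taken as a full Cartesian product, the listed grid has 3 × 4 × 3 × 3 × 3 × 3 × 4 = 3,888 combinations per model and dataset. Calling `random_split(n, seed)` with five different seeds and passing the five resulting test scores to `summarize` would give a median (min, max) triple in the format the Dataset Splits row describes.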