Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
GRAIL: Graph Edit Distance and Node Alignment using LLM-Generated Code
Authors: Samidha Verma, Arushi Goyal, Ananya Mathur, Ankit Anand, Sayan Ranu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on seven datasets confirm that GRAIL not only surpasses state-of-the-art GED approximation methods in prediction quality but also achieves robust cross-domain generalization across diverse graph distributions. |
| Researcher Affiliation | Collaboration | ¹Yardi School of Artificial Intelligence, IIT Delhi, India; ²Department of Computer Science and Engineering, IIT Delhi, India; ³Google DeepMind, Montreal, Canada. Correspondence to: Samidha Verma <EMAIL>, Arushi Goyal <EMAIL>, Ananya Mathur <EMAIL>, Ankit Anand <EMAIL>, Sayan Ranu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 The greedy approach. Require: Train data T = {T_1, ..., T_n} where T_t = (G_t, G′_t) is a pair of graphs, budget b. Ensure: solution set A_greedy, \|A_greedy\| = b. 1: A_greedy ← ∅ 2: while size(A_greedy) < b (within budget) do 3: P* ← arg max_{P ∈ D ∖ A_greedy} {J(A_greedy) − J(A_greedy ∪ {P})} 4: A_greedy ← A_greedy ∪ {P*} 5: Return A_greedy |
| Open Source Code | Yes | The codebase of GRAIL and the programs generated for the various datasets are available at https://github.com/idea-iitd/Grail. |
| Open Datasets | Yes | While AIDS, Linux and IMDB are obtained from Morris et al. (2020), the other four datasets are made available by Hu et al. (2021). |
| Dataset Splits | Yes | To construct the test set for a particular dataset, we select 1000 graph pairs uniformly at random and compute their true GED. ... Neural Algorithms: All neural approaches are trained on 10,000 graph pairs per dataset. ... GRAIL and GRAIL-MIX: GRAIL is trained with only 1,000 graph pairs per dataset. |
| Hardware Specification | Yes | All experiments ran on a machine equipped with an Intel Xeon Gold 6142 CPU @1GHz and a GeForce GTX 1080 Ti GPU. |
| Software Dependencies | No | For the LLM, we use Gemini 1.5 Pro. In particular, we have used the initial stable version of Gemini 1.5 Pro, i.e., gemini-1.5-pro-001, which was released on May 24, 2024. No other software dependencies with specific version numbers are provided for the implementation itself. |
| Experiment Setup | Yes | Hyper-parameters: Table H lists the hyper-parameters used for GRAIL. k stands for the number of functions per response generated by the LLM and b is the function budget employed for submodularity while training. ... Table H. Hyper-parameters used for GRAIL: k = 2; b = 15; number of islands = 5; temperature = 0.99; algorithm for bipartite matching = Neighbor-biased mapper (He & Singh, 2006). |
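
The greedy selection quoted in the Pseudocode row can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that J(A) is the sum, over training pairs, of the smallest GED estimate produced by any selected program, and that an empty selection falls back to a per-pair `baseline` upper bound. The names `costs`, `baseline`, `objective`, and `greedy_select` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Sequence


def objective(selected: Sequence[str],
              costs: Dict[str, List[float]],
              baseline: List[float]) -> float:
    """Assumed J(A): sum over training pairs of the best (lowest) estimate.

    costs[p][t] is the GED estimate of program p on training pair t.
    baseline[t] is a fallback upper bound used when no program improves on it.
    """
    total = 0.0
    for t, base in enumerate(baseline):
        best = min((costs[p][t] for p in selected), default=base)
        total += min(best, base)
    return total


def greedy_select(costs: Dict[str, List[float]],
                  baseline: List[float],
                  budget: int) -> List[str]:
    """Pick up to `budget` programs, each time maximizing J(A) - J(A + {P})."""
    selected: List[str] = []
    remaining = set(costs)
    while len(selected) < budget and remaining:
        current = objective(selected, costs, baseline)
        # sorted() makes tie-breaking deterministic (first name wins).
        best_program = max(
            sorted(remaining),
            key=lambda p: current - objective(selected + [p], costs, baseline),
        )
        selected.append(best_program)
        remaining.remove(best_program)
    return selected


if __name__ == "__main__":
    # Toy data: three candidate programs evaluated on three training pairs.
    toy_costs = {"p0": [3, 9, 9], "p1": [9, 2, 9], "p2": [4, 4, 4]}
    toy_baseline = [10, 10, 10]
    print(greedy_select(toy_costs, toy_baseline, budget=2))
```

On the toy data, the generalist `p2` is chosen first because it lowers the objective the most, and `p1` follows because it adds the largest remaining improvement. With the reported setting b = 15, the same loop would run until 15 programs are selected or the candidate pool is exhausted.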