Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Learning Graph Quantized Tokenizers

Authors: Limei Wang, Kaveh Hassani, Si Zhang, Dongqi Fu, Baichuan Yuan, Weilin Cong, Zhigang Hua, Hao Wu, Ning Yao, Bo Long

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through extensive experiments on both homophilic and heterophilic datasets, including large-scale and long-range benchmarks, we demonstrate that our tokenizer enables Transformer encoders to achieve state-of-the-art performance on 20 out of 22 benchmarks while substantially reducing the memory footprint of the embeddings. 6 EXPERIMENTS We evaluate GQT on both medium- and large-scale graph learning tasks, encompassing 22 homophilic, heterophilic, and long-range benchmarks.
Researcher Affiliation	Academia	The paper does not provide explicit institutional affiliations or email addresses for the authors. Only author names are listed at the beginning of the paper: 'Limei Wang, Kaveh Hassani, Si Zhang, Dongqi Fu, Baichuan Yuan, Weilin Cong, Zhigang Hua, Hao Wu, Ning Yao, Bo Long'. Therefore, it is impossible to classify the affiliation type based on the provided text.
Pseudocode	Yes	B MODEL DETAILS Algorithm 1 Graph Tokenizer 1: Input: Graph g = (V, E, X), Graph Encoder GNNθ, Residual Quantizer RQΦ, BGRL Loss RQΦ
Open Source Code	Yes	The implementation is publicly available at https://github.com/limei0307/GQT.
Open Datasets	Yes	We use four datasets from the Long-Range Graph Benchmark (LRGB) (Dwivedi et al., 2022b) ... Homophilic Node Classification. We use eight medium-scale homophilic datasets including: CoraFull (Bojchevski & Günnemann, 2017), CiteSeer, PubMed (Yang et al., 2016), Amazon Computers, Amazon Photos, Co-author CS, Co-author Physics (Shchur et al., 2018), and WikiCS (Mernyei & Cangea, 2020). ... All datasets are publicly available.
Dataset Splits	Yes	For CoraFull, Pubmed, PubMed, Computer, Photo, CS, and Physics, we follow previous work and use 60%/20%/20% train/valid/test split. For WikiCS, we follow the official split in Mernyei & Cangea (2020). For Squirrel, Chameleon, Amazon-Ratings, Roman-Empire, Minesweeper, and Questions, we follow the splits in Platonov et al. (2023). For ogbn-proteins, ogbn-arxiv, and ogbn-products, we follow the splits in Hu et al. (2020a). For pokec, we follow the split used in Lim et al. (2021). For Peptides-Func, Peptides-Struct, COCO-SP, and PCQM-Contact, we follow the split provided in Dwivedi et al. (2022b).
Hardware Specification	Yes	All experiments are conducted on a single Nvidia A100 GPU.
Software Dependencies	No	GQT is implemented using PyTorch, PyG, DGL, and the vector-quantize-pytorch package. Most datasets can be accessed through PyG and DGL. The paper lists software libraries used (PyTorch, PyG, DGL, vector-quantize-pytorch) but does not provide specific version numbers for these dependencies.
Experiment Setup	Yes	D EXPERIMENTAL SETUP ... We provide the hyperparameters and experimental details for each part below. ... tune the number of layers from {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and hidden dimensions from {128, 256, 512, 1024}. For the quantizer, we use residual-VQ (RVQ) (Lee et al., 2022) and tune the number of codebooks from {1, 2, 3, 6, 9} and the codebook size from {128, 256, 512, 1024, 2048, 4096}. ... For the Transformer model, we use the Transformer Encoder module in PyTorch as our backbone, and tune the number of layers from {1, 2, 3, 4, 5, 6}, the number of heads from {4, 8}, and the feedforward dimension from {512, 1024, 2048}.
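
To give a sense of the size of the search spaces quoted in the Experiment Setup row, the following is a minimal Python sketch that enumerates them. It is an illustration only, not the authors' tuning code. The key names (`encoder`, `num_layers`, `ffn_dim`, and so on) and the helper functions are hypothetical. The excerpt does not name the component that the first grid belongs to, so the `encoder` label is an assumption. The excerpt also does not say whether the grids were searched exhaustively, and its ellipses may hide further hyperparameters.

```python
from itertools import product

# Value sets copied from the quoted excerpt. The excerpt contains elisions,
# so these grids may be partial. All key names are hypothetical labels.
SEARCH_SPACE = {
    # The excerpt does not name this component; "encoder" is an assumption.
    "encoder": {
        "num_layers": list(range(1, 11)),
        "hidden_dim": [128, 256, 512, 1024],
    },
    "quantizer": {
        "num_codebooks": [1, 2, 3, 6, 9],
        "codebook_size": [128, 256, 512, 1024, 2048, 4096],
    },
    "transformer": {
        "num_layers": [1, 2, 3, 4, 5, 6],
        "num_heads": [4, 8],
        "ffn_dim": [512, 1024, 2048],
    },
}


def iter_configs(grid):
    """Yield one dict per combination of values in a single component's grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_sizes(space):
    """Count the combinations in each component's grid."""
    return {
        component: sum(1 for _ in iter_configs(grid))
        for component, grid in space.items()
    }


if __name__ == "__main__":
    for component, size in grid_sizes(SEARCH_SPACE).items():
        print(f"{component}: {size} configurations")
```

Under these assumptions the three grids contain 40, 30, and 36 combinations respectively, so a joint exhaustive search over all three would cover 43,200 configurations per dataset. That figure is an upper bound on the quoted ranges, not a statement about how many runs the paper performed.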