Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning

Authors: Yuxin Tang, Zhimin Ding, Dimitrije Jankov, Binhang Yuan, Daniel Bourgeois, Chris Jermaine

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show experimentally that a relational engine running an auto-differentiated relational algorithm can easily scale to very large datasets, and is competitive with state-of-the-art, special-purpose systems for large-scale distributed machine learning.
Researcher Affiliation | Academia | 1Department of Computer Science, Rice University, Houston, US; 2ETH Zurich, Switzerland.
Pseudocode | Yes | Algorithm 1 Chain Rule(v_i, v_j, Q_{R_j}, R_1, ..., R_k) ... Algorithm 2 RAAutoDiff(Q, In_1, In_2, ...). (An illustrative chain-rule sketch follows the table.)
Open Source Code | Yes | We provide a simple Python tool that can be used for RA auto-differentiation: https://github.com/anonymous-repo-33/relation-algebra-autodiff
Open Datasets | Yes | This GCN is benchmarked using the datasets in Table 1: ogbn-arxiv (0.2M, 1.1M), ogbn-products (0.1M, 39M), ogbn-papers100M (0.1B, 1.6B), friendster (65.6M, 3.6B) ... We train our KGE model on the Freebase data set. Freebase (Chah, 2017) contains 1.9 billion triples in RDF format.
Dataset Splits | Yes | We split the dataset into a training set (90%), a validation set (5%), and a testing set (5%).
Hardware Specification | Yes | Experiments are run on AWS, using m5.4xlarge instances with 20 cores, 64GB DDR4 memory, and 1TB general-purpose SSD.
Software Dependencies | Yes | DGL is built from scratch from the latest version, 0.9.
Experiment Setup | Yes | The Adam optimizer is used with learning rate η = 0.1; the dropout rate γ = 0.5; the hidden layer dimension D = 256; batch size B = 1024. (A hedged configuration sketch follows below.)
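The pseudocode row above refers to Algorithm 1 (Chain Rule) and Algorithm 2 (RAAutoDiff), which apply reverse-mode differentiation to a relational-algebra query graph. The sketch below is only a minimal illustration of the kind of chain-rule accumulation those algorithms perform, written over a tiny dense operator graph rather than over relations; the class and function names (Node, matmul, sum_all, backward) are hypothetical and do not come from the paper or its repository.

```python
# Minimal, illustrative reverse-mode chain-rule accumulation over a tiny
# operator graph. NOT the paper's relational-algebra implementation; it only
# shows the per-edge chain rule (Algorithm 1's role) and the whole-graph
# backward sweep (Algorithm 2's role) in dense-matrix form.
import numpy as np

class Node:
    def __init__(self, value, parents=(), vjps=()):
        self.value = value        # forward result
        self.parents = parents    # upstream nodes this node was computed from
        self.vjps = vjps          # one vector-Jacobian product per parent
        self.grad = None          # accumulated adjoint (gradient of the output)

def matmul(a, b):
    # Forward: C = A @ B.  Chain rule: dA = dC @ B^T, dB = A^T @ dC.
    return Node(a.value @ b.value,
                parents=(a, b),
                vjps=(lambda g: g @ b.value.T, lambda g: a.value.T @ g))

def sum_all(a):
    # Forward: scalar sum of all entries.  Chain rule: gradient broadcasts back.
    return Node(a.value.sum(), parents=(a,),
                vjps=(lambda g: g * np.ones_like(a.value),))

def backward(out):
    # Reverse sweep: apply each node's chain rule and accumulate adjoints into
    # its parents. A plain stack is a valid reverse order for this tree-shaped
    # example; a general DAG would need a reverse topological order.
    out.grad = np.ones_like(out.value)
    stack = [out]
    while stack:
        node = stack.pop()
        for parent, vjp in zip(node.parents, node.vjps):
            contrib = vjp(node.grad)
            parent.grad = contrib if parent.grad is None else parent.grad + contrib
            stack.append(parent)

# Usage: differentiate sum(A @ B) with respect to A and B.
A = Node(np.random.rand(3, 4))
B = Node(np.random.rand(4, 2))
loss = sum_all(matmul(A, B))
backward(loss)
print(A.grad.shape, B.grad.shape)  # (3, 4) (4, 2)
```

In the paper's setting the nodes are relational-algebra operators (joins, aggregations) over relations that encode matrix or tensor chunks, and the output of differentiation is itself a relational query that a database engine can optimize and execute at scale.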
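The experiment-setup row reports the hyperparameters used for the GCN benchmark. The snippet below is a minimal PyTorch sketch that wires those reported values (Adam, η = 0.1, dropout 0.5, hidden dimension 256, batch size 1024) into a generic two-layer GCN; the module names, the dense adjacency stand-in, and the input/output dimensions are assumptions for illustration only, since the paper's system expresses the same computation in relational algebra rather than in PyTorch.

```python
# Hedged sketch: reported hyperparameters applied to a generic two-layer GCN.
# Dimensions (in_dim=128, num_classes=40) and the dense A_hat are hypothetical.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    # Dense stand-in for a graph convolution: H' = A_hat @ (H @ W).
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, a_hat, h):
        return a_hat @ self.weight(h)

class GCN(nn.Module):
    def __init__(self, in_dim, hidden_dim=256, num_classes=40, dropout=0.5):
        super().__init__()
        self.layer1 = GCNLayer(in_dim, hidden_dim)    # D = 256
        self.layer2 = GCNLayer(hidden_dim, num_classes)
        self.dropout = nn.Dropout(dropout)            # gamma = 0.5

    def forward(self, a_hat, x):
        h = torch.relu(self.layer1(a_hat, x))
        h = self.dropout(h)
        return self.layer2(a_hat, h)

model = GCN(in_dim=128)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # eta = 0.1
batch_size = 1024                                         # B = 1024
```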