Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning
Authors: Yuxin Tang, Zhimin Ding, Dimitrije Jankov, Binhang Yuan, Daniel Bourgeois, Chris Jermaine
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show experimentally that a relational engine running an auto-differentiated relational algorithm can easily scale to very large datasets, and is competitive with state-of-the-art, special-purpose systems for large-scale distributed machine learning. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Rice University, Houston, US 2ETH Zurich, Switzerland. |
| Pseudocode | Yes | Algorithm 1 ChainRule(v_i, v_j, Q_{R_j}, R_1, ..., R_k) ... Algorithm 2 RAAutoDiff(Q, In_1, In_2, ...) |
| Open Source Code | Yes | We provide a simple Python tool that can be used for RA auto-differentiation: https://github.com/anonymous-repo-33/relation-algebra-autodiff |
| Open Datasets | Yes | This GCN is benchmarked using the datasets in Table 1: ogbn-arxiv (0.2M, 1.1M); ogbn-products (0.1M, 39M); ogbn-papers100M (0.1B, 1.6B); friendster (65.6M, 3.6B) ... We train our KGE model on the Freebase data set. Freebase (Chah, 2017) contains 1.9 billion triples in RDF format. |
| Dataset Splits | Yes | We split the dataset into a training set (90%), a validation set (5%), and a testing set (5%). |
| Hardware Specification | Yes | Experiments are run on AWS, using m5.4xlarge instances with 20 cores, 64GB DDR4 memory, and 1TB general SSD. |
| Software Dependencies | Yes | DGL is built from scratch from the latest version, 0.9. |
| Experiment Setup | Yes | The Adam optimizer is used with learning rate η = 0.1; the dropout rate γ = 0.5; the hidden layer dimension D = 256; batch size B = 1024. |
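
The Pseudocode row only quotes the signatures of Algorithm 1 (ChainRule) and Algorithm 2 (RAAutoDiff). The toy sketch below is not the paper's algorithm; it is a minimal illustration, under the assumption of a (row, col, val) tuple encoding of matrices, of the underlying idea that the derivative of a relational join-and-aggregate is again a join-and-aggregate, so reverse-mode differentiation can stay inside relational algebra.

```python
# Toy illustration (not the paper's Algorithm 1/2): matrices stored as relations
# of (row, col, val) tuples; C = A @ B becomes a join on the shared key followed
# by a SUM aggregation, and the backward pass dA = dC @ B^T reuses the same
# two relational operators.
from collections import defaultdict

def join_aggregate(left, right):
    """SELECT l.row, r.col, SUM(l.val * r.val)
       FROM left l JOIN right r ON l.col = r.row
       GROUP BY l.row, r.col"""
    by_key = defaultdict(list)
    for r_row, r_col, r_val in right:
        by_key[r_row].append((r_col, r_val))
    out = defaultdict(float)
    for l_row, l_col, l_val in left:
        for r_col, r_val in by_key[l_col]:
            out[(l_row, r_col)] += l_val * r_val
    return [(i, j, v) for (i, j), v in out.items()]

def transpose(rel):
    return [(j, i, v) for i, j, v in rel]

# Forward: C = A @ B as a join-aggregate over tuple relations.
A = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0)]
B = [(0, 0, 4.0), (1, 0, 5.0)]
C = join_aggregate(A, B)

# Backward: given dL/dC, the gradient w.r.t. A is another join-aggregate,
# dL/dA = dL/dC @ B^T, so the backward pass is itself a relational query.
dC = [(i, j, 1.0) for i, j, _ in C]   # pretend upstream gradient of ones
dA = join_aggregate(dC, transpose(B))
print(sorted(dA))
```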
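
The Experiment Setup row quotes the hyperparameters but not the surrounding training code. The minimal PyTorch sketch below only wires those quoted values together (Adam with learning rate 0.1, dropout 0.5, hidden dimension 256, batch size 1024); the dense GCN layer, the input dimension, and the number of classes are placeholder assumptions, not the authors' relational implementation.

```python
# Minimal sketch, assuming a dense toy GCN layer H' = ReLU(A_hat @ H @ W).
# Only lr=0.1, dropout=0.5, hidden=256, and batch=1024 come from the paper;
# in_dim and num_classes below are placeholders.
import torch
import torch.nn as nn

class ToyGCN(nn.Module):
    def __init__(self, in_dim, hidden_dim=256, num_classes=40, dropout=0.5):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden_dim)
        self.w2 = nn.Linear(hidden_dim, num_classes)
        self.dropout = nn.Dropout(dropout)

    def forward(self, a_hat, x):
        # a_hat: normalized adjacency matrix, x: node feature matrix
        h = torch.relu(a_hat @ self.w1(x))
        h = self.dropout(h)
        return a_hat @ self.w2(h)

model = ToyGCN(in_dim=128)                                # placeholder input dim
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # eta = 0.1
batch_size = 1024                                         # B = 1024
```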