Computing Higher Order Derivatives of Matrix and Tensor Expressions
Authors: Sören Laue, Matthias Mitterreiter, Joachim Giesen
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show a speedup of up to two orders of magnitude over state-of-the-art frameworks when evaluating higher order derivatives on CPUs, and a speedup of about three orders of magnitude on GPUs (a minimal sketch of the benchmarked task follows the table). |
| Researcher Affiliation | Academia | Sören Laue, Friedrich-Schiller-Universität Jena, Germany, soeren.laue@uni-jena.de; Matthias Mitterreiter, Friedrich-Schiller-Universität Jena, Germany, matthias.mitterreiter@uni-jena.de; Joachim Giesen, Friedrich-Schiller-Universität Jena, Germany, joachim.giesen@uni-jena.de |
| Pseudocode | No | The paper includes figures illustrating expression DAGs and tables detailing steps, but no formally labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | An interface to our framework for computing vector and matrix derivatives is available online at www.MatrixCalculus.org. |
| Open Datasets | No | The paper mentions problem types (quadratic functions, logistic regression, matrix factorization) and sets parameters (m=2n, k=5) but does not provide details on specific publicly available datasets or their access information. |
| Dataset Splits | No | The paper does not explicitly provide information about training/test/validation dataset splits. |
| Hardware Specification | Yes | The experiments were run in a pure CPU setting (Intel Xeon E5-2686, four cores) as well as in a pure GPU setting (NVIDIA Tesla V100), except for autograd, which does not provide GPU support. |
| Software Dependencies | Yes | We compare our framework to the state-of-the-art automatic differentiation frameworks TensorFlow 1.10, PyTorch 0.4, Theano 1.0, and HIPS autograd 1.2, used with Python 3.6 and all linked against Intel MKL. |
| Experiment Setup | Yes | We set m = 2n in the experiments. For the matrix factorization experiments, we set k = 5 and compute the gradient and Hessian with respect to U (see the factorization sketch after the table). |
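
To make the benchmarked task concrete, here is a minimal sketch of evaluating the gradient and full Hessian of a quadratic objective in PyTorch, one of the compared frameworks. The objective f(x) = xᵀAx, the problem size n, and the timing harness are illustrative assumptions, not the paper's benchmark code.

```python
# Hypothetical reproduction sketch: gradient and Hessian of the
# quadratic objective f(x) = x^T A x, one of the problem types the
# paper benchmarks. Sizes and setup are assumptions for illustration.
import time

import torch
from torch.autograd.functional import hessian

n = 1000
A = torch.randn(n, n)
A = 0.5 * (A + A.T)  # symmetrize, so the Hessian is exactly 2A

def f(x):
    return x @ A @ x

x = torch.randn(n, requires_grad=True)

# First-order derivative via reverse-mode autodiff.
(grad,) = torch.autograd.grad(f(x), x)

# Second-order derivative: general-purpose AD frameworks assemble the
# Hessian one row at a time, which is the per-row overhead the paper's
# vectorized approach is reported to avoid.
t0 = time.perf_counter()
H = hessian(f, x.detach())
print(f"Hessian {tuple(H.shape)} computed in {time.perf_counter() - t0:.3f}s")
```

Timing this against the paper's framework on the same hardware would reproduce the CPU comparison; the GPU runs additionally require moving `A` and `x` to a CUDA device.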
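
For the matrix factorization setup, a corresponding sketch with k = 5 and derivatives taken with respect to U. The summary does not state the exact objective, so a plain least-squares factorization ‖X − UV‖²_F is assumed here purely for illustration; the shapes follow the reported m = 2n.

```python
# Hypothetical sketch of the matrix factorization experiment: gradient
# and Hessian with respect to U. The objective ||X - U V||_F^2 and the
# sizes below are assumptions, not taken from the paper.
import torch
from torch.autograd.functional import hessian

n, k = 100, 5
m = 2 * n  # m = 2n as in the reported setup
X = torch.randn(n, m)
V = torch.randn(k, m)  # held fixed; we differentiate w.r.t. U only

def f(U):
    return ((X - U @ V) ** 2).sum()

U = torch.randn(n, k, requires_grad=True)

(grad_U,) = torch.autograd.grad(f(U), U)  # shape (n, k)
H = hessian(f, U.detach())                # shape (n, k, n, k)
print(grad_U.shape, H.shape)
```

The fourth-order Hessian tensor here is what makes higher order derivatives expensive in the compared frameworks, and what the paper's approach evaluates faster, per the speedup figures above.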