A Deep Conjugate Direction Method for Iteratively Solving Linear Systems

Authors: Ayano Kaneda, Osman Akar, Jingyu Chen, Victoria Alicia Trevino Kala, David Hyde, Joseph Teran

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the efficacy of our approach on spatially discretized Poisson equations, which arise in computational fluid dynamics applications, with millions of degrees of freedom. Unlike state-of-the-art learning approaches, our algorithm is capable of reducing the linear system residual to a given tolerance in a small number of iterations, independent of the problem size. Moreover, our method generalizes effectively to various systems beyond those encountered during training."
Researcher Affiliation | Academia | "1 Department of Applied Physics, Waseda University, Tokyo, Japan; 2 Department of Mathematics, University of California, Los Angeles, USA; 3 Department of Computer Science, Vanderbilt University, Nashville, USA; 4 Department of Mathematics, University of California, Davis, USA."
Pseudocode | Yes | "Algorithm 1 DCDM" (a sketch of the DCDM iteration appears after this table).
Open Source Code | Yes | "We release our code, data, and pre-trained models so users can immediately apply DCDM to Poisson systems without further dataset generation or training, especially due to the feasibility of pretrained weights for inference at different grid resolutions: https://github.com/ayano721/2023_DCDM."
Open Datasets | Yes | "We create the training dataset D ⊂ span(A_train) ∩ S^{n−1} of size 20,000 generated from 10,000 Rayleigh-Ritz vectors. ... We release our code, data, and pre-trained models so users can immediately apply DCDM to Poisson systems without further dataset generation or training, especially due to the feasibility of pretrained weights for inference at different grid resolutions: https://github.com/ayano721/2023_DCDM." (A data-generation sketch follows the table.)
Dataset Splits | No | The paper mentions "Training and validation losses" and that "the model from the third epoch was optimal for 128^3", implying the use of a validation set. However, it does not specify explicit dataset splits (e.g., percentages or sample counts) for training, validation, or testing.
Hardware Specification | Yes | "Training is done with standard deep learning techniques, more precisely back-propagation and the ADAM optimizer (Kingma & Ba, 2015) (with starting learning rate 0.0001). ... All examples were run on a workstation with dual stock AMD EPYC 75F3 processors and an NVIDIA RTX A6000 GPU with 48GB memory."
Software Dependencies | No | The paper mentions using TensorFlow and SciPy but does not specify their version numbers. For example: "We train our model with TensorFlow (Abadi et al., 2015)" and "We used SciPy's (Virtanen et al., 2020) sparse.linalg.spsolve_triangular function". (A usage example follows the table.)
Experiment Setup | Yes | "Training is done with standard deep learning techniques, more precisely back-propagation and the ADAM optimizer (Kingma & Ba, 2015) (with starting learning rate 0.0001)." (A training-configuration sketch follows the table.)
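
For readers who want the shape of Algorithm 1 (DCDM) in code, the following is a minimal NumPy sketch of the iteration, assuming `model` is a callable that maps a normalized residual to a search direction (a convolutional network in the paper). The names and the two-direction orthogonalization window are our reading of the algorithm, not the authors' released implementation; see the repository linked above for the real code.

```python
import numpy as np

def dcdm(A, b, model, x0=None, tol=1e-4, max_iters=100):
    """Minimal sketch of one deep conjugate direction method (DCDM) run.

    `model` maps a unit-norm residual to a candidate search direction;
    everything here is an illustration under that assumption.
    """
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    norm_b = np.linalg.norm(b)
    prev_dirs = []  # recent search directions kept for A-orthogonalization

    for _ in range(max_iters):
        if np.linalg.norm(r) <= tol * norm_b:
            break
        # The network predicts a direction from the normalized residual.
        d = model(r / np.linalg.norm(r))
        # A-orthogonalize against the two most recent directions.
        for p in prev_dirs:
            d = d - (d @ (A @ p)) / (p @ (A @ p)) * p
        alpha = (d @ r) / (d @ (A @ d))  # exact line search along d
        x = x + alpha * d
        r = r - alpha * (A @ d)
        prev_dirs = (prev_dirs + [d])[-2:]
    return x
```

As a sanity check, substituting `model = lambda r: r` turns the loop into a plain residual-driven conjugate-direction solver.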
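The Rayleigh-Ritz vectors mentioned in the Open Datasets row are approximate eigenvectors of A produced by a Lanczos-type process. One plausible way to draw unit-norm training vectors from their span is sketched below using `scipy.sparse.linalg.eigsh` (which runs Lanczos internally); the sampling scheme and the helper `make_training_vectors` are assumptions for illustration, and the authors' actual procedure (10,000 Ritz vectors yielding 20,000 samples) may differ in detail.

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def make_training_vectors(A, num_ritz=10, num_samples=20, seed=0):
    """Hypothetical data-generation sketch: draw unit-norm vectors
    from the span of Rayleigh-Ritz vectors of A. Not the authors'
    exact scheme."""
    rng = np.random.default_rng(seed)
    # eigsh runs a Lanczos iteration; its eigenvector estimates are
    # Rayleigh-Ritz vectors of A.
    _, ritz = eigsh(A, k=num_ritz)
    samples = []
    for _ in range(num_samples):
        v = ritz @ rng.standard_normal(num_ritz)  # random vector in span(ritz)
        samples.append(v / np.linalg.norm(v))     # project onto S^{n-1}
    return np.stack(samples)
```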
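For reference, `scipy.sparse.linalg.spsolve_triangular` (the SciPy routine named in the Software Dependencies row; available since SciPy 0.19, though the paper does not pin a version) solves a sparse triangular system by substitution:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import spsolve_triangular

# Solve L y = b for a sparse lower-triangular L, e.g. when applying
# a triangular preconditioner factor.
L = csr_matrix(np.array([[2.0, 0.0, 0.0],
                         [1.0, 3.0, 0.0],
                         [0.0, 1.0, 4.0]]))
b = np.array([2.0, 5.0, 9.0])
y = spsolve_triangular(L, b, lower=True)
print(y)  # forward substitution: [1., 4/3, 23/12]
```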
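The quoted experiment setup (TensorFlow, Adam with starting learning rate 0.0001) maps to a configuration along these lines; the two-layer model and the mean-squared-error loss are placeholders, not the paper's network or training objective:

```python
import tensorflow as tf

# Placeholder model; the paper's convolutional architecture is not
# reproduced here.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64),
])

# Adam (Kingma & Ba, 2015) with the quoted starting learning rate.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="mse")  # placeholder loss
```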