Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Neural incomplete factorization: learning preconditioners for the conjugate gradient method

Authors: Paul Häusner, Ozan Öktem, Jens Sjölund

TMLR 2024

Each entry below gives a reproducibility variable, the result assigned by the classifier, and the supporting LLM response quoted from the paper.
Research Type: Experimental
LLM response: "We evaluate our proposed method on both synthetic problem instances and on problems arising from the discretization of the Poisson equation on varying domains. Our experiments show that by using data-driven preconditioners within the conjugate gradient method we are able to speed up the convergence of the iterative procedure."
Researcher Affiliation: Academia
LLM response: "Paul Häusner, Department of Information Technology, Uppsala University, Sweden; Ozan Öktem, Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden; Jens Sjölund, Department of Information Technology, Uppsala University, Sweden"
Pseudocode: Yes
LLM response: "Algorithm 1: Preconditioned conjugate gradient method (Nocedal & Wright, 1999) ... Algorithm 2: Pseudo-code for Neural IF preconditioner."
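For context, the Algorithm 1 cited above is the textbook preconditioned conjugate gradient method of Nocedal & Wright (1999). A minimal NumPy sketch follows; a plain Jacobi (diagonal) preconditioner stands in for the paper's learned Neural IF preconditioner, which is not reproduced here.

```python
import numpy as np

def pcg(A, b, M_solve, tol=1e-8, max_iter=1000):
    """Preconditioned conjugate gradient for SPD A (Nocedal & Wright, 1999).

    M_solve is a callable applying the preconditioner, i.e. z = M^{-1} r.
    Returns the approximate solution and the number of iterations taken.
    """
    x = np.zeros_like(b)
    r = b - A @ x              # initial residual
    z = M_solve(r)             # preconditioned residual
    p = z.copy()               # initial search direction
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)  # step length
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_solve(r)
        rz_new = r @ z
        beta = rz_new / rz     # Fletcher-Reeves-style update of the direction
        p = z + beta * p
        rz = rz_new
    return x, k + 1

# Small SPD test problem; Jacobi preconditioner as a stand-in.
n = 100
A = np.diag(np.arange(1, n + 1, dtype=float))
A[0, 1] = A[1, 0] = 0.5  # slight off-diagonal coupling, still SPD
b = np.ones(n)
x, iters = pcg(A, b, M_solve=lambda r: r / np.diag(A))
```

With a good preconditioner the preconditioned system's eigenvalues cluster near one, so the iteration count drops sharply; that is exactly the effect the learned preconditioner targets.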
Open Source Code: Yes
LLM response: "Our experiments show that by using data-driven preconditioners within the conjugate gradient method we are able to speed up the convergence of the iterative procedure. The code is available at https://github.com/paulhausner/neural-incomplete-factorization."
Open Datasets: No
LLM response: "We consider two different datasets in our experiments. The first dataset consists of synthetically generated problems where we can easily control the size and sparsity of the generated problem instances. The other class is motivated by problems arising in scientific computing by discretizing the Poisson PDE on varying grids using the finite element method. The details for the dataset generation can be found in Appendix A."
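As background for the synthetic row above: one common way to generate sparse symmetric positive definite test problems with controllable size and sparsity is A = M Mᵀ + αI. This is only an illustrative sketch, not the paper's generation procedure, which is specified in its Appendix A.

```python
import numpy as np

def synthetic_spd(n=300, density=0.01, alpha=1.0, seed=0):
    """Illustrative generator of a sparse-ish SPD matrix (NOT the paper's recipe).

    M has roughly `density` fraction of non-zero entries; A = M M^T + alpha*I
    is symmetric positive definite by construction (eigenvalues >= alpha).
    """
    rng = np.random.default_rng(seed)
    mask = rng.random((n, n)) < density        # controls sparsity of M
    M = rng.standard_normal((n, n)) * mask
    return M @ M.T + alpha * np.eye(n)

A = synthetic_spd()
```

Varying `n` and `density` gives direct control over matrix size and sparsity, which is the property the quoted text highlights for the synthetic dataset.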
Dataset Splits: Yes
LLM response: "The datasets are summarized in Table 3. For the graph representation used in the learned preconditioner, the matrix size corresponds to the number of nodes in the graph and the number of non-zero elements corresponds to the number of edges connecting the nodes. In the following, the details for the problem generation are explained for each of the datasets."

Table 3: Summary of the datasets used, with additional statistics on matrix size and number of non-zero elements. Samples refers to the number of generated problems in the train, validation and test sets respectively.

Dataset        Samples (train/val/test)   Matrix size         Non-zero elements (nnz)   Sparsity
Synthetic      1 000 / 10 / 100           10 000              1 000 000                 99%
Poisson train  750 / 15 / 300             20 000 – 150 000    500 000 – 800 000         > 99.9%
Poisson test   – / – / 300                100 000 – 500 000   500 000 – 3 000 000       > 99.9%
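The sparsity column of Table 3 follows directly from the matrix size n and the non-zero count: sparsity = 1 − nnz/n². A quick check against the table's rows (per the quoted text, n is also the node count and nnz the edge count of the graph representation):

```python
def sparsity(n, nnz):
    """Fraction of zero entries in an n-by-n matrix with nnz non-zero entries.

    In the graph view used by the learned preconditioner, n is the number of
    nodes and nnz the number of edges.
    """
    return 1.0 - nnz / (n * n)

# Synthetic row of Table 3: n = 10 000, nnz = 1 000 000 -> 99% sparse.
print(f"{sparsity(10_000, 1_000_000):.2%}")
# Smallest Poisson test problem: n = 100 000, nnz = 500 000 -> > 99.9% sparse.
print(f"{sparsity(100_000, 500_000):.4%}")
```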
Hardware Specification: Yes
LLM response: "For numerical experiments we use a single NVIDIA Titan Xp with 12 GB memory. For baseline preconditioners, which are not able to be accelerated directly using GPUs, we use 6 Intel Core i7-6850K @ 3.60 GHz processors for the computations."
Software Dependencies: No
LLM response: "The paper mentions several software components like PyTorch (Paszke et al., 2019), PyTorch Geometric (Fey & Lenssen, 2019), numml (Nytko et al., 2022), ILU++ (Mayer, 2007; Hofreither, 2020), and scikit-fem (Gustafsson & McBain, 2020). However, specific version numbers for these software components as used in the experiments are not provided in the text."
Experiment Setup: Yes
LLM response: "Both of our models as well as the data-driven Neural PCG baseline are trained for a total of 50 epochs. However, convergence can usually be observed significantly earlier. For the synthetic dataset we use a batch size of 5, while for the problems arising from the PDE discretization we only use a batch size of 1 due to resource constraints. This leads to a total training time of 40 minutes for the synthetic dataset and 55 minutes for the PDE dataset for each model. For all training schemes we utilize early stopping based on the validation set performance as an additional regularization measure. ... The Adam optimizer with initial learning rate 0.001 is used. Due to the small batch size and the loss landscape, we utilize gradient clipping to restrict the length of the allowed update steps and reduce the variance during stochastic gradient descent."
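The quoted setup combines Adam (learning rate 0.001), gradient clipping, and early stopping on validation performance. The early-stopping bookkeeping can be sketched framework-agnostically; the patience value below is hypothetical, since the quoted text does not state one.

```python
class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs.

    The patience value is illustrative; the paper excerpt does not report one.
    """
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")   # best validation loss seen so far
        self.bad_epochs = 0        # consecutive epochs without improvement

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Example: loss improves for three epochs, then stagnates.
stopper = EarlyStopping(patience=3)
val_losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73]
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In a PyTorch training loop this check would run once per epoch after validation, alongside the gradient clipping applied at each optimizer step.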