Transformer Meets Boundary Value Inverse Problems
Authors: Ruchi Guo, Shuhao Cao, Long Chen
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present some experimental results to show the quality of the reconstruction. The benchmark contains samples of inclusions of random ellipses (targets), and the input data has a single channel (L = 1) of the 2D harmonic extension feature from the 1D boundary measurements. The training uses the 1cycle schedule and mini-batch ADAM for 50 epochs. The evaluated model is taken from the epoch with the best validation metric on a reserved subset. Several baseline models are compared: the CNN-based U-Nets (Ronneberger et al., 2015; Guo & Jiang, 2020); the state-of-the-art operator learner Fourier Neural Operator (FNO) (Li et al., 2021a) and its variant with a token-mixing layer (Guibas et al., 2022); and the Multiwavelet Neural Operator (MWO) (Gupta et al., 2021). The Transformer model of interest is a drop-in replacement of the baseline U-Net and is named the U-Integral Transformer (UIT). UIT uses the kernel-integral-inspired attention (11), and we also compare UIT with the linear attention-based Hybrid U-Transformer in Gao et al. (2021), as well as a Hadamard product-based cross-attention U-Transformer in Wang et al. (2022). An ablation study is also conducted by replacing the convolution layers in the U-Net with attention (11) on the coarsest level. For more details on the hyperparameter setup for data generation, training, evaluation, and network architectures, please refer to Section 3.1, Appendix C.1, and Appendix C.2. |
| Researcher Affiliation | Academia | Ruchi Guo, Department of Mathematics, University of California, Irvine; Shuhao Cao, Division of Computing, Analytics, and Mathematics, School of Science and Engineering, University of Missouri-Kansas City; Long Chen, Department of Mathematics, University of California, Irvine |
| Pseudocode | No | The paper describes methods in text and mathematical formulas but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Additionally, we provide the PyTorch (Paszke et al., 2019) code for reproducing our results at https://github.com/scaomath/eit-transformer. |
| Open Datasets | Yes | The dataset used in this paper is available at https://www.kaggle.com/datasets/scaomath/eletrical-impedance-tomography-dataset. |
| Dataset Splits | Yes | There are 10800 samples in the training set, from which 20% are reserved as validation. There are 2000 samples in the testing set for evaluation. |
| Hardware Specification | Yes | All models are trained on an RTX 3090 or an A4000. |
| Software Dependencies | No | Additionally, we provide the PyTorch (Paszke et al., 2019) code for reproducing our results at https://github.com/scaomath/eit-transformer. A specific version number for PyTorch or any other software dependency is not explicitly stated. |
| Experiment Setup | Yes | The training uses the 1cycle (Smith & Topin, 2019) learning rate strategy with a warm-up phase. Mini-batch ADAM iterations are run for 50 epochs with no extra regularization, such as weight decay. The evaluated model is taken from the epoch that has the best validation metric. The learning rate starts and ends at 10^-3 · lr_max and reaches its maximum lr_max at the end of the 10th epoch, with lr_max = 10^-3. |
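
The Dataset Splits row fixes the partition sizes (10800 training samples with 20% reserved for validation, plus 2000 test samples), which corresponds to an 8640/2160/2000 split. The sketch below reproduces that partition in PyTorch; the tensor shapes, spatial resolution, and fixed seed are assumptions for illustration, not details taken from the paper or the released dataset.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical placeholder tensors: single-channel 2D inputs (L = 1) and targets.
# The actual resolution and file layout are defined by the released Kaggle dataset.
full_train = TensorDataset(torch.randn(10800, 1, 64, 64), torch.randn(10800, 1, 64, 64))
test_set   = TensorDataset(torch.randn(2000, 1, 64, 64),  torch.randn(2000, 1, 64, 64))

n_val   = int(0.2 * len(full_train))   # 2160 samples reserved for validation
n_train = len(full_train) - n_val      # 8640 samples used for training
train_set, val_set = random_split(
    full_train, [n_train, n_val],
    generator=torch.Generator().manual_seed(0),  # seed choice is an assumption
)
```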
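The Experiment Setup row pins down the optimizer and schedule fairly precisely: mini-batch ADAM, 50 epochs, no weight decay, a 1cycle schedule with a 10-epoch warm-up, lr_max = 10^-3, and start/end learning rate 10^-3 · lr_max. A minimal sketch of one way to express this with PyTorch's built-in OneCycleLR follows; the div_factor and final_div_factor choices are inferred to match the stated start/end values, and the released code at https://github.com/scaomath/eit-transformer remains authoritative for the exact schedule.

```python
import torch

def make_optimizer_and_scheduler(model, steps_per_epoch, epochs=50, lr_max=1e-3):
    # ADAM with no extra regularization (no weight decay), as stated in the paper.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr_max, weight_decay=0.0)
    # 1cycle schedule: warm up to lr_max over the first 10 of 50 epochs,
    # starting and ending at 1e-3 * lr_max.
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer,
        max_lr=lr_max,
        epochs=epochs,
        steps_per_epoch=steps_per_epoch,
        pct_start=10 / epochs,     # peak reached at the end of the 10th epoch
        div_factor=1e3,            # initial lr = lr_max / 1e3 = 1e-3 * lr_max
        final_div_factor=1.0,      # final lr = initial lr = 1e-3 * lr_max
    )
    return optimizer, scheduler
```

With this configuration, `scheduler.step()` is called once per mini-batch; `steps_per_epoch` depends on the batch size, which is not quoted in the table above.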
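Both the Research Type and Experiment Setup rows state that the evaluated model is the checkpoint from the epoch with the best validation metric. A small sketch of that selection rule is given below; `train_one_epoch` and `evaluate` are hypothetical helpers, and "lower is better" is an assumption about the validation metric.

```python
import copy

def fit(model, train_one_epoch, evaluate, epochs=50):
    best_metric, best_state = float("inf"), None
    for _ in range(epochs):
        train_one_epoch(model)
        metric = evaluate(model)            # metric on the reserved validation subset
        if metric < best_metric:            # assumes lower is better (e.g. a relative error)
            best_metric = metric
            best_state = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)       # evaluated model = best-validation checkpoint
    return model, best_metric
```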