Transformed Distribution Matching for Missing Value Imputation
Authors: He Zhao, Ke Sun, Amir Dezfouli, Edwin V. Bonilla
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments over a large number of datasets and competing benchmark algorithms show that our method achieves state-of-the-art performance. |
| Researcher Affiliation | Academia | CSIRO's Data61, Australia. Correspondence to: He Zhao <he.zhao@ieee.org>. |
| Pseudocode | Yes | Algorithm 1: TDM. Learnable parameters include the missing values X[M] and the parameters θ of f. Input: data X with missing values indicated by M. Output: X with X[M] imputed, and fθ. Initialise θ of f; initialise the missing values with a noisy mean, X[M] ← nanmean(X, dim=0) + N(0, 0.1). While not converged: sample two batches of B data samples, X1 and X2; feed X1 and X2 to fθ; for i = 1…B, j = 1…B, compute the quadratic cost G[i, j]; compute L_W; update the missing values X_{1,2}[M_{1,2}] and θ with a gradient step. (A runnable sketch of this loop appears below the table.) |
| Open Source Code | Yes | Code at https://github.com/hezgit/TDM |
| Open Datasets | Yes | UCI datasets with different sizes are used in the experiments, the statistics of which are shown in Table 1 of the appendix. |
| Dataset Splits | Yes | We report the average accuracy of 5-fold cross-validation. |
| Hardware Specification | No | The paper mentions 'computing environment' when discussing running time but does not specify any particular hardware (e.g., GPU, CPU models, or memory) used for experiments. |
| Software Dependencies | No | The paper mentions 'Python/Numpy style matrix indexing', 'sklearn', and 'POT package' but does not specify version numbers for any of these software dependencies. |
| Experiment Setup | Yes | To minimise the loss in Eq. (8), we use RMSprop (Tieleman et al., 2012) as the optimiser with learning rate of 10^-2 and batch size of 512. ... The first one is the number of INN blocks T. ... We empirically find that T = 3 and K = 2 work well in practice ... We train our method for 10,000 iterations and report the performance based on the last iteration, which is the same for all the OT-based methods. (These settings are echoed in the usage example below.) |
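
To make the quoted pseudocode concrete, here is a minimal PyTorch sketch of Algorithm 1's training loop. All names (`nanmean_init`, `CouplingBlock`, `quadratic_cost`, `sinkhorn_loss`, `tdm_impute`) are illustrative, the RealNVP-style coupling block is a stand-in for the paper's INN blocks, and the Sinkhorn regularisation and iteration counts are assumptions. This is a sketch of the published description, not the authors' implementation (which is available at the GitHub link above).

```python
# Minimal sketch of Algorithm 1 (TDM), assuming PyTorch. Illustrative only;
# the coupling block stands in for the paper's INN blocks.
import math
import torch

def nanmean_init(X, noise_std=0.1):
    """Initialise missing entries with the column-wise nanmean plus N(0, 0.1) noise."""
    M = torch.isnan(X)                                 # missingness mask
    col_mean = torch.nanmean(X, dim=0)                 # mean over observed entries
    X = torch.where(M, col_mean.expand_as(X), X)
    return X + M.float() * noise_std * torch.randn_like(X), M

class CouplingBlock(torch.nn.Module):
    """RealNVP-style affine coupling block: a stand-in for one INN block."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.d = dim // 2
        self.net = torch.nn.Sequential(
            torch.nn.Linear(self.d, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 2 * (dim - self.d)))
    def forward(self, x):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        s, t = self.net(x1).chunk(2, dim=1)
        # Swap halves so that stacked blocks transform both parts.
        return torch.cat([x2 * torch.exp(torch.tanh(s)) + t, x1], dim=1)

def quadratic_cost(Z1, Z2):
    """G[i, j] = ||Z1[i] - Z2[j]||^2 between the two transformed batches."""
    return torch.cdist(Z1, Z2, p=2) ** 2

def sinkhorn_loss(G, reg=0.05, n_iters=100):
    """Entropic OT between two uniform batches (log-domain Sinkhorn);
    an approximation of the Wasserstein loss L_W in the pseudocode."""
    B1, B2 = G.shape
    log_a = torch.full((B1,), -math.log(B1))           # uniform weights 1/B1
    log_b = torch.full((B2,), -math.log(B2))           # uniform weights 1/B2
    f, g = torch.zeros(B1), torch.zeros(B2)
    for _ in range(n_iters):
        f = -reg * torch.logsumexp((g[None, :] - G) / reg + log_b[None, :], dim=1)
        g = -reg * torch.logsumexp((f[:, None] - G) / reg + log_a[:, None], dim=0)
    return (log_a.exp() * f).sum() + (log_b.exp() * g).sum()  # dual objective

def tdm_impute(X_missing, T=3, n_iters=10_000, batch_size=512, lr=1e-2):
    X, M = nanmean_init(X_missing.clone())
    X_imp = X[M].clone().requires_grad_(True)          # missing values are learnable
    f = torch.nn.Sequential(*[CouplingBlock(X.shape[1]) for _ in range(T)])
    opt = torch.optim.RMSprop([X_imp, *f.parameters()], lr=lr)
    n = X.shape[0]
    for _ in range(n_iters):
        X_full = X.clone()
        X_full[M] = X_imp                              # plug learnable values into the data
        i1 = torch.randint(n, (batch_size,))           # two random batches X1, X2
        i2 = torch.randint(n, (batch_size,))
        Z1, Z2 = f(X_full[i1]), f(X_full[i2])          # match distributions after fθ
        loss = sinkhorn_loss(quadratic_cost(Z1, Z2))
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        X[M] = X_imp
    return X, f
```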
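
A hedged usage example with the settings quoted in the Experiment Setup row (T = 3, RMSprop with learning rate 10^-2, batch size 512, 10,000 iterations); the synthetic data and the reduced iteration count are for demonstration only. Note that the paper's K = 2 is a separate INN hyperparameter that the stand-in coupling block above does not model.

```python
# Hypothetical quick run with the reported settings; synthetic data and a
# reduced iteration count are used purely for demonstration.
torch.manual_seed(0)
X_raw = torch.randn(1000, 8)
X_raw[torch.rand_like(X_raw) < 0.3] = float('nan')    # 30% values missing at random
X_imputed, f_theta = tdm_impute(X_raw, T=3, n_iters=1000, batch_size=512, lr=1e-2)
print(torch.isnan(X_imputed).any())                   # tensor(False): all entries imputed
```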