Diffeomorphic Information Neural Estimation
Authors: Bao Duong, Thin Nguyen
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical experiments show that the proposed DINE estimator consistently outperforms the state of the art in both the MI and CMI estimation tasks. We also apply DINE to test for conditional independence (CI), an important statistical problem where the presence of the conditioning variable is a major obstacle; the empirical results indicate that DINE's distinctively accurate CMI estimation allows for a high-accuracy test (a hedged sketch of such a test follows the table). The empirical evaluations show that DINE consistently outperforms competitors in all tasks and adapts very well to complex and high-dimensional relationships. |
| Researcher Affiliation | Academia | Bao Duong, Thin Nguyen; Applied Artificial Intelligence Institute, Deakin University, Australia; {duongng,thin.nguyen}@deakin.edu.au |
| Pseudocode | No | The paper describes the proposed DINE framework and its components using mathematical formulations and conceptual descriptions. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | Yes | Source code and relevant data sets are available at https://github.com/baosws/DINE |
| Open Datasets | No | The paper uses 'Synthetic Data' for its experiments, stating: 'For each independent simulation, we first generate two jointly multivariate Gaussian variables X, Y with the same dimensions d_X = d_Y = d and shared component-wise correlation, i.e.,...'. Since the data is generated anew for each experiment rather than drawn from a pre-existing, publicly available dataset, there is no concrete access information to report (a hedged data-generation sketch follows the table). |
| Dataset Splits | No | The paper describes sample sizes (e.g., 'low sample size n = 200 and the large sample size n = 1000') used for experiments and mentions 'n i.i.d. samples'. However, it does not specify explicit train, validation, or test splits, percentages, or methodology for partitioning the data. The data is generated and then used for evaluation without explicit splitting for training, validation, and testing purposes. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. No GPU, CPU models, memory, or cloud instance types are mentioned. |
| Software Dependencies | No | The paper mentions general concepts like 'neural networks' and 'normalizing flows' and refers to the adoption of implementations for baselines (e.g., MINE, MIND, CCMI, KSG, KCIT, CCIT) from their respective authors' repositories. However, it does not list any specific software dependencies (e.g., Python, PyTorch, TensorFlow) with version numbers for their own DINE implementation. |
| Experiment Setup | No | The paper describes the synthetic data generation process and the parameters varied (e.g., correlation ρ, sample size n, dimensionality d). It also states: 'Implementation details and parameter selection of all methods are given in the Supplementary Material (Duong and Nguyen 2022b).' However, specific hyperparameters (such as learning rate, batch size, number of epochs, or optimizer settings) for training the neural networks in DINE are not provided in the main text of the paper. |