Diffeomorphic Information Neural Estimation

Authors: Bao Duong, Thin Nguyen

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our numerical experiments show that the proposed DINE estimator consistently outperforms the state of the art in both the MI and CMI estimation tasks. We also apply DINE to test for conditional independence (CI), an important statistical problem where the presence of the conditioning variable is a major obstacle; the empirical results indicate that DINE's distinctively accurate CMI estimation allows for a high-accuracy test. The empirical evaluations show that DINE consistently outperforms competitors in all tasks and adapts very well to complex and high-dimensional relationships.
Researcher Affiliation | Academia | Bao Duong and Thin Nguyen, Applied Artificial Intelligence Institute, Deakin University, Australia ({duongng,thin.nguyen}@deakin.edu.au)
Pseudocode | No | The paper describes the proposed DINE framework and its components using mathematical formulations and conceptual descriptions. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. (An illustrative sketch of the underlying idea appears after this table.)
Open Source Code | Yes | Source code and relevant datasets are available at https://github.com/baosws/DINE
Open Datasets | No | The paper uses synthetic data for its experiments, stating: 'For each independent simulation, we first generate two jointly multivariate Gaussian variables X, Y with same dimensions d_X = d_Y = d and shared component-wise correlation, i.e., ...'. Since the data is generated for the purpose of the experiment, it is not a pre-existing, publicly available dataset with concrete access information. (A generation sketch with the closed-form ground-truth MI appears after this table.)
Dataset Splits | No | The paper reports sample sizes (e.g., 'low sample size n = 200 and the large sample size n = 1000') and mentions 'n i.i.d. samples'. However, it does not specify explicit train, validation, or test splits, percentages, or a methodology for partitioning the data: the data is generated and then used for evaluation without explicit splitting.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. No GPU or CPU models, memory sizes, or cloud instance types are mentioned.
Software Dependencies | No | The paper mentions general concepts like 'neural networks' and 'normalizing flows' and notes that the baseline implementations (e.g., MINE, MIND, CCMI, KSG, KCIT, CCIT) were adopted from their respective authors' repositories. However, it does not list any specific software dependencies (e.g., Python, PyTorch, TensorFlow) with version numbers for the authors' own DINE implementation.
Experiment Setup | No | The paper describes the synthetic data generation process and the parameters varied (e.g., correlation ρ, sample size n, dimensionality d). It also states: 'Implementation details and parameters selection of all methods are given in the Supplementary Material (Duong and Nguyen 2022b).' However, specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) for training DINE's neural networks are not provided in the main text.
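For reference, the synthetic design quoted under Open Datasets has a closed-form ground truth: for jointly Gaussian X, Y with shared component-wise correlation ρ, I(X; Y) = -(d/2) ln(1 - ρ²). The sketch below reproduces that generation process in plain NumPy; the function name and parameter defaults are ours, not from the paper.

```python
import numpy as np

def generate_correlated_gaussians(n, d, rho, seed=0):
    """Draw jointly Gaussian X, Y with d_X = d_Y = d and a shared
    component-wise correlation rho: each pair (X_j, Y_j) is bivariate
    normal with correlation rho, independent across components j."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    z = rng.standard_normal((n, d, 2)) @ L.T  # correlate each (X_j, Y_j) pair
    return z[..., 0], z[..., 1]  # X, Y, each of shape (n, d)

# Closed-form ground truth for this design: I(X; Y) = -(d / 2) * ln(1 - rho^2).
n, d, rho = 1000, 5, 0.7
X, Y = generate_correlated_gaussians(n, d, rho)
print(-d / 2 * np.log(1 - rho**2))  # ≈ 1.683 nats
```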
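Separately, a minimal illustration of the idea the Pseudocode row alludes to: mutual information is invariant under diffeomorphisms, so one can transform each marginal toward a Gaussian and read MI off the resulting covariances. The sketch below substitutes a training-free, rank-based Gaussianization for the paper's learned normalizing flows; gaussianize and gaussian_mi are illustrative helpers of ours, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm, rankdata

def gaussianize(x):
    """Map each marginal to an approximately standard normal through its
    empirical CDF. This rank transform is a crude, training-free stand-in
    for DINE's learned normalizing flows (our assumption for illustration)."""
    u = rankdata(x, axis=0) / (x.shape[0] + 1)  # ranks rescaled into (0, 1)
    return norm.ppf(u)

def gaussian_mi(x, y):
    """Closed-form MI for jointly Gaussian vectors:
    I(X; Y) = 0.5 * (log|Cov X| + log|Cov Y| - log|Cov [X, Y]|)."""
    logdet = lambda s: np.linalg.slogdet(np.atleast_2d(s))[1]
    return 0.5 * (logdet(np.cov(x, rowvar=False))
                  + logdet(np.cov(y, rowvar=False))
                  - logdet(np.cov(np.hstack([x, y]), rowvar=False)))

# MI is invariant under component-wise diffeomorphisms, so distorting
# Gaussian data with exp and sinh leaves the true MI at -0.5 * ln(1 - 0.7^2).
rng = np.random.default_rng(1)
L = np.linalg.cholesky(np.array([[1.0, 0.7], [0.7, 1.0]]))
xy = rng.standard_normal((2000, 2)) @ L.T
x, y = np.exp(xy[:, :1]), np.sinh(xy[:, 1:])
print(gaussian_mi(gaussianize(x), gaussianize(y)))  # ≈ 0.34 nats
```

On data with an exactly Gaussian copula, as above, the rank transform suffices; the learned flows are what let the approach adapt to the complex, high-dimensional relationships the paper targets.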