Mutual Information Estimation via $f$-Divergence and Data Derangements
Authors: Nunzio Alexandro Letizia, Nicola Novello, Andrea M. Tonello
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The comparison with state-of-the-art neural estimators, through extensive experimentation within established reference scenarios, shows that our approach offers higher accuracy and lower complexity. Section 6, Experimental Results: In this section, we firstly describe the architectures of the proposed estimators. Then, we outline the data used to estimate the MI, comment on the performance of the discussed estimators in different scenarios, also analyzing their computational complexity. Finally, we present the outcomes of the self-consistency tests [20] over image datasets. |
| Researcher Affiliation | Academia | Nunzio A. Letizia, Nicola Novello, Andrea M. Tonello; University of Klagenfurt; {nunzio.letizia,nicola.novello,andrea.tonello}@aau.at |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. Methods are described textually or mathematically. |
| Open Source Code | Yes | Our implementation can be found at https://github.com/tonellolab/fDIME |
| Open Datasets | Yes | We use the images collected in the MNIST [33] and Fashion MNIST [34] datasets. In the first setting (called Gaussian), a multidimensional Gaussian distribution is sampled to obtain x and n samples, independently. In the second setting (referred to as cubic), the nonlinear transformation $y \mapsto y^3$ is applied to the Gaussian samples (see the data-generation sketch after the table). |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits, but rather describes how data is generated (e.g., Gaussian, cubic) or refers to standard datasets like MNIST/Fashion MNIST where splits are common knowledge but not explicitly stated here. |
| Hardware Specification | Yes | A fundamental characteristic of each algorithm is the computational time. The computational time analysis is developed on a server with CPU AMD Ryzen Threadripper 3960X 24-Core Processor and GPU MSI GeForce RTX 3090 Gaming X Trio 24G, 24GB GDDR6X. |
| Software Dependencies | Yes | We implemented a PyTorch [31] version of the code produced by the authors of [24], to unify NJEE with all the other MI estimators. Each neural estimator is trained using the Adam optimizer [32], with learning rate 5·10⁻⁴, β1 = 0.9, β2 = 0.999. |
| Experiment Setup | Yes | Each neural network is trained for 4k iterations for each stair step, with a batch size of 64 samples (N = 64). Each neural estimator is trained using the Adam optimizer [32], with learning rate 5·10⁻⁴, β1 = 0.9, β2 = 0.999. The batch size is initially set to N = 64 (see the training-configuration sketch after the table). |
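The Gaussian and cubic settings quoted above correspond to the standard multivariate-Gaussian MI benchmark. The following is a minimal sketch of how such data can be generated, assuming the usual construction in which the true MI equals $-\frac{d}{2}\log(1-\rho^2)$; the function name `sample_batch` and its arguments are illustrative, not taken from the authors' code.

```python
import numpy as np

def sample_batch(n_samples, d, target_mi, cubic=False, rng=None):
    """Draw (x, y) pairs whose ground-truth mutual information is target_mi (in nats)."""
    rng = np.random.default_rng() if rng is None else rng
    # Per-dimension correlation rho chosen so that -d/2 * log(1 - rho^2) = target_mi.
    rho = np.sqrt(1.0 - np.exp(-2.0 * target_mi / d))
    x = rng.standard_normal((n_samples, d))      # x ~ N(0, I_d)
    n = rng.standard_normal((n_samples, d))      # independent noise samples
    y = rho * x + np.sqrt(1.0 - rho ** 2) * n    # "Gaussian" setting
    if cubic:
        y = y ** 3                               # "cubic" setting: y -> y^3 leaves the MI unchanged
    return x, y

x, y = sample_batch(n_samples=64, d=5, target_mi=2.0, cubic=True)
```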
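The reported training configuration (Adam with learning rate 5·10⁻⁴, β1 = 0.9, β2 = 0.999, batch size N = 64, 4k iterations per stair step) can be sketched as below. The network architecture, the KL-type discriminative objective, and the cyclic-shift derangement are illustrative stand-ins, not the authors' f-DIME implementation.

```python
import math
import torch
from torch import nn, optim

d, batch_size, iters_per_step, target_mi = 5, 64, 4000, 2.0
rho = math.sqrt(1.0 - math.exp(-2.0 * target_mi / d))  # same construction as the sketch above

# Illustrative discriminator; the paper's architectures differ.
net = nn.Sequential(
    nn.Linear(2 * d, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1), nn.Softplus(),  # keeps the density-ratio estimate positive
)
opt = optim.Adam(net.parameters(), lr=5e-4, betas=(0.9, 0.999))  # reported hyperparameters

for _ in range(iters_per_step):
    x = torch.randn(batch_size, d)
    y = rho * x + math.sqrt(1.0 - rho ** 2) * torch.randn(batch_size, d)
    # Cyclic shift of y: a simple derangement, so no sample is paired with itself.
    y_deranged = torch.roll(y, shifts=1, dims=0)
    d_joint = net(torch.cat([x, y], dim=1))
    d_marg = net(torch.cat([x, y_deranged], dim=1))
    # KL-type discriminative objective: its maximizer approximates p(x, y) / (p(x) p(y)).
    loss = -(torch.log(d_joint + 1e-8).mean() - d_marg.mean())
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    mi_estimate = torch.log(net(torch.cat([x, y], dim=1)) + 1e-8).mean().item()
print(f"estimated MI ~ {mi_estimate:.2f} nats (ground truth {target_mi})")
```

A cyclic shift of the batch is only one particular derangement (a permutation with no fixed points); the paper's contribution is to sample derangements instead of arbitrary random permutations, which can pair a sample with itself.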