Telescoping Density-Ratio Estimation

Authors: Benjamin Rhodes, Kai Xu, Michael U. Gutmann

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our experiments demonstrate that TRE can yield substantial improvements over existing single-ratio methods for mutual information estimation, representation learning and energy-based modelling. We empirically demonstrate that TRE can accurately estimate density-ratios using deep neural networks on high-dimensional problems, significantly outperforming existing single-ratio methods." |
| Researcher Affiliation | Academia | Benjamin Rhodes, School of Informatics, University of Edinburgh (ben.rhodes@ed.ac.uk); Kai Xu, School of Informatics, University of Edinburgh (kai.xu@ed.ac.uk); Michael U. Gutmann, School of Informatics, University of Edinburgh (michael.gutmann@ed.ac.uk) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the described methodology, such as a repository link or an explicit code-release statement. |
| Open Datasets | Yes | "We applied TRE to the SpatialMultiOmniglot problem taken from [49] and learning energy-based models of the MNIST handwritten digit dataset [37]." |
| Dataset Splits | No | The paper does not provide the dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning. While it mentions training, explicit split details are missing. |
| Hardware Specification | No | The paper does not specify the hardware (exact GPU/CPU models, processor speeds, memory amounts) used to run its experiments. |
| Software Dependencies | No | The paper names software components and models it uses (e.g., ResNets, RQ-NSF) but does not list versioned dependencies (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiments. |
| Experiment Setup | Yes | "For the linear combination mechanism, we collapse the α_k into a single spacing hyperparameter, and grid-search over this value, along with the number of waymarks. Details are in the appendix. Each bridge in TRE uses a separable architecture [52] given by log r_k(u, v) = g(u)^T W_k f_k(v), where g and f_k are 14-layer convolutional ResNets [24] and f_k uses the parameter-sharing scheme described in Section 3.2." "We use the parameter sharing scheme from Section 3.2 together with quadratic heads. This gives log r_k(x) = −f_k(x)^T W_k f_k(x) + f_k(x)^T b_k + c_k, where we set f_k to be an 18-layer convolutional ResNet and constrain W_k to be positive definite." |
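The bridge parameterisations quoted in the Experiment Setup row, together with the linear-combination waymark mechanism, can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: all function names are ours, the feature extractors g and f_k (deep ResNets in the paper) are abstracted away as precomputed feature vectors, and the positive-definite constraint on W_k is imposed here via a Cholesky-style factor, which is one common choice.

```python
import numpy as np

def waymarks(x0, xm, alphas):
    """Linear-combination waymarks x_k = sqrt(1 - a_k^2) * x_0 + a_k * x_m,
    with 0 = a_0 < ... < a_m = 1; this interpolates from data x0 to noise xm
    and preserves variance when both have unit variance."""
    return [np.sqrt(1.0 - a**2) * x0 + a * xm for a in alphas]

def separable_log_ratio(g_u, f_v, W):
    """Separable bridge: log r_k(u, v) = g(u)^T W_k f_k(v),
    given feature vectors g_u = g(u) and f_v = f_k(v)."""
    return g_u @ W @ f_v

def quadratic_log_ratio(f_x, L, b, c):
    """Quadratic head: log r_k(x) = -f_k(x)^T W_k f_k(x) + f_k(x)^T b_k + c_k,
    with W_k = L L^T positive (semi-)definite by construction."""
    W = L @ L.T
    return -f_x @ W @ f_x + f_x @ b + c

def telescoped_log_ratio(per_bridge_log_ratios):
    """TRE combines the bridges telescopically: log r(x) = sum_k log r_k(x)."""
    return float(np.sum(per_bridge_log_ratios))
```

With the α_k collapsed into a single spacing hyperparameter, as in the quoted setup, one would grid-search over that spacing and over the number of waymarks, training one bridge classifier per consecutive waymark pair.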