A Theory of Usable Information under Computational Constraints

Authors: Yilun Xu, Shengjia Zhao, Jiaming Song, Russell Stewart, Stefano Ermon

ICLR 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Empirically, we demonstrate predictive V-information is more effective than mutual information for structure learning and fair representation learning." |
| Researcher Affiliation | Academia | Yilun Xu (CFCS, Peking University) xuyilun@pku.edu.cn; Shengjia Zhao (Stanford University) sjzhao@stanford.edu; Jiaming Song (Stanford University) tsong@cs.stanford.edu; Russell Stewart russell.sb.nebel@gmail.com; Stefano Ermon (Stanford University) ermon@cs.stanford.edu |
| Pseudocode | Yes | Algorithm 1: "Construct Chow-Liu Trees with V-Information" (a hedged sketch follows the table). |
| Open Source Code | No | The paper contains no explicit statement or link indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | "We evaluate V-information on the in-silico dataset from the DREAM5 challenge (Marbach et al., 2012)... Let X1, ..., X20 be random variables, each representing a frame in videos from the Moving-MNIST dataset... We use a function family V_j as the attacker to extract information from features trained with I_{V_i}(Z → U) minimization, where all the V's are neural nets. On three datasets commonly used in the fairness literature (Adult, German, Heritage)..." |
| Dataset Splits | No | The paper mentions the "amount of training data" and "different fractions of data used for estimation" but does not specify explicit dataset splits (e.g., percentages or counts) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide any details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions software components such as a "conditional PixelCNN++" and a "neural network architecture," but it does not provide version numbers for any software dependencies. |
| Experiment Setup | Yes | "We use V-information(Gaussian) and V-information(Logistic) to denote Algorithm 1 with two different V families... We select V = {f : f[x] = N(g(x), 1/2), ∀x ∈ X; f[∅] = N(µ, 1/2), µ ∈ range(g)}, where g is a third-order polynomial... V_B = {f : f[z] = softmax(g(z))}, where g is a two-layer MLP with ReLU as the activation function. V_C = {f : f[z] = softmax(g(z))}, where g is a three-layer MLP with Leaky ReLU as the activation function." (A hedged estimation sketch follows the table.) |