Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information

Authors: Karl Stratos, Sam Wiseman

ICML 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We apply our model on document hashing and show that it outperforms current best baselines based on discrete and vector quantized variational autoencoders." |
| Researcher Affiliation | Academia | "Rutgers University; Toyota Technological Institute at Chicago. Correspondence to: Karl Stratos <karlstratos@gmail.com>." |
| Pseudocode | Yes | "Algorithm 1 Cross Entropy" |
| Open Source Code | Yes | "Code: https://github.com/karlstratos/ammi" |
| Open Datasets | Yes | "raw document representation y is a high-dimensional TFIDF vector computed from preprocessed corpora (TMC, NG20, and Reuters) provided by Chaidaroon et al. (2018)." (see the TF-IDF sketch after this table) |
| Dataset Splits | Yes | "We construct such article pairs from the Who-Did-What dataset (Onishi et al., 2016). We remove all overlapping articles so that each article appears only once in the entire training/validation/test data containing 104404/8928/7326 document pairs." |
| Hardware Specification | Yes | "We use an NVIDIA Quadro RTX 6000 with 24GB memory." |
| Software Dependencies | No | The paper mentions using Adam for optimization but does not provide version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | "We find that an effective range of values is: initialization α = 0.1, batch size N ∈ {16, 32, 64, 128}, adversarial steps G ∈ {1, 2, 4}, adversarial learning rate η ∈ {0.03, 0.01, 0.003, 0.001}, learning rate η ∈ {0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001}, and entropy weight β ∈ {1, 1.5, 2, 2.5, 3, 3.5}." (see the search-space sketch after this table) |
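
The Open Datasets row states that each document is represented as a high-dimensional TF-IDF vector computed from preprocessed corpora. The snippet below is a minimal sketch of building such a representation with scikit-learn; the toy corpus, the `max_features` cap, and the tokenization settings are assumptions for illustration and do not reproduce the exact preprocessing of the corpora released by Chaidaroon et al. (2018).

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy documents standing in for TMC / NG20 / Reuters articles (illustrative only).
docs = [
    "stocks rallied after the quarterly earnings report",
    "the central bank raised interest rates again",
    "team wins the championship in an overtime thriller",
]

# max_features is an assumed vocabulary cap; the actual corpora come
# with their own fixed preprocessing from Chaidaroon et al. (2018).
vectorizer = TfidfVectorizer(max_features=10000, stop_words="english")
Y = vectorizer.fit_transform(docs)  # sparse matrix: one TF-IDF row per document
print(Y.shape)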
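
The Experiment Setup row lists the effective hyperparameter ranges reported in the paper. The sketch below simply encodes those ranges as a search space and draws one configuration at random; the variable names and the random-search strategy are assumptions for illustration, not details taken from the paper.

```python
import random

# Hypothetical search space mirroring the ranges quoted in the table above.
SEARCH_SPACE = {
    "alpha_init": [0.1],                               # initialization alpha
    "batch_size": [16, 32, 64, 128],                   # N
    "adv_steps": [1, 2, 4],                            # adversarial steps G
    "adv_lr": [0.03, 0.01, 0.003, 0.001],              # adversarial learning rate
    "lr": [0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001],  # learning rate
    "entropy_weight": [1, 1.5, 2, 2.5, 3, 3.5],        # beta
}

def sample_config(space, seed=None):
    """Draw one configuration uniformly at random from each range."""
    rng = random.Random(seed)
    return {name: rng.choice(values) for name, values in space.items()}

if __name__ == "__main__":
    print(sample_config(SEARCH_SPACE, seed=0))
```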