Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information
Authors: Karl Stratos, Sam Wiseman
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our model on document hashing and show that it outperforms current best baselines based on discrete and vector quantized variational autoencoders. |
| Researcher Affiliation | Academia | ¹Rutgers University, ²Toyota Technological Institute at Chicago. Correspondence to: Karl Stratos <karlstratos@gmail.com>. |
| Pseudocode | Yes | Algorithm 1 Cross Entropy |
| Open Source Code | Yes | Code: https://github.com/karlstratos/ammi |
| Open Datasets | Yes | raw document representation y is a high-dimensional TFIDF vector computed from preprocessed corpora (TMC, NG20, and Reuters) provided by Chaidaroon et al. (2018). (An illustrative TFIDF sketch follows the table.) |
| Dataset Splits | Yes | We construct such article pairs from the Who-Did-What dataset (Onishi et al., 2016). We remove all overlapping articles so that each article appears only once in the entire training/validation/test data containing 104404/8928/7326 document pairs. |
| Hardware Specification | Yes | We use an NVIDIA Quadro RTX 6000 with 24GB memory. |
| Software Dependencies | No | The paper mentions using 'Adam' for optimization but does not provide specific version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | We find that an effective range of values is: initialization α = 0.1, batch size N ∈ {16, 32, 64, 128}, adversarial step G ∈ {1, 2, 4}, adversarial learning rate η ∈ {0.03, 0.01, 0.003, 0.001}, learning rate η ∈ {0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001}, and entropy weight β ∈ {1, 1.5, 2, 2.5, 3, 3.5}. (This grid is restated as a configuration sketch below.) |
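For the Open Datasets row, the following is a minimal sketch of how a TFIDF document representation like y can be computed, assuming scikit-learn. The paper itself uses preprocessed corpora released by Chaidaroon et al. (2018), so this pipeline, the placeholder corpus, and the vocabulary cap are illustrative assumptions rather than the authors' setup.

```python
# Hypothetical TFIDF featurization; the corpus and max_features value are
# placeholders, not values from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["first placeholder document", "second placeholder document"]
vectorizer = TfidfVectorizer(max_features=10000)  # high-dimensional sparse features
y = vectorizer.fit_transform(docs)  # one TFIDF row vector per document
print(y.shape)
```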
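For the Experiment Setup row, here is a minimal sketch that restates the quoted tuning ranges as a random-search configuration sampler. The dictionary keys and the `sample_config` helper are hypothetical names, not identifiers from the released code; only the value ranges come from the paper.

```python
# Hypothetical encoding of the hyperparameter ranges quoted in the table.
import random

GRID = {
    "alpha_init": [0.1],                               # initialization α
    "batch_size": [16, 32, 64, 128],                   # N
    "adversarial_steps": [1, 2, 4],                    # G
    "adversarial_lr": [0.03, 0.01, 0.003, 0.001],      # adversarial η
    "lr": [0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001],  # η
    "entropy_weight": [1, 1.5, 2, 2.5, 3, 3.5],        # β
}

def sample_config(rng=random):
    """Draw one configuration uniformly at random from the grid."""
    return {name: rng.choice(values) for name, values in GRID.items()}

print(sample_config())
```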