Scaling Structured Inference with Randomization
Authors: Yao Fu, John Cunningham, Mirella Lapata
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments over different graphs demonstrate the accuracy and efficiency of our approach. |
| Researcher Affiliation | Academia | ¹School of Informatics, University of Edinburgh; ²Statistics Department, Columbia University; ³Zuckerman Institute, Columbia University. |
| Pseudocode | Yes | Algorithm 1 shows our randomized Forward algorithm for approximating the partition function of chain-structured graphs. Algorithm 2 shows our Randomized Inside algorithm for approximating the partition function of tree-structured hypergraphs. Algorithm 3 shows the randomized entropy DP. Algorithm 4 provides differentiable relaxed samples from HMMs and Linear-chain CRFs. (A hedged sketch of the randomized forward recursion follows the table.) |
| Open Source Code | Yes | Our implementation is at https://github.com/FranxYao/RDP. |
| Open Datasets | Yes | We follow Fu et al. (2020) and use the MSCOCO dataset and reuse their processed data for simplicity. |
| Dataset Splits | No | The paper mentions using the MSCOCO dataset but does not specify the train/validation/test splits in terms of percentages or counts, nor does it refer to predefined splits with specific citations within the text. |
| Hardware Specification | Yes | With N = 2,000, full DP gives memory overflow on a 16G GPU, so we only compare to the Top K approach. |
| Software Dependencies | No | The paper mentions compatibility with Torch-Struct (Rush, 2020) and the use of a pretrained GPT-2 (base size) and an LSTM as models, but does not provide specific version numbers for any software or libraries such as PyTorch, Python, or CUDA. |
| Experiment Setup | Yes | We set N, the number of states, to be 2,000 and 10,000. For all estimators, we set K2 = 1 and K1 = K − 1, and control K to be [1, 10, 20] percent of N. We use an LSTM with 256 dimensional hidden states for the generative model. We use K1 = K2 = 10%N. |
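The Pseudocode row above points to Algorithm 1, a randomized Forward algorithm that approximates the partition function of a chain-structured graph by propagating only K = K1 + K2 states per step: the K1 heaviest states are kept exactly, and K2 more are sampled from the remainder with importance reweighting. The sketch below illustrates that idea for an HMM. It is a minimal reconstruction from the descriptions quoted in this table, not the authors' Algorithm 1; `randomized_forward` and all parameter names are hypothetical.

```python
import numpy as np

def randomized_forward(init, trans, emit, obs, K1, K2, rng=None):
    """Estimate the HMM partition function Z = p(obs) while propagating
    only K1 + K2 forward states per step.

    A minimal sketch of the randomized-DP idea, not the authors' exact
    Algorithm 1: keep the K1 heaviest states exactly, sample K2 of the
    rest proportionally to their weight, and rescale the sampled entries
    so the running sum stays unbiased in expectation.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    N = init.shape[0]
    alpha = init * emit[:, obs[0]]               # forward weights at t = 0
    for t in range(1, len(obs)):
        order = np.argsort(alpha)[::-1]          # states by descending weight
        top, rest = order[:K1], order[K1:]
        sparse = np.zeros(N)
        sparse[top] = alpha[top]                 # exact part: top-K1 states
        rest_mass = alpha[rest].sum()
        if rest_mass > 0 and K2 > 0 and rest.size > 0:
            p = alpha[rest] / rest_mass
            picked = rng.choice(rest, size=K2, p=p)   # sampled with replacement
            # Importance weight alpha_i / (K2 * p_i) = rest_mass / K2 per draw;
            # np.add.at accumulates correctly if an index is drawn twice.
            np.add.at(sparse, picked, rest_mass / K2)
        alpha = (sparse @ trans) * emit[:, obs[t]]    # standard forward step
    return alpha.sum()
```

The tradeoff mirrors the one the paper reports: larger K1 makes the estimate closer to exact forward DP, while K2 sampled states keep the estimator unbiased over the mass outside the top-K1 set.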
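A hypothetical invocation matching the Experiment Setup row (N = 2,000 states, K set to 10% of N, K2 = 1 and K1 = K − 1). The toy HMM parameters here are random placeholders, not the paper's LSTM or GPT-2 models:

```python
rng = np.random.default_rng(0)
N, V, T = 2000, 50, 20                          # states, vocab size, length
K = N // 10                                     # K = 10% of N
init = rng.dirichlet(np.ones(N))
trans = rng.dirichlet(np.ones(N), size=N)       # trans[i, j] = p(j | i)
emit = rng.dirichlet(np.ones(V), size=N)        # emit[i, v] = p(v | i)
obs = rng.integers(0, V, size=T)
Z_hat = randomized_forward(init, trans, emit, obs, K1=K - 1, K2=1, rng=rng)
```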