Augment with Care: Contrastive Learning for Combinatorial Problems
Authors: Haonan Duan, Pashootan Vaezipoor, Max B. Paulus, Yangjun Ruan, Chris Maddison
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a scientific study of the effect of augmentation design on contrastive pretraining for the Boolean satisfiability problem. |
| Researcher Affiliation | Academia | University of Toronto; Vector Institute; ETH Zürich. |
| Pseudocode | No | The paper describes the framework's components and processes, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/h4duan/contrastive-sat. |
| Open Datasets | Yes | Datasets. We experimented using four generators: SR (Selsam et al., 2018), Power Random 3SAT (PR) (Ansótegui et al., 2009), Double Power (DP) and Popularity Similarity (PS) (Giráldez-Cru & Levy, 2017). |
| Dataset Splits | Yes | We generated 100 separate labelled instances to train our linear evaluators, and another 500 as the validation set to pick the hyperparameters (ranging from 10⁻³ to 10³) of L2 regularization. We used 200 instances as validation sets for early stopping of all methods. |
| Hardware Specification | No | The paper does not specify the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like the Adam optimizer, sklearn, NeuroSAT, and CryptoMiniSat solvers, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Architecture. We primarily used the encoder of NeuroSAT (Selsam et al., 2018) as the GNN architecture. [...] The dimension of literal representations was chosen to be 128. [...] Experimental Setting. We used the contrastive loss in Equation 1 with the temperature 0.5. For the projection head, we used a 2-layer MLP, with the dimension of hidden and output layer being 64. We used the Adam optimizer with learning rate 2×10⁻⁴ and weight decay 10⁻⁵. The batch size was 128 and the maximum training epoch was 5000. |
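
The pretraining configuration in the Experiment Setup row reads as a standard SimCLR-style recipe. The sketch below is a minimal illustration, not the authors' code: `NeuroSATEncoder` and `loader` are placeholders for the paper's GNN encoder and data pipeline, and only the hyperparameters (temperature 0.5, 128-dimensional representations, a 2-layer projection head with 64-dimensional hidden and output layers, Adam with learning rate 2×10⁻⁴ and weight decay 10⁻⁵, batch size 128) come from the table above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive loss between two augmented views of a batch (SimCLR-style)."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2B, d), unit-norm
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # exclude self-similarity
    b = z1.size(0)
    # The positive for example i is its other view at index i + B (and vice versa).
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)]).to(sim.device)
    return F.cross_entropy(sim, targets)

encoder = NeuroSATEncoder(dim=128)                 # placeholder for the paper's GNN encoder
projection = nn.Sequential(                        # 2-layer MLP projection head (64/64)
    nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 64))
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(projection.parameters()),
    lr=2e-4, weight_decay=1e-5)

for view1, view2 in loader:                        # batches of 128 formulas, two augmented views each
    z1, z2 = projection(encoder(view1)), projection(encoder(view2))
    loss = nt_xent_loss(z1, z2, temperature=0.5)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```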
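
Similarly, the linear-evaluation split described in the Dataset Splits row (100 labelled instances for training, 500 for selecting the L2 regularization strength over 10⁻³ to 10³) can be sketched as follows. The `embed` helper, the variable names, and the logarithmic grid spacing are assumptions for illustration; only the split sizes and the search range come from the table.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# `embed` is a hypothetical helper that runs the frozen pretrained encoder and
# returns one fixed-size vector per formula; it is not part of the paper's code.
X_train, y_train = embed(train_formulas), train_labels   # 100 labelled instances
X_val, y_val = embed(val_formulas), val_labels           # 500 instances for model selection

best_clf, best_acc = None, -1.0
for lam in np.logspace(-3, 3, num=7):                    # L2 strength 10^-3 ... 10^3 (spacing assumed)
    clf = LogisticRegression(C=1.0 / lam, max_iter=1000).fit(X_train, y_train)
    acc = clf.score(X_val, y_val)
    if acc > best_acc:
        best_clf, best_acc = clf, acc
```

Note that sklearn's `LogisticRegression` parameterizes regularization by `C`, the inverse of the L2 strength, hence the `1.0 / lam` conversion.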