Efficient Learning of Discrete-Continuous Computation Graphs
Authors: David Friede, Mathias Niepert
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With an extensive set of experiments, we show that we can train complex discrete-continuous models which one cannot train with standard stochastic softmax tricks. We also show that complex discrete-stochastic models generalize better than their continuous counterparts on several benchmark datasets. ... The aim of the experiments is threefold. |
| Researcher Affiliation | Collaboration | David Friede (NEC Laboratories Europe, Heidelberg, Germany; University of Mannheim, Mannheim, Germany) david@informatik.uni-mannheim.de; Mathias Niepert (NEC Laboratories Europe, Heidelberg, Germany) mathias.niepert@neclab.eu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The implementations are in PyTorch and can be found at https://github.com/nec-research/dccg. |
| Open Datasets | Yes | Unsupervised Parsing on ListOps: The ListOps dataset contains sequences in prefix arithmetic syntax such as max[ 2 9 min[ 4 7 ] 0 ] and their unique numerical solutions (here: 9) [22]. ... Multi-Hop Reasoning over Knowledge Graphs: Here we consider the problem of answering multi-hop (path) queries in knowledge graphs (KGs) [10]. We evaluate various approaches on the standard benchmarks for path queries [10]. ... End-to-End Learning of MNIST Addition: The MNIST addition task addresses the learning problem of simultaneously (i) recognizing digits from images and (ii) performing the addition operation on the digits' numerical values [20]. (A small evaluator sketch for the ListOps prefix syntax appears after this table.) |
| Dataset Splits | No | The paper mentions using specific datasets and generating test examples for extrapolation but does not provide explicit numerical details (percentages or counts) for train/validation/test dataset splits, nor does it refer to specific predefined splits with quantitative information. |
| Hardware Specification | Yes | All experiments were run on a GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions PyTorch but does not specify its version number or the version numbers of any other software dependencies. |
| Experiment Setup | Yes | All models are run for 100 epochs with a learning rate of 0.005 and we select τ ∈ {1, 2, 4}. We choose a dimension of 256, a batch size of 512 and a learning rate of 0.001. We further train 1vsAll with the cross-entropy loss for 200 epochs and with a temperature of τ = 4. We use a learning rate of 0.0001 and a temperature of τ = 8. (A configuration sketch collecting these settings appears after this table.) |
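
For reference, the ListOps prefix syntax quoted in the Open Datasets row can be evaluated with a few lines of Python. This is a minimal sketch written for this summary, not code from the paper or its repository; the operators beyond max and min (median and sum-modulo-10) follow the original ListOps dataset and are assumptions here.

```python
# Minimal sketch: evaluate a ListOps-style prefix expression such as
# "max[ 2 9 min[ 4 7 ] 0 ]" (unique solution: 9). Operators med and sm
# (sum modulo 10) are assumed from the original ListOps dataset.

def evaluate_listops(expression: str) -> int:
    tokens = expression.replace("[", "[ ").replace("]", " ]").split()
    value, rest = _eval(tokens)
    assert not rest, "unconsumed tokens"
    return value

def _eval(tokens):
    head, rest = tokens[0], tokens[1:]
    if head.isdigit():                       # leaf: a single digit 0-9
        return int(head), rest
    op = head.rstrip("[")                    # operator token, e.g. "max["
    args = []
    while rest[0] != "]":                    # collect operands until the closing bracket
        arg, rest = _eval(rest)
        args.append(arg)
    rest = rest[1:]                          # drop the closing "]"
    if op == "max":
        return max(args), rest
    if op == "min":
        return min(args), rest
    if op == "med":
        return sorted(args)[len(args) // 2], rest
    if op == "sm":                           # sum modulo 10 (assumption)
        return sum(args) % 10, rest
    raise ValueError(f"unknown operator: {op}")

if __name__ == "__main__":
    print(evaluate_listops("max[ 2 9 min[ 4 7 ] 0 ]"))  # -> 9
```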
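
The hyperparameters quoted in the Experiment Setup row can also be collected into a small configuration sketch. The grouping below, the choice of Adam, and the mapping of the temperature τ onto PyTorch's `F.gumbel_softmax` are assumptions made for illustration; the actual training code is in the paper's repository at https://github.com/nec-research/dccg.

```python
# Sketch only: collects the hyperparameters reported in the paper. Which
# setting belongs to which experiment is not restated here, and the use of
# Adam and of F.gumbel_softmax for the temperature tau are assumptions.
import torch
import torch.nn.functional as F

reported_settings = [
    {"epochs": 100, "lr": 0.005, "tau_grid": (1, 2, 4)},                    # tau selected from this set
    {"dim": 256, "batch_size": 512, "lr": 0.001, "epochs": 200, "tau": 4},  # 1vsAll, cross-entropy loss
    {"lr": 0.0001, "tau": 8},
]

def relaxed_sample(logits: torch.Tensor, tau: float) -> torch.Tensor:
    """Draw a relaxed categorical sample; lower tau pushes it toward one-hot."""
    return F.gumbel_softmax(logits, tau=tau, hard=False, dim=-1)

# Hypothetical optimizer setup for the second reported setting ("model" is assumed):
# optimizer = torch.optim.Adam(model.parameters(), lr=reported_settings[1]["lr"])
```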