Representing Unordered Data Using Complex-Weighted Multiset Automata
Authors: Justin DeBenedetto, David Chiang
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We carried out some experiments to test this hypothesis, using an open-source implementation of the Transformer, Witwicky. The settings used were the default settings, except that we used 8k joint BPE operations and d = 512 embedding dimensions. We tested the following variations on position encodings. |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA. Correspondence to: Justin DeBenedetto <jdebened@nd.edu>, David Chiang <dchiang@nd.edu>. |
| Pseudocode | No | The paper defines concepts and provides mathematical examples but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available online. https://github.com/jdebened/ComplexDeepSets |
| Open Datasets | No | The training set consisted of 100k randomly generated sequences of digits 1–9 with lengths from 1 to 50. |
| Dataset Splits | Yes | The training set consisted of 100k randomly generated sequences of digits 1–9 with lengths from 1 to 50. They were fed to each network in the order in which they were generated (which only affects GRU and LSTM). This was then split into training and dev with approximately a 99/1 split. The test set consisted of randomly generated sequences of lengths that were multiples of 5 from 5 to 95. (A sketch reconstructing this generation and split appears below the table.) |
| Hardware Specification | No | No specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions using 'open-source implementation of the Transformer, Witwicky' and refers to 'Deep Sets' model code, but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For tasks 1 and 2, we used mean squared error loss, a learning rate decay of 0.5 after the validation loss does not decrease for 2 epochs, and early stopping after the validation loss does not decrease for 10 epochs. Each input is fed into three separate embedding layers of size 50 (for r, a, and b). (A sketch of this training setup appears below the table.) |
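
The dataset-splits row above fully determines the synthetic data pipeline except for the number of test sequences per length, which the quote does not give. The sketch below reconstructs it under that caveat; the function name `random_sequence` and the per-length test count of 100 are our assumptions, not from the paper.

```python
import random

random.seed(0)

# Reconstruction of the quoted data setup: 100k random digit sequences
# (digits 1-9, lengths 1-50), split ~99/1 into train/dev in generation
# order; test sequences have lengths 5, 10, ..., 95.
def random_sequence(min_len, max_len):
    length = random.randint(min_len, max_len)
    return [random.randint(1, 9) for _ in range(length)]

data = [random_sequence(1, 50) for _ in range(100_000)]

# Keep generation order (the paper notes order only matters for GRU/LSTM),
# then carve off roughly 1% for the dev set.
split = int(len(data) * 0.99)
train, dev = data[:split], data[split:]

# Test set: the count per length is not specified in the quote;
# 100 per length here is purely illustrative.
test = [random_sequence(L, L) for L in range(5, 100, 5) for _ in range(100)]
```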
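The experiment-setup row likewise pins down the optimization recipe (MSE loss, learning rate halved after 2 stagnant validation epochs, early stopping after 10, and three separate size-50 embeddings for r, a, and b). Below is a minimal PyTorch sketch of that recipe, not the authors' implementation: the model body, toy digit-sum target, and batch shapes are placeholders we chose for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab = 10  # digits 1-9 plus a padding index (assumed)

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Three separate size-50 embedding layers, one each for r, a, b,
        # matching the quoted setup; the rest of the model is a stand-in.
        self.emb_r = nn.Embedding(vocab, 50)
        self.emb_a = nn.Embedding(vocab, 50)
        self.emb_b = nn.Embedding(vocab, 50)
        self.out = nn.Linear(150, 1)

    def forward(self, x):  # x: (batch, seq) of digit indices
        e = torch.cat([self.emb_r(x), self.emb_a(x), self.emb_b(x)], dim=-1)
        return self.out(e.sum(dim=1)).squeeze(-1)  # order-invariant pooling

model = ToyModel()
opt = torch.optim.Adam(model.parameters())
# Learning rate decay of 0.5 after 2 epochs without validation improvement.
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.5, patience=2)
loss_fn = nn.MSELoss()

x = torch.randint(1, 10, (256, 20))  # toy batch of digit sequences
y = x.float().sum(dim=1)             # illustrative target (e.g. digit sum)
x_dev, y_dev = x[:32], y[:32]

best, stale = float("inf"), 0
for epoch in range(200):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    with torch.no_grad():
        val = loss_fn(model(x_dev), y_dev).item()
    sched.step(val)
    if val < best - 1e-6:
        best, stale = val, 0
    else:
        stale += 1
        if stale >= 10:  # early-stopping patience from the paper
            break
```

The plateau scheduler and manual early-stopping counter mirror the two patience values quoted from the paper; everything else (optimizer choice, batch construction, target) is an assumption for the sake of a runnable example.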