MatrixNet: Learning over symmetry groups using learned group representations
Authors: Lucas Laird, Circe Hsu, Asilata Bapat, Robin Walters
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use two learning tasks to evaluate the four variants of MatrixNet and compare our approach against several baseline models. We use several finite groups on a well-understood task as an initial test to validate our approach, and then move on to an infinite group, the braid group B3, on a task related to open problems. As baselines, we compare to an MLP for fixed maximum sequence length, and LSTM and Transformer models on longer sequences. Results of the experiments are summarized in Table 1. |
| Researcher Affiliation | Academia | Lucas Laird, Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115 (laird.l@northeastern.edu); Circe Hsu, Department of Mathematics, Northeastern University, Boston, MA 02115 (hsu.circe@northeastern.edu); Asilata Bapat, Mathematical Sciences Institute, Australian National University, Canberra, Australia (asilata.bapat@anu.edu.au); Robin Walters, Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115 (r.walters@northeastern.edu) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/lucas-laird/MatrixNet. |
| Open Datasets | No | We generated a dataset of 500,000 samples consisting of words of the free group F10, with labels corresponding to their order as elements of S10. An initial dataset of Jordan-Hölder multiplicities for braid words up to length 6 was provided. We implemented a state automaton algorithm from [47] to generate additional examples for longer braid words. The paper does not provide a specific link, DOI, or repository name for these generated datasets. |
| Dataset Splits | Yes | The data was split into 60% training data, 20% validation data, and 20% test which were fixed for all models. |
| Hardware Specification | Yes | All of the categorical braid action experiments were run on a machine with a single Nvidia RTX 2080 Ti GPU. |
| Software Dependencies | No | Sample order labels in S10 are computed using the SymPy package [50]. The paper does not list specific version numbers for other key software components like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | All of the models were trained using an Adam optimizer with a learning rate of 1e-4 and a batch size of 128. The chosen parameters for the models are: MatrixNet: single-channel 14×14 matrix size; MatrixNet-LN: single-channel 10×10 matrix size, 128 dimensions for the linear network in the matrix block; MatrixNet-MC: 3-channel 8×8 matrix size; MatrixNet-NL: single-channel 10×10 matrix size, 128 hidden dimensions and a tanh non-linearity between the linear layers of the matrix block; MLP: 3-layer MLP with 128 hidden dimensions per layer and ReLU activation functions, followed by a single linear output layer; LSTM: 6 LSTM layers with 16-dimensional input embeddings and 32 hidden dimensions, followed by a 2-layer MLP classifier with 64 hidden dimensions and ReLU activation; Transformer: 3 transformer layers with 4 attention heads, 16-dimensional embeddings and 32 hidden dimensions, using mean pooling and a single linear output layer. All of the MatrixNet architectures used a 2-layer MLP with 128 hidden dimensions and ReLU activation to compute the output after the matrix block. |
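The order labels described in the Open Datasets row can be computed directly: the order of a permutation in S10 is the least common multiple of its cycle lengths. The sketch below is illustrative only; the paper reports using SymPy for this step, and `permutation_order` is a hypothetical helper, not the authors' code.

```python
from math import lcm


def permutation_order(perm):
    """Order of a permutation given in one-line notation.

    `perm[i]` is the (0-indexed) image of i. The order is the lcm of
    the lengths of the permutation's disjoint cycles.
    """
    seen = [False] * len(perm)
    order = 1
    for i in range(len(perm)):
        if not seen[i]:
            # Walk the cycle containing i and record its length.
            length, j = 0, i
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            order = lcm(order, length)
    return order
```

For example, a permutation composed of a 2-cycle and a 3-cycle has order lcm(2, 3) = 6.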
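The fixed 60%/20%/20% split reported in the Dataset Splits row can be reproduced with a seeded shuffle. This is a minimal sketch under assumed conventions (`split_dataset` and the seed are illustrative, not taken from the paper's code):

```python
import random


def split_dataset(samples, seed=0):
    """Shuffle once with a fixed seed, then cut 60/20/20."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)  # fixed seed keeps the split stable
    n_train = int(0.6 * len(samples))
    n_val = int(0.2 * len(samples))
    train = [samples[i] for i in idx[:n_train]]
    val = [samples[i] for i in idx[n_train:n_train + n_val]]
    test = [samples[i] for i in idx[n_train + n_val:]]
    return train, val, test
```

Fixing the seed is what makes the split identical across all models, as the paper requires.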
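The hyperparameters in the Experiment Setup row can be collected into a single configuration mapping for quick reference. The dictionary structure and key names below are an illustrative summary, not the authors' configuration format:

```python
# Optimizer settings shared by all models, as reported.
OPTIMIZER = {"name": "Adam", "lr": 1e-4, "batch_size": 128}

# Per-model architecture settings; names follow the paper's variants.
MODELS = {
    "MatrixNet":    {"channels": 1, "matrix_size": 14},
    "MatrixNet-LN": {"channels": 1, "matrix_size": 10, "linear_dim": 128},
    "MatrixNet-MC": {"channels": 3, "matrix_size": 8},
    "MatrixNet-NL": {"channels": 1, "matrix_size": 10,
                     "hidden_dim": 128, "nonlinearity": "tanh"},
    "MLP":          {"layers": 3, "hidden_dim": 128, "activation": "ReLU"},
    "LSTM":         {"layers": 6, "embed_dim": 16, "hidden_dim": 32,
                     "head": {"mlp_layers": 2, "hidden_dim": 64,
                              "activation": "ReLU"}},
    "Transformer":  {"layers": 3, "heads": 4, "embed_dim": 16,
                     "hidden_dim": 32, "pooling": "mean"},
}
```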