Modeling Structure with Undirected Neural Networks
Authors: Tsvetomila Mihaylova, Vlad Niculae, André Martins
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of undirected neural architectures, both unstructured and structured, on a range of tasks: tree-constrained dependency parsing, convolutional image classification, and sequence completion with attention. |
| Researcher Affiliation | Collaboration | ¹Instituto de Telecomunicações, Instituto Superior Técnico, Lisbon, Portugal; ²Language Technology Lab, University of Amsterdam, The Netherlands; ³LUMLIS, Lisbon ELLIS Unit, Portugal; ⁴Unbabel, Lisbon, Portugal. |
| Pseudocode | No | The paper describes algorithms using mathematical equations and textual explanations but does not contain a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The source code is available at https://github.com/deep-spin/unn. |
| Open Datasets | Yes | "We demonstrate this on the MNIST dataset of handwritten digits (Deng, 2012)" and "We test the architecture on several datasets from Universal Dependencies 2.7 (Zeman et al., 2020)". |
| Dataset Splits | Yes | "The learning rate for each language is chosen via grid search for highest UAS on the validation set for the baseline model." and "splitting them into training and test sets with around 706K and 78K instances". |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and PyTorch functions (e.g., 'torch.conv2d'), but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | "We use γ = 0.1 and an Adam learning rate of 0.0005"; "Adam with learning rate 10⁻⁴. The hidden dimension is d = 256, and gradients with magnitude beyond 10 are clipped"; and "The learning rate for each language is chosen via grid search for highest UAS on the validation set for the baseline model. We searched over the values {0.1, 0.5, 1, 5, 10} × 10⁻⁵. In the experiments, we use 10⁻⁵ for Italian and 5 × 10⁻⁵ for the other languages. We employ dropout regularization, using the same dropout mask for each variable throughout the inner coordinate descent iterations, so that dropped values do not leak." |
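The fixed-dropout-mask detail in the Experiment Setup row is the easiest part to get wrong when reproducing this paper. Below is a minimal PyTorch sketch of that pattern, assuming a placeholder update network, loss, dropout rate, and inner-iteration count (none of which are specified above): one dropout mask is sampled per training step and reused across all inner coordinate-descent iterations, and gradients are clipped at magnitude 10 as the paper describes.

```python
import torch

# Minimal sketch, NOT the authors' code: the network, loss, dropout rate,
# and iteration count below are placeholders for illustration only.
d = 256              # hidden dimension reported in the paper
num_inner_iters = 5  # number of coordinate-descent iterations (assumed)
p_drop = 0.1         # dropout probability (assumed; not reported above)

model = torch.nn.Linear(d, d)  # stand-in for the actual UNN update network
opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr from the paper

def train_step(h: torch.Tensor) -> float:
    # Sample ONE dropout mask per step and reuse it for every inner
    # coordinate-descent iteration, so that values dropped at the first
    # iteration stay dropped and cannot leak back in at later iterations.
    mask = (torch.rand(d) > p_drop).float() / (1.0 - p_drop)

    for _ in range(num_inner_iters):
        h = torch.tanh(model(h * mask))  # placeholder coordinate update

    loss = h.pow(2).mean()  # placeholder loss
    opt.zero_grad()
    loss.backward()
    # "Gradients with magnitude beyond 10 are clipped"; whether this means
    # norm or value clipping is not stated, so norm clipping is assumed here
    # (torch.nn.utils.clip_grad_value_ would be the per-element alternative).
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
    opt.step()
    return loss.item()

loss = train_step(torch.randn(d))
```

Sampling the mask outside the inner loop is the whole point of the quoted sentence: resampling it at each iteration would let information from dropped coordinates re-enter the computation, which the authors explicitly avoid.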