Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Global Convergence Rate of Deep Equilibrium Models with General Activations
Authors: Lan V. Truong
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This compelling result is further supported by our numerical experiments on the MNIST and CIFAR-10 datasets. |
| Researcher Affiliation | Academia | Lan V. Truong EMAIL School of Mathematics, Statistics and Actuarial Science University of Essex |
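| Pseudocode | Yes | A weight initialisation algorithm (WIALG) is as follows. Initialise: $m = n$, $\sigma_w^2 = \frac{1}{96L^2}$. Generate a matrix $W \in \mathbb{R}^{m \times m}$ where $W_{ij} \sim \mathcal{N}\left(0, \frac{2\sigma_w^2}{m}\right)$. Generate a matrix $U \in \mathbb{R}^{m \times d}$ where $U_{ij} \sim \mathcal{N}\left(0, \frac{2}{m}\right)$. Generate a vector $a \in \mathbb{R}^m$ where $a_i \sim \mathcal{N}\left(0, \frac{1}{m}\right)$. Find a fixed point $T$ of the equation $T = \varphi(WT + UX)$ by using the Anderson acceleration method (Walker & Ni, 2011). |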
| Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | In this section, we conduct experiments to validate Theorem 3. Specifically, we evaluate the performance of the DEQ model on the MNIST and CIFAR-10 datasets. |
| Dataset Splits | No | The paper mentions using MNIST and CIFAR-10 datasets and normalizing data points, but does not specify training, validation, or test splits, or any methodology for creating them. |
| Hardware Specification | No | The paper does not specify any hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers used for the experiments. |
| Experiment Setup | No | While the paper describes varying parameters such as 'm' and activation functions, it lacks specific details regarding hyperparameters for the numerical experiments, such as the exact learning rate used, batch size, optimizer, or number of epochs. |
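
The WIALG steps quoted in the Pseudocode row can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names `init_weights` and `solve_fixed_point` are hypothetical, `L` is assumed to be the Lipschitz constant of the activation (1 for tanh), `X` is assumed to hold one unit-norm data point per column, the problem sizes are arbitrary, and a plain fixed-point iteration stands in for the Anderson acceleration method named in the paper.

```python
import numpy as np


def init_weights(m, d, L, rng):
    """Draw W, U, a with the variances listed in the WIALG description.

    Assumption: L is the Lipschitz constant of the activation function.
    """
    sigma_w2 = 1.0 / (96.0 * L**2)
    # rng.normal takes a standard deviation, so take square roots of the variances.
    W = rng.normal(0.0, np.sqrt(2.0 * sigma_w2 / m), size=(m, m))
    U = rng.normal(0.0, np.sqrt(2.0 / m), size=(m, d))
    a = rng.normal(0.0, np.sqrt(1.0 / m), size=m)
    return W, U, a


def solve_fixed_point(W, U, X, phi, tol=1e-10, max_iter=500):
    """Solve T = phi(W T + U X) by plain fixed-point iteration.

    Assumption: this replaces Anderson acceleration with the simplest
    possible solver; it is only meant to show the shape of the problem.
    """
    T = np.zeros((W.shape[0], X.shape[1]))
    for _ in range(max_iter):
        T_next = phi(W @ T + U @ X)
        if np.linalg.norm(T_next - T) <= tol:
            return T_next
        T = T_next
    raise RuntimeError("fixed-point iteration did not converge")


rng = np.random.default_rng(0)
m, d = 64, 10            # hypothetical width and input dimension
n = m                    # the quoted initialisation sets m = n
X = rng.normal(size=(d, n))
X /= np.linalg.norm(X, axis=0, keepdims=True)  # unit-norm columns (assumed)

W, U, a = init_weights(m, d, L=1.0, rng=rng)
T = solve_fixed_point(W, U, X, np.tanh)        # tanh chosen for illustration
residual = np.linalg.norm(T - np.tanh(W @ T + U @ X))
print(f"fixed-point residual: {residual:.2e}")
```

With this scaling the spectral norm of `W` stays well below 1 for a 1-Lipschitz activation, so the plain iteration is a contraction and converges in a few dozen steps. The vector `a` is generated as in the quoted algorithm but is not used by the fixed-point solve itself.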