Optimization Inspired Multi-Branch Equilibrium Models
Authors: Mingjie Li, Yisen Wang, Xingyu Xie, Zhouchen Lin
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we also implement various experiments to demonstrate the effectiveness of our new methods and explore the applicability of the model's hidden objective function. |
| Researcher Affiliation | Academia | 1 Key Lab. of Machine Perception (MoE), School of Artificial Intelligence, Peking University; 2 Institute for Artificial Intelligence, Peking University; 3 Pazhou Lab, Guangzhou, 510330, China. |
| Pseudocode | Yes | Algorithm 1: Forward Process of our MOptEqs' implicit part. Algorithm 2: The Calculation of T_θ = ∂z_{G+}/∂z at (z*, g(x)). A generic sketch of such an equilibrium forward pass appears after the table. |
| Open Source Code | No | The paper does not contain any explicit statements about making its source code publicly available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | In this section, we conduct experiments for the image classification tasks on CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009) and ImageNette, implemented on the PyTorch (Paszke et al., 2017) platform, to demonstrate the effectiveness of our model. A data-loading sketch follows the table. |
| Dataset Splits | No | The paper mentions using CIFAR-10, CIFAR-100, and ImageNette datasets, but it does not explicitly specify the training/validation/test dataset splits (e.g., percentages or sample counts) or cross-validation methods used for reproduction. |
| Hardware Specification | Yes | (each batch contains 32 images and is finished on 2× GTX 1080Ti GPUs for 200 epochs) |
| Software Dependencies | No | The paper mentions the use of "PyTorch (Paszke et al., 2017)" but does not provide a specific version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | For classification tasks of single-view models, we set the channel number of each branch to 32, λ = 0.01, and perturbation size ϵ = 0.1 for the small MOptEqs with three branches for the single-view comparison, as in OptEqs. The batch size is set to 128 for all the experiments. We use the widely used SGD algorithm for the training procedure. We set the weight decay to 0.001; the initial learning rate starts from 0.1 and decays by a factor of 0.1 at the 100th, 150th, and 175th epochs, with 200 epochs in total, like others. This schedule is sketched in code after the table. |
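
To make the pseudocode row concrete, below is a minimal sketch of a deep-equilibrium-style forward pass of the kind Algorithm 1 describes. The function `equilibrium_forward`, the toy contraction map `f`, and the tolerance are illustrative assumptions, not the authors' exact MOptEqs solver or its multi-branch structure.

```python
import torch

def equilibrium_forward(f, x, z_init, max_iter=50, tol=1e-4):
    """Iterate z <- f(z, x) until the relative update falls below tol."""
    z = z_init
    for _ in range(max_iter):
        z_next = f(z, x)
        if torch.norm(z_next - z) / (torch.norm(z) + 1e-8) < tol:
            return z_next
        z = z_next
    return z

# Hypothetical usage: a toy contraction map stands in for the implicit layer.
W = 0.5 * torch.eye(32)
f = lambda z, x: torch.tanh(z @ W + x)
z_star = equilibrium_forward(f, torch.randn(8, 32), torch.zeros(8, 32))
```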
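Since the paper names CIFAR-10, CIFAR-100, and ImageNette but not its exact splits (see the Dataset Splits row), the loading sketch below assumes the standard torchvision train/test split and conventional CIFAR-10 normalization statistics; these choices are assumptions, not the authors' pipeline.

```python
import torch
from torchvision import datasets, transforms

# Conventional CIFAR-10 per-channel mean/std (assumed, not from the paper).
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2470, 0.2435, 0.2616)),
])

train_set = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=transform)

# Batch size 128 matches the quoted experiment setup.
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False)
```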
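The quoted experiment setup maps directly onto standard PyTorch components: SGD with weight decay 0.001, initial learning rate 0.1, and a step decay by 0.1 at epochs 100, 150, and 175 over 200 epochs. A minimal sketch, assuming `model` is a placeholder for the MOptEqs network:

```python
import torch

model = torch.nn.Linear(32, 10)  # placeholder for the MOptEqs network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=0.001)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150, 175], gamma=0.1)

for epoch in range(200):
    # ... one training pass over the 128-sample batches would go here ...
    optimizer.step()   # placeholder step to keep the sketch runnable
    scheduler.step()   # decays the learning rate at the milestone epochs
```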