Group and Shuffle: Efficient Structured Orthogonal Parametrization
Authors: Mikhail Gorbunov, Nikolay Yudin, Vera Soboleva, Aibek Alanov, Alexey Naumov, Maxim Rakhuba
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our method on different domains, including adapting of text-to-image diffusion models and downstream task fine-tuning in language modeling. Additionally, we adapt our construction for orthogonal convolutions and conduct experiments with 1-Lipschitz neural networks. |
| Researcher Affiliation | Academia | Mikhail Gorbunov (HSE University, gorbunovmikh73@gmail.com); Nikolay Yudin (HSE University); Vera Soboleva (AIRI, HSE University); Aibek Alanov (AIRI, HSE University); Alexey Naumov (HSE University, Steklov Mathematical Institute of Russian Academy of Sciences); Maxim Rakhuba (HSE University) |
| Pseudocode | Yes | Algorithm 1 Projection π(·) of A onto GS(PL, P, PR) |
| Open Source Code | Yes | Source code is available at: https://github.com/Skonor/group_and_shuffle |
| Open Datasets | Yes | We report results on the GLUE [Wang et al., 2018] benchmark with RoBERTa-base [Liu et al., 2019] model. We use Stable Diffusion [Rombach et al., 2022] and the Dreambooth [Ruiz et al., 2023] dataset for all our experiments. Following [Singla and Feizi, 2021], we train LipConvnet-n on CIFAR-100 dataset. |
| Dataset Splits | Yes | We follow training settings of [Liu et al., 2024b, Zhang et al., 2023]. We report best results on the evaluation set from the whole training. |
| Hardware Specification | Yes | All the experiments below were conducted on NVIDIA V100-SXM2-32Gb GPU. |
| Software Dependencies | No | The paper mentions the use of 'PEFT library' but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | All the models are trained using Adam optimizer with batch size = 4, learning rate = 0.00002, betas = (0.9, 0.999) and weight decay = 0.01. |
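The reported training setup can be written down directly as an optimizer configuration. The sketch below is a minimal, hedged reconstruction: it assumes PyTorch (the paper fine-tunes with the PEFT library, which builds on PyTorch, but the framework is not stated in the quoted setup), and uses a placeholder linear module in place of the actual fine-tuned model. Batch size = 4 applies to the data loader, not the optimizer.

```python
# Hedged sketch of the reported setup: Adam with lr = 0.00002,
# betas = (0.9, 0.999), weight decay = 0.01. "model" is a stand-in
# placeholder; the paper fine-tunes RoBERTa-base / Stable Diffusion.
import torch

model = torch.nn.Linear(8, 8)  # placeholder for the fine-tuned model

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=2e-5,             # learning rate = 0.00002
    betas=(0.9, 0.999),  # betas as reported
    weight_decay=0.01,   # weight decay as reported
)
# batch size = 4 would be set on the DataLoader, e.g.
# torch.utils.data.DataLoader(dataset, batch_size=4)
```

Note that the table quotes "Adam optimizer with ... weight decay"; whether the paper means L2-regularized Adam (as above) or decoupled AdamW is not specified, so `torch.optim.Adam` is used here only as one plausible reading.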