Quantum Deep Equilibrium Models
Authors: Philipp Schleich, Marta Skreta, Lasse B. Kristensen, Rodrigo A. Vargas-Hernández, Alán Aspuru-Guzik
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply QDEQs to find the parameters of a quantum circuit in two settings: the first involves classifying MNIST-4 digits with 4 qubits; the second extends it to 10 classes of MNIST, Fashion MNIST, and CIFAR. Our code is available at https://github.com/martaskrt/qdeq. (Section 3, Experiments; Section 4, Results.) |
| Researcher Affiliation | Academia | Philipp Schleich (Department of Computer Science, University of Toronto; Vector Institute); Marta Skreta (Department of Computer Science, University of Toronto; Vector Institute); Lasse B. Kristensen (Department of Computer Science, University of Copenhagen); Rodrigo A. Vargas-Hernández (Department of Chemistry & Chemical Biology, McMaster University, ON); Alán Aspuru-Guzik (Department of Computer Science and Department of Chemistry, University of Toronto; Vector Institute) |
| Pseudocode | No | The paper contains circuit diagrams (Fig. 2, Fig. 3) and a process diagram (Fig. 1), but no explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps. |
| Open Source Code | Yes | Our code is available at https://github.com/martaskrt/qdeq. |
| Open Datasets | Yes | First, we consider MNIST-4, which consists of 4 classes of MNIST digits (0, 3, 6, 9) (Deng, 2012). Fashion MNIST (Fashion MNIST-10) (Xiao et al., 2017). Finally, we tested our setup on natural images with CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | For all datasets, we used default train/test splits and randomly split the training set into 80% train, 20% validation. See the data-loading sketch after the table. |
| Hardware Specification | Yes | Runtime was calculated over 100 epochs on an NVIDIA RTX 2070 GPU. |
| Software Dependencies | No | As mentioned, all results were generated using the torchquantum framework (Wang et al., 2022a). This mentions a framework but lacks specific version numbers for it or any other software dependencies (e.g., Python, PyTorch, CUDA versions). A version-logging snippet a reproducer could use follows the table. |
| Experiment Setup | Yes | We train the implicit models using a Broyden solver for at most 10 steps. For optimization, we use Adam (Kingma and Ba, 2014) and cross-entropy loss. We trained each model for 100 total epochs (i.e., if we first pre-trained using x warm-up epochs, we then trained using the implicit framework for 100 − x epochs); for CIFAR-10, we only trained for 25 total epochs since we found it to converge faster. We selected hyperparameters using the validation set; see Appendix E. Appendix E (Table 5) provides specific hyperparameter values. A minimal training-loop sketch follows the table. |
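
As a concrete illustration of the dataset handling reported above (default train/test splits, 80/20 train/validation split, MNIST-4 restricted to digits 0, 3, 6, 9), here is a minimal sketch using torchvision. The paper's pipeline goes through torchquantum; the torchvision loading route, the data directory, and the seed value here are assumptions for illustration, not the authors' exact code.

```python
import torch
from torch.utils.data import Subset, random_split
from torchvision import datasets, transforms

# Default train/test splits come with the torchvision dataset objects.
transform = transforms.ToTensor()
train_full = datasets.MNIST("data/", train=True, download=True, transform=transform)
test_set = datasets.MNIST("data/", train=False, download=True, transform=transform)

# MNIST-4: keep only the four digit classes used in the paper (0, 3, 6, 9).
keep = {0, 3, 6, 9}
idx = [i for i, target in enumerate(train_full.targets.tolist()) if target in keep]
train_full = Subset(train_full, idx)

# Randomly split the training set into 80% train / 20% validation.
n_train = int(0.8 * len(train_full))
n_val = len(train_full) - n_train
generator = torch.Generator().manual_seed(0)  # seed value is an assumption
train_set, val_set = random_split(train_full, [n_train, n_val], generator=generator)
```

The same pattern applies to `datasets.FashionMNIST` and `datasets.CIFAR10` (without the class filter for the 10-class experiments).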
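Because the paper names torchquantum but gives no version numbers, a reproducer would need to record the environment themselves. A minimal sketch; which packages matter beyond torchquantum is an assumption here:

```python
# Log the versions of key packages so the environment can be reconstructed later.
import sys
import torch

print("Python:", sys.version)
print("PyTorch:", torch.__version__)
print("CUDA (as built):", torch.version.cuda)

try:
    import torchquantum
    print("torchquantum:", getattr(torchquantum, "__version__", "unknown"))
except ImportError:
    print("torchquantum not installed")
```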
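Finally, the experiment-setup row describes the core training recipe: a fixed-point solve capped at 10 steps, Adam, and cross-entropy loss. The sketch below substitutes plain fixed-point iteration for the paper's Broyden solver (a simplification, not the authors' method) and uses a generic `model(z, x)` callable standing in for the quantum circuit; the learning rate and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def fixed_point(model, x, max_steps=10, tol=1e-4):
    """Find z* with z* = model(z*, x) by naive iteration.

    The paper uses a Broyden solver capped at 10 steps; plain iteration
    is a stand-in. Gradients flow through the unrolled loop here, whereas
    a true DEQ would use implicit differentiation at the fixed point.
    """
    z = torch.zeros_like(x)  # assumes z and x share a shape (illustrative)
    for _ in range(max_steps):
        z_next = model(z, x)
        if (z_next - z).norm() < tol:
            return z_next
        z = z_next
    return z

def train(model, readout, train_loader, epochs=100, lr=1e-3):
    # Adam and cross-entropy, as stated in the paper; lr is an assumption.
    params = list(model.parameters()) + list(readout.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for x, y in train_loader:
            z_star = fixed_point(model, x)          # implicit forward pass
            loss = F.cross_entropy(readout(z_star), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```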