Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Continual Learning of a Mixed Sequence of Similar and Dissimilar Tasks
Authors: Zixuan Ke, Bing Liu, Xingchang Huang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation using sequences of mixed tasks demonstrates the effectiveness of the proposed model. |
| Researcher Affiliation | Academia | Zixuan Ke (1), Bing Liu (1), and Xingchang Huang (2); (1) Department of Computer Science, University of Illinois at Chicago; (2) ETH Zurich |
| Pseudocode | No | The paper describes the model and methods using text and mathematical equations (e.g., Equations 1-11) and diagrams (Figure 1), but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | https://github.com/ZixuanKe/CAT |
| Open Datasets | Yes | We adopt two similar-task datasets from federated learning... from two publicly available federated learning datasets (Caldas et al., 2018)... EMNIST (LeCun et al., 1998) and CIFAR100 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | We further split about 10% of the original training data as the validation data. |
| Hardware Specification | No | The paper describes network architectures (e.g., '2-layer fully connected network', 'CNN based Alex Net-like architecture') and training details, but it does not specify any particular hardware components like GPU or CPU models used for the experiments. |
| Software Dependencies | No | The paper mentions using SGD for training and describes network architectures, but it does not provide specific version numbers for software dependencies like programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | We use s_max = 140, dropout of 0.5 between fully connected layers... We set the number of attention heads to 5... We train all models using SGD with a learning rate of 0.05... The batch size is set to 64. |
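The reported hyperparameters and the 10% validation hold-out can be sketched as follows. This is a minimal illustration assuming a simple random shuffle-and-slice split; the paper does not specify the exact split procedure, seed, or configuration format, so all names below are hypothetical.

```python
import random

# Hyperparameters quoted in the Experiment Setup row above.
CONFIG = {
    "optimizer": "SGD",
    "learning_rate": 0.05,
    "batch_size": 64,
    "dropout": 0.5,
    "attention_heads": 5,
    "s_max": 140,
}

def train_val_split(examples, val_fraction=0.1, seed=0):
    """Hold out ~10% of the training data as a validation set,
    mirroring the Dataset Splits row. Shuffle-based splitting is
    an assumption; the paper only states the 10% proportion."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

train, val = train_val_split(range(100))
print(len(train), len(val))  # 90 10
```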