Synbols: Probing Learning Algorithms with Synthetic Datasets
Authors: Alexandre Lacoste, Pau Rodríguez López, Frederic Branchaud-Charron, Parmida Atighehchian, Massimo Caccia, Issam Hadj Laradji, Alexandre Drouin, Matthew Craddock, Laurent Charlin, David Vázquez
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments probing the behavior of popular learning algorithms in various machine-learning settings, including: the robustness of supervised learning and unsupervised representation-learning approaches w.r.t. changes in latent-data attributes (§3.1 and §3.4) and to particular out-of-distribution patterns (§3.2), the efficacy of different strategies and uncertainty calibration in active learning (§3.3), and the effect of training losses for object counting (§3.5). |
| Researcher Affiliation | Collaboration | ¹Element AI {allac, pau.rodriguez, frederic.branchaud-charron, parmida, massimo.caccia, issam.laradji, adrouin, matt.craddock, dvazquez}@elementai.com; ²Mila, Université de Montréal {massimo.p.caccia, lcharlin}@gmail.com |
| Pseudocode | No | The paper includes Python code snippets for defining dataset attributes, but no formally labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | We introduce Synbols², an easy to use dataset generator with a rich composition of latent features for lower-resolution images. ²https://github.com/ElementAI/synbols |
| Open Datasets | Yes | We introduce Synbols², an easy to use dataset generator with a rich composition of latent features for lower-resolution images. ²https://github.com/ElementAI/synbols |
| Dataset Splits | Yes | All results are obtained using a (train, valid, test) partition of size ratio (60%, 20%, 20%). |
| Hardware Specification | Yes | The total training time on datasets of size 100k is about 3 minutes for most models (including ResNet-12) on a Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions 'Pycairo, a 2D vector graphics library' and 'Adam [22] is used to train all models,' but no specific version numbers for software dependencies are provided. |
| Experiment Setup | Yes | All results are obtained using a (train, valid, test) partition of size ratio (60%, 20%, 20%). Adam [22] is used to train all models, and the learning rate is selected using a validation set. Resnet12+ and WRN+ were trained with data augmentation consisting of random rotations, translation, shear, scaling, and color jitter. |
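
The (60%, 20%, 20%) partition quoted in the Dataset Splits row can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_indices`, the fixed seed, and the shuffle-then-slice strategy are all assumptions, since the paper only states the size ratio.

```python
import random


def split_indices(n, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle range(n) and cut it into (train, valid, test) index lists.

    Hypothetical helper: the paper reports only the (60%, 20%, 20%) ratio,
    not how the partition is drawn.
    """
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three values that sum to 1")

    indices = list(range(n))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)

    n_train = round(ratios[0] * n)
    n_valid = round(ratios[1] * n)

    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test


# Example with the 100k dataset size mentioned in the table.
train, valid, test = split_indices(100_000)
print(len(train), len(valid), len(test))
```

Assigning the remainder to the test set keeps the three parts disjoint and exhaustive even when `n` is not divisible by the ratios.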