Unsupervised Learning by Predicting Noise
Authors: Piotr Bojanowski, Armand Joulin
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test the quality of our features on several image classification problems, following the setting of Donahue et al. (2016). We are on par with state-of-the-art unsupervised and self-supervised learning approaches while being much simpler to train and to scale. |
| Researcher Affiliation | Industry | Piotr Bojanowski¹, Armand Joulin¹. ¹Facebook AI Research. Correspondence to: Piotr Bojanowski <bojanowski@fb.com>. |
| Pseudocode | Yes | Algorithm 1 (Stochastic optimization of Eq. (5)). Require: T batches of images, λ0 > 0. for t = {1, . . . , T} do: obtain batch b and representations r; compute fθ(Xb); compute P∗ by minimizing Eq. (2) w.r.t. P; compute ∇θL(θ) from Eq. (2) with P∗; update θ ← θ − λt∇θL(θ); end for. (A runnable sketch of this loop is given after the table.) |
| Open Source Code | No | The paper does not provide a direct statement about open-sourcing the code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use the training set of ImageNet to learn our convolutional network (Deng et al., 2009). This dataset is composed of 1,281,167 images that belong to 1,000 object categories. |
| Dataset Splits | Yes | We report the accuracy on the validation set of ImageNet. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions general techniques and architectures like 'SGD with batch normalization' and 'AlexNet', but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The network is trained with SGD with a batch size of 256. During the first t0 batches, we use a constant step size. After t0 batches, we use a linear decay of the step size, i.e., lt = l0 / (1 + γ [t − t0]+). Unless mentioned otherwise, we permute the assignments within batches every 3 epochs. (A sketch of this schedule follows the table.) |
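
The quoted Algorithm 1 can be sketched as a short PyTorch training step. This is a minimal illustration under stated assumptions, not the authors' released code: the model `net`, the feature dimension `d`, the dataset size, and the use of `scipy.optimize.linear_sum_assignment` for the per-batch update of P are all choices made here. The fixed targets are noise vectors sampled once on the unit sphere, as the paper describes.

```python
# Minimal sketch of Algorithm 1 ("Noise As Targets"), assuming a PyTorch
# model `net` that maps a batch of images to d-dimensional features. All
# names and constants here are illustrative. Targets are fixed noise
# vectors on the unit sphere; the assignment P is re-optimized within each
# batch (here via the Hungarian algorithm from SciPy) before one SGD step.
import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

d = 128                 # feature dimension (assumed)
n_images = 10_000       # dataset size (assumed)

# Fixed targets: noise sampled once, L2-normalized onto the sphere.
targets = F.normalize(torch.randn(n_images, d), dim=1)
# Current image-to-target permutation, updated batch by batch.
assignment = torch.arange(n_images)

def train_step(net, optimizer, images, indices):
    """One iteration of the loop in Algorithm 1 on a single batch.

    `indices` is a LongTensor holding each batch image's dataset index.
    """
    feats = F.normalize(net(images), dim=1)          # f_theta(X_b)
    # Update P restricted to the batch: over unit vectors, minimizing
    # 1/2 ||f - c||^2 is equivalent to maximizing the dot product.
    batch_targets = targets[assignment[indices]]
    cost = -(feats.detach() @ batch_targets.t()).numpy()
    _, cols = linear_sum_assignment(cost)            # optimal permutation
    assignment[indices] = assignment[indices][torch.as_tensor(cols)]
    # Gradient step on L(theta) with the assignment P held fixed.
    loss = 0.5 * ((feats - targets[assignment[indices]]) ** 2).sum(1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Passing `indices` lets the permutation be updated in place for just the images in the batch; per the quoted setup, the paper permutes the assignments within batches every 3 epochs.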
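The step-size schedule quoted in the Experiment Setup row can likewise be written out. The constants below (`l0`, `gamma`, `t0`) are placeholders, since the excerpt does not report their actual values.

```python
# Sketch of the quoted schedule: constant for the first t0 batches, then
# l_t = l0 / (1 + gamma * (t - t0)). Default constants are illustrative.
def step_size(t, l0=0.01, gamma=1e-5, t0=50_000):
    return l0 / (1.0 + gamma * max(t - t0, 0))
```

For t ≤ t0 this returns l0 unchanged; afterwards the denominator grows linearly with t, which is the decay the paper describes.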