Learnable Visual Markers

Authors: Oleg Grinchuk, Vadim Lebedev, Victor Lempitsky

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In the experiments, we demonstrate that the markers obtained using our approach are capable of retaining bit strings that are long enough to be practical. Below, we present qualitative and quantitative evaluation of our approach.
Researcher Affiliation | Collaboration | (1) Skolkovo Institute of Science and Technology, Moscow, Russia; (2) Yandex, Moscow, Russia
Pseudocode | No | The paper describes the architecture and implementation details in text but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets | No | The paper mentions using a 'random pool of images' for background patches and a 'pretrained' VGGNet trained on a 'large-scale dataset', but does not provide specific access information or a formal citation for the datasets used for training in their experiments.
Dataset Splits | No | The paper refers to a 'training stage' and 'accuracy achieved during training' but does not specify a distinct validation set or detailed train/validation/test splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running experiments.
Software Dependencies | No | The paper mentions various deep learning components and architectures (e.g., ADAM, ConvNets, VGGNet) and cites relevant papers, but does not specify software versions or library dependencies (e.g., PyTorch version, TensorFlow version).
Experiment Setup | Yes | For the experiments without texture loss, we use the simplest synthesizer network, which consists of a single linear layer (with a 3m² × n matrix and a bias vector) followed by an element-wise sigmoid. Unless reported otherwise, the recognizer network was implemented as a ConvNet with three convolutional layers (96 5×5 filters, each followed by max-pooling and ReLU) and two fully-connected layers with 192 and n output units, respectively. We perform a spatial transform as an affine transformation, where the 6 affine parameters are sampled from [1, 0, 0, 0, 1, 0] + N(0, σ) (assuming origin at the center of the marker). The example for σ = 0.1 is shown in Fig. 2. Given an image x, we implement the color transformation layer as c1·x^c2 + c3, where the parameters are sampled from the uniform distribution U[−δ, δ].
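
The Experiment Setup row above describes the two networks and the rendering perturbations concretely enough to sketch them. The following is a minimal PyTorch sketch, not the authors' code: the names MarkerSynthesizer, MarkerRecognizer, sample_affine_params, and color_transform are illustrative, and the patch_size argument, the padding choices, and the centering of the color-transform parameters around the identity (c1 = 1, c2 = 1, c3 = 0) are assumptions; n, m, σ, and δ follow the symbols in the text.

import torch
import torch.nn as nn
import torch.nn.functional as F


class MarkerSynthesizer(nn.Module):
    """Simplest synthesizer: a single linear layer (a 3*m*m x n weight matrix
    plus a bias vector) followed by an element-wise sigmoid, mapping an n-bit
    string to an m x m RGB marker."""

    def __init__(self, n_bits: int, m: int):
        super().__init__()
        self.m = m
        self.linear = nn.Linear(n_bits, 3 * m * m)

    def forward(self, bits: torch.Tensor) -> torch.Tensor:
        # bits: (batch, n) with entries in {0, 1}
        x = torch.sigmoid(self.linear(bits))
        return x.view(-1, 3, self.m, self.m)


class MarkerRecognizer(nn.Module):
    """Recognizer ConvNet: three convolutional layers (96 filters of size 5x5,
    each followed by max-pooling and ReLU) and two fully-connected layers with
    192 and n output units."""

    def __init__(self, n_bits: int, patch_size: int = 32):  # patch_size is an assumption
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=5, padding=2),
            nn.MaxPool2d(2), nn.ReLU(inplace=True),
            nn.Conv2d(96, 96, kernel_size=5, padding=2),
            nn.MaxPool2d(2), nn.ReLU(inplace=True),
            nn.Conv2d(96, 96, kernel_size=5, padding=2),
            nn.MaxPool2d(2), nn.ReLU(inplace=True),
        )
        feat_dim = 96 * (patch_size // 8) ** 2  # three 2x poolings shrink the patch by 8
        self.classifier = nn.Sequential(
            nn.Linear(feat_dim, 192), nn.ReLU(inplace=True),
            nn.Linear(192, n_bits),
        )

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        h = self.features(patch)
        return self.classifier(h.flatten(1))  # per-bit logits


def sample_affine_params(batch: int, sigma: float) -> torch.Tensor:
    """Sample the 6 affine parameters from [1, 0, 0, 0, 1, 0] + N(0, sigma),
    i.e. Gaussian perturbations of the identity transform."""
    identity = torch.tensor([1., 0., 0., 0., 1., 0.]).repeat(batch, 1)
    return identity + sigma * torch.randn(batch, 6)


def apply_affine(markers: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Warp markers with the sampled affine transform via a sampling grid
    (origin at the marker center, as in the description above)."""
    grid = F.affine_grid(theta.view(-1, 2, 3), markers.size(), align_corners=False)
    return F.grid_sample(markers, grid, align_corners=False)


def color_transform(x: torch.Tensor, delta: float) -> torch.Tensor:
    """Color layer c1 * x^c2 + c3 with per-image perturbations drawn from
    U[-delta, delta]; centering them on the identity transform is an
    assumption of this sketch."""
    b = x.size(0)
    c1 = 1 + (torch.rand(b, 1, 1, 1) * 2 - 1) * delta
    c2 = 1 + (torch.rand(b, 1, 1, 1) * 2 - 1) * delta
    c3 = (torch.rand(b, 1, 1, 1) * 2 - 1) * delta
    return c1 * x.clamp(min=1e-6).pow(c2) + c3

In the paper's pipeline these perturbations (together with blending the rendered marker into background patches from a random pool of images) sit between the synthesizer and the recognizer, and the whole chain is trained end-to-end so that the synthesized markers remain decodable after the rendering nuisances.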