Long-Term Image Boundary Prediction
Authors: Apratim Bhattacharyya, Mateusz Malinowski, Bernt Schiele, Mario Fritz
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our CMSC model on natural video sequences involving agent-based motion and billiard sequences with only physics-based motion. We compare with various baselines and perform ablation studies to confirm design choices. |
| Researcher Affiliation | Academia | Apratim Bhattacharyya, Mateusz Malinowski, Bernt Schiele, Mario Fritz, Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany {abhattac, mmalinow, schiele, mfritz}@mpi-inf.mpg.de |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. The model architecture and components are described in text and through diagrams. |
| Open Source Code | No | No explicit statement about the release or availability of source code for the described methodology was found. |
| Open Datasets | Yes | We use the VSB100 dataset, which contains 101 videos with a maximum of 121 frames each. The training set consists of 40 videos and the test set consists of 60 videos. ... Similarly, we randomly select 1000, 500 (training) and 1000 (test) videos from UCF101. |
| Dataset Splits | No | The paper specifies training and test sets but does not explicitly mention a separate validation set or its split details. |
| Hardware Specification | Yes | On the Nvidia Titan X GPU, our CMSC model takes approximately 16 hours to train on the VSB100 and real billiards datasets and 10 hours on synthetic billiards (1 ball) dataset. |
| Software Dependencies | No | The paper mentions the use of "pygame" for synthetic data generation and the "ADAM optimizer", but does not provide specific version numbers for these or any other software dependencies, libraries, or frameworks used. |
| Experiment Setup | Yes | We use L2 loss (mean square error) during training, which we optimize using the ADAM optimizer. ... We convert each video into 32×32 pixel patches. The CMSC model observes a central patch and eight neighbouring patches, resulting in a context of size 96×96 pixels. ... We use four levels, with scales increasing by a factor of two. ... Each level of the model consists of five sets of two convolutional layers. There are 32, 64, 128, 64 and 32 filters respectively in each set, of a constant size 3×3. ... We introduce a moderate 2×2 pooling layer after the first two sets of convolutional layers... We use ReLU non-linearities between every layer except the last. We use the tanh nonlinearity at the end to ensure output in the range [0,1]. ... To deal with deceleration, we experiment with increasing the number of input frames. We train our CMSC model with six input frames and pre-train on our synthetic one-ball training set. (A hedged code sketch of this setup follows the table.) |
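The extracted setup above specifies enough detail to sketch one level of the convolutional architecture and its training configuration. The following PyTorch sketch is an assumption-laden illustration, not the authors' released implementation: the class name `CMSCLevel`, the input channel layout (six stacked boundary frames), the final 1×1 projection head, and the rescaled tanh are all hypothetical choices consistent with the quoted description; the four-level multi-scale composition, the synthetic-data pre-training, and any upsampling back to the 32×32 patch size are omitted.

```python
import torch
import torch.nn as nn

class CMSCLevel(nn.Module):
    """One level of the described patch-prediction network (hedged sketch).

    Paper-specified pieces: five sets of two 3x3 convolutional layers with
    32, 64, 128, 64, 32 filters per set, 2x2 pooling after the first two
    sets, ReLU between every layer except the last, and tanh at the output.
    Everything else here is an assumption for illustration.
    """

    def __init__(self, in_frames: int = 6):
        super().__init__()
        widths = [32, 64, 128, 64, 32]
        layers = []
        in_ch = in_frames  # six stacked input frames (assumed channel layout)
        for i, w in enumerate(widths):
            for _ in range(2):  # two convolutional layers per set
                layers.append(nn.Conv2d(in_ch, w, kernel_size=3, padding=1))
                layers.append(nn.ReLU(inplace=True))
                in_ch = w
            if i < 2:  # moderate 2x2 pooling after the first two sets
                layers.append(nn.MaxPool2d(2))
        layers.pop()  # drop the trailing ReLU ("between every layer except the last")
        self.body = nn.Sequential(*layers)
        # Assumed 1x1 projection to a single predicted boundary map.
        self.head = nn.Conv2d(in_ch, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The paper states the tanh output lies in [0,1]; a shifted/rescaled
        # tanh realizes that range, though the exact rescaling is an assumption.
        return 0.5 * (torch.tanh(self.head(self.body(x))) + 1.0)


# Usage sketch with the training configuration quoted above.
model = CMSCLevel(in_frames=6)
optimizer = torch.optim.Adam(model.parameters())  # paper: ADAM optimizer
criterion = nn.MSELoss()                          # paper: L2 loss

x = torch.rand(8, 6, 96, 96)        # 96x96 context: central patch + 8 neighbours
target = torch.rand(8, 1, 24, 24)   # spatial size after two 2x2 poolings
loss = criterion(model(x), target)
loss.backward()
optimizer.step()
```

Note the two 2×2 poolings reduce the 96×96 context to a 24×24 map; how the released model maps this back to the 32×32 central patch is not specified in the quoted text, so the sketch leaves the output at the pooled resolution.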