Unsupervised Learning of Deep Feature Representation for Clustering Egocentric Actions
Authors: Bharat Lal Bhatnagar, Suriya Singh, Chetan Arora, C.V. Jawahar
IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach on four disparate public egocentric actions datasets amounting to approximately 50 hours of videos. We show that our approach surpasses the supervised state of the art accuracies without using the action labels. |
| Researcher Affiliation | Academia | Bharat Lal Bhatnagar*, Suriya Singh*, Chetan Arora+, C.V. Jawahar* — CVIT, KCIS, International Institute of Information Technology, Hyderabad*; Indraprastha Institute of Information Technology, Delhi+ |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about making its source code available or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We have used GTEA [Fathi et al., 2011b] and ADL-short [Singh et al., 2016a] for short term, hand-object coordinated videos, ADL-long [Pirsiavash and Ramanan, 2012] for long term hand-object coordinated videos and HUJIEGOSEG [Poleg et al., 2014] for long term videos without the handled objects. |
| Dataset Splits | No | The paper mentions 'During the train time, we randomly assign a set Nk of splices to each of the K autoencoders' but does not specify explicit train/validation/test splits with percentages or counts. It refers to evaluation metrics but does not describe the data-splitting methodology needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'ADAM solver [Kingma and Ba, 2015]' and 'RMS-prop solver [Tieleman and Hinton, 2012]' for optimization, but it does not provide specific version numbers for these or any other software libraries or dependencies. |
| Experiment Setup | Yes | The encoder network consists of 2 convolutional layers and 2 fully connected layers, followed by the decoder network with 2 fully connected and 2 convolutional layers. We keep stride equal to 1 everywhere and use 2 × 2 max pooling. All the layers have a tanh activation function. ...we use 8 and 18 convolutional filters, each of size 5 × 5 in the first two layers of the autoencoders. ...In our experiments we have kept K = 20. The LSTM autoencoder is a simple two layer architecture. |
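
The architectural details quoted in the Experiment Setup row can be sketched in code. The following is a minimal PyTorch sketch, not the authors' implementation: the input patch size, padding, fully connected widths, reconstruction loss, and learning rate are assumptions not given in the excerpt; only the layer counts, filter counts (8 and 18), 5 × 5 kernels, stride 1, 2 × 2 max pooling, tanh activations, and K = 20 come from the paper.

```python
# Sketch of one of the K convolutional autoencoders described in the paper.
# Assumed: single-channel 64x64 input, padding=2, hidden/code widths, MSE loss, lr.
import torch
import torch.nn as nn


class ConvAutoencoder(nn.Module):
    def __init__(self, in_channels=1, patch=64, hidden=256, code=64):
        super().__init__()
        # Encoder: 2 conv layers (8 and 18 filters, 5x5, stride 1),
        # each with tanh and 2x2 max pooling, followed by 2 fully connected layers.
        self.encoder_conv = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=5, stride=1, padding=2), nn.Tanh(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 18, kernel_size=5, stride=1, padding=2), nn.Tanh(),
            nn.MaxPool2d(2),
        )
        feat = 18 * (patch // 4) * (patch // 4)
        self.encoder_fc = nn.Sequential(
            nn.Linear(feat, hidden), nn.Tanh(),
            nn.Linear(hidden, code), nn.Tanh(),
        )
        # Decoder mirrors the encoder: 2 fully connected layers, then 2 conv layers
        # (upsampling replaces pooling; the paper does not state the exact decoder ops).
        self.decoder_fc = nn.Sequential(
            nn.Linear(code, hidden), nn.Tanh(),
            nn.Linear(hidden, feat), nn.Tanh(),
        )
        self.decoder_conv = nn.Sequential(
            nn.Upsample(scale_factor=2),
            nn.Conv2d(18, 8, kernel_size=5, stride=1, padding=2), nn.Tanh(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(8, in_channels, kernel_size=5, stride=1, padding=2), nn.Tanh(),
        )
        self.patch = patch

    def forward(self, x):
        z = self.encoder_conv(x)
        z = self.encoder_fc(z.flatten(1))
        y = self.decoder_fc(z).view(-1, 18, self.patch // 4, self.patch // 4)
        return self.decoder_conv(y)


# One autoencoder per cluster; the paper keeps K = 20 and trains with ADAM.
K = 20
models = [ConvAutoencoder() for _ in range(K)]
loss_fn = nn.MSELoss()  # assumed reconstruction loss; not specified in the excerpt
optimizers = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in models]
```

The two-layer LSTM autoencoder mentioned for the long-term videos is not detailed further in the excerpt and is therefore not sketched here.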