Unsupervised Learning of Deep Feature Representation for Clustering Egocentric Actions
Authors: Bharat Lal Bhatnagar, Suriya Singh, Chetan Arora, C.V. Jawahar
IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach on four disparate public egocentric actions datasets amounting to approximately 50 hours of videos. We show that our approach surpasses the supervised state of the art accuracies without using the action labels. |
| Researcher Affiliation | Academia | Bharat Lal Bhatnagar*, Suriya Singh*, Chetan Arora+, C.V. Jawahar* — CVIT, KCIS, International Institute of Information Technology, Hyderabad*; Indraprastha Institute of Information Technology, Delhi+ |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about making its source code available or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We have used GTEA [Fathi et al., 2011b] and ADL-short [Singh et al., 2016a] for short term, hand-object coordinated videos, ADL-long [Pirsiavash and Ramanan, 2012] for long term hand-object coordinated videos and HUJIEGOSEG [Poleg et al., 2014] for long term videos without the handled objects. |
| Dataset Splits | No | The paper mentions 'During the train time, we randomly assign a set Nk of splices to each of the K autoencoders' but does not specify explicit train/validation/test splits with percentages or counts. It refers to evaluation metrics but does not describe the data-splitting methodology needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'ADAM solver [Kingma and Ba, 2015]' and 'RMS-prop solver [Tieleman and Hinton, 2012]' for optimization, but it does not provide specific version numbers for these or any other software libraries or dependencies. |
| Experiment Setup | Yes | The encoder network consists of 2 convolutional layers and 2 fully connected layers, followed by the decoder network with 2 fully connected and 2 convolutional layers. We keep stride equal to 1 everywhere and use 2 × 2 max pooling. All the layers have a tanh activation function. ...we use 8 and 18 convolutional filters, each of size 5 × 5 in the first two layers of the autoencoders. ...In our experiments we have kept K = 20. The LSTM autoencoder is a simple two layer architecture. |
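
The architectural details quoted in the Experiment Setup row can be sketched in code. The following is a minimal PyTorch sketch, not the authors' implementation: the input patch size, padding, fully connected widths, reconstruction loss, and learning rate are assumptions not given in the excerpt; only the layer counts, filter counts (8 and 18), 5 × 5 kernels, stride 1, 2 × 2 max pooling, tanh activations, and K = 20 come from the paper.

```python
# Sketch of one of the K convolutional autoencoders described in the paper.
# Assumed: single-channel 64x64 input, padding=2, hidden/code widths, MSE loss, lr.
import torch
import torch.nn as nn


class ConvAutoencoder(nn.Module):
    def __init__(self, in_channels=1, patch=64, hidden=256, code=64):
        super().__init__()
        # Encoder: 2 conv layers (8 and 18 filters, 5x5, stride 1),
        # each with tanh and 2x2 max pooling, followed by 2 fully connected layers.
        self.encoder_conv = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=5, stride=1, padding=2), nn.Tanh(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 18, kernel_size=5, stride=1, padding=2), nn.Tanh(),
            nn.MaxPool2d(2),
        )
        feat = 18 * (patch // 4) * (patch // 4)
        self.encoder_fc = nn.Sequential(
            nn.Linear(feat, hidden), nn.Tanh(),
            nn.Linear(hidden, code), nn.Tanh(),
        )
        # Decoder mirrors the encoder: 2 fully connected layers, then 2 conv layers
        # (upsampling replaces pooling; the paper does not state the exact decoder ops).
        self.decoder_fc = nn.Sequential(
            nn.Linear(code, hidden), nn.Tanh(),
            nn.Linear(hidden, feat), nn.Tanh(),
        )
        self.decoder_conv = nn.Sequential(
            nn.Upsample(scale_factor=2),
            nn.Conv2d(18, 8, kernel_size=5, stride=1, padding=2), nn.Tanh(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(8, in_channels, kernel_size=5, stride=1, padding=2), nn.Tanh(),
        )
        self.patch = patch

    def forward(self, x):
        z = self.encoder_conv(x)
        z = self.encoder_fc(z.flatten(1))
        y = self.decoder_fc(z).view(-1, 18, self.patch // 4, self.patch // 4)
        return self.decoder_conv(y)


# One autoencoder per cluster; the paper keeps K = 20 and trains with ADAM.
K = 20
models = [ConvAutoencoder() for _ in range(K)]
loss_fn = nn.MSELoss()  # assumed reconstruction loss; not specified in the excerpt
optimizers = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in models]
```

The two-layer LSTM autoencoder mentioned for the long-term videos is not detailed further in the excerpt and is therefore not sketched here.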