Sequential Facility Location: Approximate Submodularity and Greedy Algorithm

Authors: Ehsan Elhamifar

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "By experiments on synthetic data and the problem of procedure learning from instructional videos, we show that our framework significantly improves the computational time, achieves better objective function values and obtains more coherent summaries."
Researcher Affiliation | Academia | "Assistant Professor, Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA. Correspondence to: Ehsan Elhamifar <e.elhamifar@northeastern.edu>."
Pseudocode | Yes | "Algorithm 1: Greedy Maximization" (a generic sketch of this greedy template follows the table)
Open Source Code | No | The paper does not explicitly state that source code for the methodology is provided or available, nor does it include any links to a code repository.
Open Datasets | Yes | "We perform experiments on the Inria instructional video dataset (Alayrac et al., 2016) that consists of five tasks of change tire, make coffee, perform cpr, jump-start car and repot plant, with 30 videos per task."
Dataset Splits | No | The paper describes the synthetic data generation and mentions using the Inria instructional video dataset (Alayrac et al., 2016), but it does not specify explicit training, validation, and test splits (e.g., percentages, sample counts, or a cross-validation setup) needed for reproducibility.
Hardware Specification | No | The paper does not provide any specific hardware details, such as GPU models, CPU models, or memory, used for running its experiments.
Software Dependencies | No | The paper mentions using CVX to solve an optimization problem, but it does not provide version numbers for it or for any other software libraries, packages, or programming languages used in the experiments.
Experiment Setup | Yes | "For the experiments, we set d = 300, M = 50, k = 15 and β = 0.01 in our method." "We segment each video using (Gygli et al., 2014) and, following (Alayrac et al., 2016), extract a 3000-dimensional feature vector from each segment, capturing appearance and motion, reduce the dimension of the data to d via PCA, hence obtaining a time-series representation Yℓ for each video ℓ. We then learn, from all input videos, an HMM, whose states, gathered in X, correspond to different sub-activities across videos." (A sketch of this feature pipeline also follows the table.)
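
The paper's Algorithm 1 (Greedy Maximization) follows the standard greedy template for maximizing a monotone set function under a cardinality constraint; the paper applies it to a sequential facility location objective that is only approximately submodular. Below is a minimal Python sketch of that template, assuming a generic objective f over index sets; the names greedy_maximize, f, and ground_set are illustrative and not taken from the paper.

import numpy as np

def greedy_maximize(f, ground_set, k):
    """Greedily build a set of size at most k by adding, at each step,
    the element with the largest marginal gain f(S | {e}) - f(S).
    For monotone submodular f this is the classic (1 - 1/e)-approximate
    greedy; the paper analyzes the approximately submodular case."""
    selected = set()
    current_value = f(selected)
    for _ in range(k):
        best_gain, best_elem = 0.0, None
        for e in ground_set - selected:
            gain = f(selected | {e}) - current_value
            if gain > best_gain:
                best_gain, best_elem = gain, e
        if best_elem is None:  # no element yields a positive gain
            break
        selected.add(best_elem)
        current_value += best_gain
    return selected

# Toy usage with a facility-location-style coverage objective on a
# random similarity matrix (rows: data points, columns: candidates);
# k = 15 matches the paper's reported setting.
sim = np.random.rand(20, 20)
f = lambda S: sim[:, sorted(S)].max(axis=1).sum() if S else 0.0
summary = greedy_maximize(f, set(range(20)), k=15)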
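
The Experiment Setup row describes a pipeline of 3000-dimensional per-segment features, PCA reduction to d = 300, and a single HMM with M = 50 states learned over all videos. The sketch below is one way to realize that pipeline, assuming the segment features have already been extracted per video; the use of scikit-learn's PCA and hmmlearn's GaussianHMM is an assumption for illustration, since the paper does not name its implementation.

import numpy as np
from sklearn.decomposition import PCA
from hmmlearn.hmm import GaussianHMM

d, M = 300, 50  # settings reported in the paper

def build_time_series(features):
    """features[l] is a (T_l, 3000) array of per-segment appearance and
    motion features for video l (extraction follows Gygli et al., 2014
    and Alayrac et al., 2016, and is not reproduced here)."""
    stacked = np.vstack(features)
    reduced = PCA(n_components=d).fit_transform(stacked)  # 3000 -> d dims
    lengths = [f.shape[0] for f in features]
    # Split back into one time series Y_l per video.
    splits = np.cumsum(lengths)[:-1]
    return np.split(reduced, splits), reduced, lengths

def learn_hmm(reduced, lengths):
    """Fit one HMM over all videos; its M states play the role of the
    sub-activity set X in the paper."""
    hmm = GaussianHMM(n_components=M, covariance_type="diag")
    hmm.fit(reduced, lengths)
    return hmm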