Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs

Authors: Denis Mazur, Vage Egiazarian, Stanislav Morozov, Artem Babenko

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We confirm the superiority of our method via extensive experiments on a wide range of tasks, including classification, compression, and collaborative filtering. ... Via extensive experiments on several different tasks, we confirm that, in terms of memory consumption, PRODIGE is more efficient than its vectorial counterparts."
Researcher Affiliation | Collaboration | Denis Mazur (Yandex) denismazur@yandex-team.ru; Vage Egiazarian (Skoltech) Vage.egiazarian@skoltech.ru; Stanislav Morozov (Yandex; Lomonosov Moscow State University) stanis-morozov@yandex.ru; Artem Babenko (Yandex; National Research University Higher School of Economics) artem.babenko@phystech.edu
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks; the method is described in narrative text and mathematical formulations.
Open Source Code | Yes | "The PyTorch source code of PRODIGE is available online." https://github.com/stanis-morozov/prodige
Open Datasets | Yes | "We experiment with three publicly available datasets: MNIST10k, GLOVE10k, CelebA10k" ... "All experiments are performed on the Pinterest dataset [31]." ... "We evaluate our model on the IMDB benchmark [33], a popular dataset for text sentiment binary classification."
Dataset Splits | No | For the IMDB dataset, the paper states: "The data is split into training and test sets, each containing N=25,000 text instances." However, it does not mention a validation split for any dataset, nor does it give train/validation/test counts or percentages for the other datasets. (A hold-out sketch follows the table.)
Hardware Specification | No | The paper does not provide any hardware details such as GPU models, CPU types, or memory specifications used to run the experiments.
Software Dependencies | No | The paper mentions using "TensorFlow or PyTorch" for autograd, "SparseAdam" as the optimizer, and a "gensim model" and the "Implicit package" for baselines, but it does not give version numbers for any of these dependencies. (A SparseAdam usage sketch follows the table.)
Experiment Setup | Yes | "We tune the regularization coefficient λ to achieve the overall memory consumption close to the considered operating points." ... "Namely, we start with 64 edges per vertex, half of which are links to the nearest neighbors and the other half are random edges." ... "we restrict a set of possible edges to include 16 user-user and item-item edges and all relevant user-item edges available in the training data" ... "one-dimensional convolutional layer with 32 output filters, followed by a global max pooling layer, a ReLU nonlinearity and a final dense layer that predicts class logits." (An architecture sketch follows the table.)
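
Since the paper reports only the 25,000/25,000 IMDB train/test split, a reproduction that needs a validation set has to carve one out of the training portion. Below is a minimal sketch of that hold-out; the placeholder corpus, the 10% fraction, and the seed are illustrative assumptions, not details from the paper.

```python
# Hedged sketch: the paper specifies only a 25,000/25,000 train/test split
# for IMDB, so a validation set must be held out of the training portion.
# The placeholder corpus, 10% fraction, and seed are assumptions.
from sklearn.model_selection import train_test_split

texts = [f"review {i}" for i in range(25_000)]   # stands in for the 25k training reviews
labels = [i % 2 for i in range(25_000)]          # binary sentiment labels

train_texts, val_texts, train_labels, val_labels = train_test_split(
    texts, labels, test_size=0.1, random_state=0, stratify=labels
)
print(len(train_texts), len(val_texts))  # -> 22500 2500
```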
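The paper names SparseAdam as its optimizer but pins no version. Below is a minimal sketch of how SparseAdam is typically wired to a sparse embedding table in PyTorch; the table size, embedding dimension, learning rate, and dummy loss are illustrative assumptions, not the paper's training code.

```python
# Hedged sketch of the SparseAdam optimizer the paper names, driving a
# sparse embedding table. Sizes and learning rate are assumptions.
import torch
import torch.nn as nn

emb = nn.Embedding(10_000, 64, sparse=True)             # sparse=True -> sparse gradients
opt = torch.optim.SparseAdam(emb.parameters(), lr=1e-3)

idx = torch.randint(0, 10_000, (32,))                   # a batch of token / vertex ids
loss = emb(idx).pow(2).sum()                            # dummy loss for demonstration
loss.backward()                                         # grad of emb.weight is a sparse tensor
opt.step()
```

SparseAdam only updates the rows of the embedding table that received gradients, which is why it pairs naturally with large, sparsely accessed parameter tables.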
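The quoted IMDB classifier description maps directly onto a small PyTorch module. The sketch below assumes word embeddings of dimension 64 and a convolution kernel size of 3, neither of which is given in the excerpt; only the 32 output filters and the layer order (convolution, global max pooling, ReLU, dense logits) come from the paper.

```python
# Hedged sketch of the IMDB classifier head quoted above. Embedding
# dimension, kernel size, and class count are assumptions; the 32 filters
# and the layer order are from the paper.
import torch
import torch.nn as nn

class ClassifierHead(nn.Module):
    def __init__(self, embed_dim: int = 64, num_classes: int = 2, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, 32, kernel_size)  # 1-D conv, 32 output filters
        self.fc = nn.Linear(32, num_classes)               # final dense layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len, embed_dim]; Conv1d expects channels first
        h = self.conv(x.transpose(1, 2))   # -> [batch, 32, seq_len - kernel_size + 1]
        h = h.max(dim=2).values            # global max pooling over positions
        h = torch.relu(h)                  # ReLU nonlinearity
        return self.fc(h)                  # class logits

logits = ClassifierHead()(torch.randn(8, 200, 64))  # -> [8, 2]
```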