Sifting Common Information from Many Variables
Authors: Greg Ver Steeg, Shuyang Gao, Kyle Reing, Aram Galstyan
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The sieve outperforms standard methods on dimensionality reduction tasks, solves a blind source separation problem that cannot be solved with ICA, and accurately recovers structure in brain imaging data. |
| Researcher Affiliation | Academia | Greg Ver Steeg, Shuyang Gao, Kyle Reing, Aram Galstyan (University of Southern California, Information Sciences Institute); gregv@isi.edu, gaos@usc.edu, reing@usc.edu, galstyan@isi.edu |
| Pseudocode | Yes | Alg. 1. Our implementation is available online [Ver Steeg, 2016]. [...] Algorithm 1: Algorithm to learn one layer of the sieve. |
| Open Source Code | Yes | Our implementation is available online [Ver Steeg, 2016]. [...] [Ver Steeg, 2016] Greg Ver Steeg. Linear information sieve code. http://github.com/gregversteeg/LinearSieve, 2016. |
| Open Datasets | Yes | The two datasets we studied were GISETTE and MADELON and consist of 5000 and 500 dimensions respectively. [...] Interestingly, all three techniques peak at five dimensions, which was intended to be the correct number of latent factors embedded in this dataset [Guyon et al., 2004]. (A hedged loading sketch for these datasets appears below the table.) |
| Dataset Splits | No | The paper mentions "Validation accuracy" in Figure 7 and states "We learn a low-dimensional representation on training data and then transform held-out test data and report the classification accuracy on that." However, it does not specify how the training and held-out sets were constructed (e.g., percentages, sample counts, or an explicit splitting methodology). (The train/transform/test protocol is sketched below the table.) |
| Hardware Specification | No | The paper does not provide any details about the hardware (e.g., CPU/GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper states "All methods were run using implementations in the scikit library [Pedregosa et al., 2011]", but it does not specify version numbers for scikit-learn or any other software dependencies. |
| Experiment Setup | Yes | Our fixed point optimization requires us to start with some weights, w_0, and we iteratively update w_t using Eq. 6 until we reach a fixed point. This only guarantees that we find a local optimum, so we typically run the optimization 10 times and take the solution with the highest value of the objective. We initialize w_{0,i} to be drawn from a normal with zero mean and scale 1/√(n σ²_{x_i}). [...] Convergence is determined by checking when changes in the objective of Eq. 5 fall below a certain threshold, 10⁻⁸ in our experiments. [...] For this experiment, we set total capacity to C = 4. By varying k, we are spreading this capacity across a larger number of noisier variables. [...] We set C = 1 for these experiments. [...] We learn 10 layers of the sieve and check how well Y1, . . . , Y10 recover the true sources. We also specify 10 components for the other methods shown for comparison. (A hedged sketch of this optimization loop appears below the table.) |
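
The Open Datasets row names GISETTE and MADELON, both released for the NIPS 2003 feature selection challenge [Guyon et al., 2004]. A minimal loading sketch, assuming the copies published on OpenML under those names; the paper does not say how it obtained the data:

```python
# Minimal sketch: fetch GISETTE (5000 features) and MADELON (500 features).
# Assumption: the OpenML datasets named "gisette" and "madelon" match the
# NIPS 2003 challenge versions used in the paper.
from sklearn.datasets import fetch_openml

gisette = fetch_openml(name="gisette", as_frame=False)
madelon = fetch_openml(name="madelon", as_frame=False)
print(gisette.data.shape, madelon.data.shape)  # expected: (n, 5000) and (n, 500)
```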
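
The Dataset Splits row quotes the evaluation protocol: fit the representation on training data, transform held-out test data, and report classification accuracy there. A minimal sketch of that protocol, with PCA standing in for the sieve and an assumed 80/20 split, since the paper does not state its split sizes:

```python
# Sketch of the train/transform/test protocol; PCA is a stand-in for the sieve,
# and the 80/20 split and classifier choice are assumptions, not the paper's.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

reducer = PCA(n_components=5).fit(X_tr)             # fit on training data only
clf = LogisticRegression(max_iter=1000).fit(reducer.transform(X_tr), y_tr)
print("held-out accuracy:", clf.score(reducer.transform(X_te), y_te))
```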
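
The Experiment Setup row describes the paper's fixed-point optimization: zero-mean normal initialization with scale 1/√(n σ²_{x_i}), iteration of the Eq. 6 update until the Eq. 5 objective changes by less than 10⁻⁸, and 10 random restarts keeping the best local optimum. A sketch of that loop, assuming caller-supplied `update` (Eq. 6) and `objective` (Eq. 5) functions, which are not reproduced in this summary:

```python
import numpy as np

def fit_weights(X, update, objective, n_restarts=10, tol=1e-8,
                max_iter=1000, seed=None):
    """Fixed-point optimization with random restarts.

    `update` and `objective` are placeholders for the paper's Eq. 6 and Eq. 5;
    only the restart/initialization/convergence logic follows the quoted text.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    scale = 1.0 / np.sqrt(n * X.var(axis=0))   # scale 1/sqrt(n * sigma^2_{x_i})
    best_w, best_obj = None, -np.inf
    for _ in range(n_restarts):                # paper: 10 restarts
        w = rng.normal(0.0, scale)             # zero-mean normal initialization
        obj = objective(w, X)
        for _ in range(max_iter):
            w = update(w, X)                   # fixed-point update (Eq. 6)
            new_obj = objective(w, X)
            if abs(new_obj - obj) < tol:       # objective change below 1e-8
                break
            obj = new_obj
        if obj > best_obj:                     # keep the best local optimum
            best_w, best_obj = w, obj
    return best_w, best_obj
```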