Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Synthesizing Aspect-Driven Recommendation Explanations from Reviews

Authors: Trung-Hoang Le, Hady W. Lauw

IJCAI 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on datasets of several product categories showcase the efficacies of our method as compared to baselines based on templates, review summarization, selection, and text generation.
Researcher Affiliation	Academia	Trung-Hoang Le and Hady W. Lauw Singapore Management University, Singapore EMAIL
Pseudocode	Yes	Algorithm 1 SEER-Greedy; Algorithm 2 Opinion Substitution
Open Source Code	No	The paper links to third-party code implementations for EFM and MTER, but does not provide a link or explicit statement for the availability of their own SEER framework's source code.
Open Datasets	Yes	Experiments use four public datasets of Amazon reviews¹ [McAuley et al., 2015] of varying categories: Computer and Accessories (Computer), Camera and Photo (Camera), Toys and Games (Toy), Cell Phones and Accessories (Cellphone). Preprocessing follows [Wang et al., 2018a]. For each category, we retrieve the most common aspects covering 90% of opinion phrases and filter out users and items with fewer than five reviews. The remaining are split into training, validation, and test at a ratio of 0.6 : 0.2 : 0.2 for every user chronologically. Sentences in validation and test with opinions or aspects that had not appeared in training were excluded. Table 2 shows some basic statistics of the datasets. ¹ http://jmcauley.ucsd.edu/data/amazon/
Dataset Splits	Yes	The remaining are split into training, validation, and test at a ratio of 0.6 : 0.2 : 0.2 for every user chronologically.
Hardware Specification	Yes	Experiments were run on machine with Intel Xeon E5-2650v4 2.20 GHz CPU and 256GB RAM.
Software Dependencies	No	The paper mentions the use of "CPLEX⁴ solver" but does not provide a specific version number. It does not list other key software components with version numbers required for replication.
Experiment Setup	Yes	For EFM², as in the original work, the latent factor and explicit factor dimensions are 60 and 40. For MTER, we adopt the default setting of the author's implementation³. ... For ASC2V, we train with similar setting as C2V, using RMSprop for optimization.
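
The Dataset Splits row quotes a per-user chronological split at a ratio of 0.6 : 0.2 : 0.2. A minimal sketch of what such a split might look like is given below. It is not the authors' code: the record fields (`user`, `time`), the function name `chronological_split`, and the rounding convention (floor, via integer arithmetic) are all assumptions made for illustration, since the quoted text does not specify them.

```python
from collections import defaultdict


def chronological_split(reviews, parts=(6, 2, 2)):
    """Split each user's reviews into train/validation/test in time order.

    `parts` gives the ratio as integers (6:2:2 corresponds to 0.6:0.2:0.2).
    Boundaries are floored; the paper's rounding rule is not stated, so
    this is an assumption.
    """
    total = sum(parts)

    # Group reviews by user (hypothetical field name "user").
    by_user = defaultdict(list)
    for review in reviews:
        by_user[review["user"]].append(review)

    train, valid, test = [], [], []
    for user_reviews in by_user.values():
        # Oldest first (hypothetical field name "time").
        ordered = sorted(user_reviews, key=lambda r: r["time"])
        n = len(ordered)
        train_end = n * parts[0] // total
        valid_end = n * (parts[0] + parts[1]) // total
        train.extend(ordered[:train_end])
        valid.extend(ordered[train_end:valid_end])
        test.extend(ordered[valid_end:])
    return train, valid, test


# Toy usage: one user with five reviews given out of time order.
toy_reviews = [{"user": "u1", "time": t} for t in (5, 1, 4, 2, 3)]
toy_train, toy_valid, toy_test = chronological_split(toy_reviews)
```

With the floor convention, a user with exactly five reviews (the minimum kept after filtering, per the quoted text) contributes three reviews to training and one each to validation and test. The further step quoted in the Open Datasets row, excluding validation and test sentences whose opinions or aspects never appear in training, is not covered by this sketch.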