Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
STAIR: Manipulating Collaborative and Multimodal Information for E-Commerce Recommendation
Authors: Cong Xu, Yunhang He, Jun Wang, Wei Zhang
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments validate the superiority of STAIR in recommendation accuracy and efficiency. Note that although STAIR achieves state-of-the-art performance in e-commerce multimodal recommendation, it may not fully mine the raw multimodal features in content-driven scenarios such as news and video recommendation (Wu et al. 2020). |
| Researcher Affiliation | Academia | East China Normal University, Shanghai, China |
| Pseudocode | Yes | Algorithm 1: STAIR training procedures. |
| Open Source Code | Yes | Code: https://github.com/yhhe2004/STAIR |
| Open Datasets | Yes | We consider in this paper three commonly used e-commerce datasets obtained from Amazon reviews, including Baby, Sports, and Electronics. As suggested by (Zhang et al. 2021; Zhou and Shen 2023), we filter out users and items with fewer than 5 interactions, and Table 1 presents the dataset statistics after preprocessing. Each dataset contains item thumbnails and text descriptions (e.g., title, brand). Following (Zhou et al. 2023), the 4,096-dimensional visual features published in (Ni, Li, and McAuley 2019) and the 384-dimensional sentence embeddings published in (Zhou 2023) are used for experiments. |
| Dataset Splits | No | The paper mentions evaluating on a 'test set' and using 'validation NDCG@20 metric' for hyperparameter tuning, implying the existence of these splits. However, it does not provide explicit percentages, sample counts, or a detailed methodology for how the training, validation, and test sets were created or partitioned. |
| Hardware Specification | Yes | When dealing with larger datasets such as Electronics, the computational and memory requirements make MMSSL impossible to implement in real-world recommendations. [...] indicates that the method cannot be performed with an RTX 3090 GPU. |
| Software Dependencies | No | AdamW is employed as the optimizer for training STAIR, whose learning rate is searched from {1e-4, 5e-4, 1e-3, 5e-3} and the weight decay in the range of [0, 1]. While AdamW is mentioned, no specific version numbers for any software, libraries, or programming languages are provided. |
| Experiment Setup | Yes | For fairness (Zhang et al. 2021; Zhou and Shen 2023), we fix the embedding dimension to 64 and the number of convolutional layers to 3 for both FSC and BSC processes. As suggested in (Xu et al. 2024), AdamW is employed as the optimizer for training STAIR, whose learning rate is searched from {1e-4, 5e-4, 1e-3, 5e-3} and the weight decay in the range of [0, 1]. The exponent γ that controls the changing rate of the layer weights can be chosen from {0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5}. In addition, we adjust the number of neighbors km from {1, 3, 5, 10, 20} for each modality m ∈ M separately. For all methods, we report the results on the best checkpoint identified by the validation NDCG@20 metric over 500 epochs. |
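The preprocessing quoted under Open Datasets (filtering out users and items with fewer than 5 interactions) is the standard iterative 5-core filter used across the cited works. The sketch below is a minimal, generic illustration of that step, not the authors' code; the function name and input format are assumptions.

```python
from collections import Counter

def k_core_filter(interactions, k=5):
    """Iteratively drop users and items with fewer than k interactions.

    interactions: list of (user_id, item_id) pairs.
    The filter repeats until every remaining user and item has at least
    k interactions, because removing entries on one side can push
    counts on the other side below the threshold again.
    """
    pairs = list(interactions)
    while True:
        user_counts = Counter(u for u, _ in pairs)
        item_counts = Counter(i for _, i in pairs)
        kept = [(u, i) for u, i in pairs
                if user_counts[u] >= k and item_counts[i] >= k]
        if len(kept) == len(pairs):  # fixed point reached
            return kept
        pairs = kept
```

For example, on a dense block of 5 users × 5 items plus one user with only 2 interactions, the extra user is removed on the first pass and the dense block survives unchanged.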