Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

VisualLens: Personalization through Task-Agnostic Visual History

Authors: Wang Bill Zhu, Deqing Fu, Kai Sun, Yi Lu, Zhaojiang Lin, Seungwhan Moon, Kanika Narang, Mustafa Canim, Yue Liu, Anuj Kumar, Xin Dong

NeurIPS 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our experimental study shows promising recommendation quality of VisualLens. It achieved 82-91% Hit@10 on the Google Review-V and Yelp-V benchmarks, outperforming state-of-the-art (UniMP [46]) by 10%. Even comparing with GPT-4o, our 8B model improves Hit@3 by 1.6% and 4.6% respectively on the two benchmarks. Further analysis reveals that VisualLens excels in adapting to longer histories and unseen categories, while maintaining robustness with shorter histories.
Researcher Affiliation Collaboration Correspondence: EMAIL Work done at Meta.
Pseudocode Yes We list the complete candidate and ground truth sets generation algorithm in Algorithm 1, where the function nearest(G, loc, m) returns the m nearest businesses around a certain location loc based on the graph G.
Open Source Code No The validation and test sets of Google Review-V and Yelp-V are provided as supplementary, but not the code.
Open Datasets Yes We created two new benchmarks, Google Review-V and Yelp-V, leveraging publicly available data from Google Local Review [24] and Yelp [2].
Dataset Splits Yes Table 2: Dataset statistics of Google Review-V and Yelp-V.
Dataset | Train | Dev | Test | Categories | Avg. # of images | Avg. # of GT | Avg. # of candidates
GR-V | 15.69M | 2K | 200K | 66 | 157.0 | 2.7 | 43.1
Yelp-V | 4.12M | 2K | 100K | 35 | 263.6 | 8.2 | 66.7
Hardware Specification Yes GPU 8 NVIDIA H100
Software Dependencies No The paper mentions specific models (PaliGemma, MiniCPM-V2.5, LLaVA-v1.6-8B, CLIP ViT-L/14@336px, Llama-3.1 70B) but does not provide specific version numbers for underlying software libraries or programming languages like Python, PyTorch, or CUDA, as required by the guidelines.
Experiment Setup Yes Table 9: Hyperparameters for training VisualLens with PaliGemma Backbone.
Hyperparameters for training on PaliGemma
Parameter Size | 3B
Image Resolution | 896 × 896
Number of Image Tokens | 4096
Hidden Dimension Size | 2048
LoRA Rank | 16
LoRA α | 16
LoRA dropout | 0.1
GPU | 8 NVIDIA H100
Batch Size | 8
Gradient Accumulation Steps | 8
Warmup Steps | 200
Learning Rate | 1e-3
Table 10: Hyperparameters for training VisualLens with MiniCPM-V2.5 Backbone.
Hyperparameters for training on MiniCPM-V2.5
Parameter Size | 8B
Image Resolution | 980 × 980
Number of Image Tokens | 96
Hidden Dimension Size | 4096
LoRA Rank | 64
LoRA α | 64
LoRA dropout | 0.1
GPU | 8 NVIDIA H100
Batch Size | 8
Gradient Accumulation Steps | 8
Warmup Steps | 200
Learning Rate | 1e-3
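
For readers who want to reuse the quoted settings, the sketch below records the values from Tables 9 and 10 in a small Python structure and derives two quantities that follow from them. It is an illustration only, not the authors' code, which is not released. The class name TrainConfig, its field names, and the helper methods are hypothetical. The excerpt does not say whether "Batch Size 8" is per GPU or global, so the helper takes that as an explicit argument rather than assuming either. The LoRA scale uses the standard α / rank definition from the LoRA method.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters quoted in Tables 9 and 10."""
    backbone: str
    parameter_size: str
    image_resolution: Tuple[int, int]  # assumed to be a square pixel resolution
    num_image_tokens: int
    hidden_dim: int
    lora_rank: int
    lora_alpha: int
    lora_dropout: float
    num_gpus: int
    batch_size: int
    grad_accum_steps: int
    warmup_steps: int
    learning_rate: float

    def samples_per_optimizer_step(self, batch_is_per_gpu: bool) -> int:
        """Samples consumed per optimizer update under the stated interpretation."""
        per_pass = self.batch_size * (self.num_gpus if batch_is_per_gpu else 1)
        return per_pass * self.grad_accum_steps

    def lora_scale(self) -> float:
        """Standard LoRA scaling factor, alpha divided by rank."""
        return self.lora_alpha / self.lora_rank


# Values copied from Table 9 (PaliGemma backbone).
PALIGEMMA = TrainConfig(
    backbone="PaliGemma", parameter_size="3B", image_resolution=(896, 896),
    num_image_tokens=4096, hidden_dim=2048, lora_rank=16, lora_alpha=16,
    lora_dropout=0.1, num_gpus=8, batch_size=8, grad_accum_steps=8,
    warmup_steps=200, learning_rate=1e-3,
)

# Values copied from Table 10 (MiniCPM-V2.5 backbone).
MINICPM = TrainConfig(
    backbone="MiniCPM-V2.5", parameter_size="8B", image_resolution=(980, 980),
    num_image_tokens=96, hidden_dim=4096, lora_rank=64, lora_alpha=64,
    lora_dropout=0.1, num_gpus=8, batch_size=8, grad_accum_steps=8,
    warmup_steps=200, learning_rate=1e-3,
)

if __name__ == "__main__":
    for cfg in (PALIGEMMA, MINICPM):
        print(cfg.backbone,
              cfg.samples_per_optimizer_step(batch_is_per_gpu=False),
              cfg.samples_per_optimizer_step(batch_is_per_gpu=True),
              cfg.lora_scale())
```

Under either reading of the batch size, both backbones share the same optimizer schedule (warmup, learning rate, accumulation) and a LoRA scale of 1.0; they differ in image resolution, image token count, hidden size, and LoRA rank.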