Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection

Authors: Jiaying Lin, Yuen-Hei Yeung, Shuquan Ye, Rynson W. H. Lau

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type: Experimental
LLM Response: "Experimental results show that our proposed model outperforms state-of-the-art methods. Extensive experiments are conducted to evaluate our method, in comparison with the state-of-the-art methods from relevant tasks. We evaluate our method in two sets of experiments: one using our RGB-D dataset and the other using existing RGB glass datasets such as GDD and GSD. We use four metrics for evaluation, including intersection-over-union (IoU), F-measure (Fω), mean absolute error (MAE), and balance error rate (BER)."
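The four metrics quoted above have standard definitions for binary segmentation. The sketch below is plain NumPy, not the authors' evaluation code, and omits the weighted F-measure Fω (which needs a more involved pixel-weighting scheme); it shows how IoU, MAE, and BER are typically computed:

```python
import numpy as np

def glass_metrics(pred, gt, threshold=0.5):
    """Compute IoU, MAE, and BER for one prediction/ground-truth pair.

    pred: float array of scores in [0, 1]; gt: binary array {0, 1}.
    These are the standard definitions, not the paper's exact code.
    Assumes the image contains both glass and non-glass pixels.
    """
    p = (pred >= threshold).astype(np.float64)  # binarized prediction
    g = gt.astype(np.float64)

    inter = (p * g).sum()
    union = p.sum() + g.sum() - inter
    iou = inter / union if union > 0 else 1.0

    # MAE is computed on the raw (non-thresholded) prediction.
    mae = np.abs(pred - g).mean()

    # BER averages the error rates on positive and negative pixels (in %).
    tp = inter
    tn = ((1 - p) * (1 - g)).sum()
    ber = 100.0 * (1.0 - 0.5 * (tp / g.sum() + tn / (1 - g).sum()))
    return iou, mae, ber
```

Lower MAE and BER are better; higher IoU is better.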
Researcher Affiliation: Academia
LLM Response: "Jiaying Lin*, Yuen-Hei Yeung*, Shuquan Ye, Rynson W.H. Lau. City University of Hong Kong"
Pseudocode: No
LLM Response: The paper describes the proposed framework and modules in text and diagrams (Figures 4, 5, and 6), but contains no explicitly labeled pseudocode blocks or algorithms.
Open Source Code: Yes
LLM Response: "Code, Dataset and Extended version: https://jiaying.link/AAAI25-RGBDGlass/"
Open Datasets: Yes
LLM Response: "In this work, we first propose a large-scale RGB-D glass surface detection dataset, RGB-D GSD, for rigorous experiments and future research. It contains 3,009 images, paired with precise annotations, offering a wide range of real-world RGB-D glass surface categories. We then propose a novel glass surface detection framework combining RGB and depth information, with two novel modules: a cross-modal context mining (CCM) module to adaptively learn individual and mutual context features from RGB and depth information, and a depth-missing aware attention (DAA) module to explicitly exploit spatial locations where missing depths occur to help detect the presence of glass surfaces. Experimental results show that our proposed model outperforms state-of-the-art methods. Code, Dataset and Extended version: https://jiaying.link/AAAI25-RGBDGlass/. It is an attentively curated ensemble of three existing datasets originally developed for scene understanding, including SUN RGB-D (Song, Lichtenberg, and Xiao 2015), 2D-3D-Semantics (Armeni et al. 2017) and Matterport3D (Chang et al. 2017)."
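The quoted abstract notes that the DAA module exploits spatial locations where depth readings are missing, since glass commonly defeats depth sensors. A minimal NumPy illustration of that cue, assuming missing depth is encoded as 0 and using a simple multiplicative gate (the function names and the gating form are illustrative assumptions, not the paper's DAA architecture):

```python
import numpy as np

def depth_missing_map(depth):
    """Binary map of pixels with no valid depth reading (depth == 0),
    the cue the paper's DAA module exploits; glass surfaces often
    return no depth. The encoding of 'missing' as 0 is an assumption."""
    return (depth == 0).astype(np.float32)

def apply_missing_attention(features, depth, alpha=1.0):
    """Reweight a feature map to emphasize missing-depth regions.
    This multiplicative gate is a sketch, not the actual DAA module."""
    attn = 1.0 + alpha * depth_missing_map(depth)
    return features * attn
```

In the paper's framework the missing-depth signal feeds a learned attention module; here it simply doubles feature responses at missing-depth pixels when `alpha=1.0`.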
Dataset Splits: Yes
LLM Response: "This dataset contains a total of 3,009 images, of which 2,400 are for training and 609 for testing."
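The reported sizes (2,400 train / 609 test of 3,009) can be reproduced mechanically. Since the quoted text does not say how the official split was chosen, the seeded random split below is only an illustrative sketch; the function name and seed are assumptions:

```python
import random

def split_rgbd_gsd(image_ids, n_train=2400, seed=0):
    """Illustrative train/test split matching the reported sizes.
    The paper's official split may be fixed, not random."""
    ids = list(image_ids)
    rng = random.Random(seed)  # seeded for reproducibility
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]

train_ids, test_ids = split_rgbd_gsd(range(3009))
# Yields 2,400 training ids and 609 test ids, with no overlap.
```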
Hardware Specification: Yes
LLM Response: "Our model takes about 14 hours to converge, and 0.10 s per image for inference on a single RTX 2080 Ti."
Software Dependencies: No
LLM Response: The paper mentions using "ResNeXt-101" and the "Adam optimizer" but does not specify version numbers for these or for other software libraries/frameworks (e.g., Python, PyTorch, CUDA).
Experiment Setup: Yes
LLM Response: "We use the Adam optimizer (Kingma and Ba 2015) with an initial learning rate of 1e-4, 130 training epochs, and a batch size of 14."
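For reference, the quoted hyperparameters plug into the standard Adam update of Kingma and Ba (2015). The pure-Python sketch below shows a single scalar step with the paper's initial learning rate; the authors presumably use a framework optimizer rather than hand-rolled code:

```python
import math

def adam_step(theta, grad, state, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on a scalar parameter, using the paper's initial
    learning rate of 1e-4 (beta/eps values are Adam's common defaults,
    not stated in the quote)."""
    state["t"] += 1
    # Exponential moving averages of the gradient and its square.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    # Bias-corrected estimates.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (math.sqrt(v_hat) + eps)

state = {"t": 0, "m": 0.0, "v": 0.0}
theta = adam_step(1.0, 0.5, state)
# The first step moves theta by roughly lr * sign(grad), i.e. about 1e-4.
```

Because the first bias-corrected step normalizes the gradient by its own magnitude, the initial step size is approximately the learning rate itself, which is why 1e-4 directly bounds early parameter movement.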