Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Embodied Crowd Counting

Authors: Runling Long, Yunlong Wang, Jia Wan, Xiang Deng, Xinting Zhu, Weili Guan, Antoni B. Chan, Liqiang Nie

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results show that the proposed method achieves the best trade-off between counting accuracy and navigation cost. ... 4 Experiments Baselines. We compare ZECC with exploration methods: Frontier-based exploration (FBE) [45], ZSON methods: CoW [9] and OpenFMNav [24], and multi-view counting (MVC) methods [56, 34]. Metrics. Mean Absolute Percentage Error (MAPE) is used to evaluate the counting performance: $\mathrm{MAPE} = \frac{1}{M} \sum_{i=1}^{M} \frac{|y_i - \hat{y}_i|}{y_i} \times 100\%$, where $M$ is the quantity of testing environments, and $\hat{y}_i$ and $y_i$ are the estimated count and the ground truth count, respectively. For ZSON methods, the sum of Euclidean distance between adjacent navigation points along the agent traveling path is used to evaluate the travel distance (TD), which is defined as: $\mathrm{TD} = \sum_{i=1}^{n-1} \lVert x_{i+1} - x_i \rVert$, where $n$ is the quantity of navigation points in one episode, and $x_i$ and $x_{i+1}$ are the coordinates of the two adjacent navigation points, respectively. For the multi-view crowd counting method, we report the number of cameras required to achieve comparable performance to the proposed approach.
Researcher Affiliation	Academia	1Harbin Institute of Technology, Shenzhen 2City University of Hong Kong EMAIL EMAIL EMAIL {xt.zhu}@my.cityu.edu.hk {abchan}@cityu.edu.hk
Pseudocode	Yes	Algorithm 1 Pseudo-Code of ATE Require: Global distribution D, HAE map MH, LAE map ML, Prompt I Ensure: ... Algorithm 2 Pseudo-Code of NLBN Require: Cluster size ϵ, Navigation vector degree ζ, Navigation point range η, Density map threshold κ, Global distribution D Ensure:
Open Source Code	Yes	Code can be found at https://github.com/longrunling/ECC?.
Open Datasets	Yes	Given the absence of an existing dataset, we have created a new dataset called the Embodied Crowd Counting Dataset (ECCD) specifically for this task.
Dataset Splits	No	The paper mentions 'M is the quantity of testing environments', implying a test set, but it does not specify exact percentages, sample counts, or a detailed methodology for splitting the dataset into training, validation, and test sets. It only states that ECCD contains '60 distinct environments'.
Hardware Specification	Yes	All methods are proceeded on a platform with Intel Core i9-14900KF, 128 GB RAM, and NVIDIA GeForce RTX 4090 GPU.
Software Dependencies	No	The paper mentions 'MLLM used in ATE is GPT-4V [53]' and 'The method used to estimate the normal line of the crowd cluster plane is the Open3D package.' While GPT-4V is a specific model, it is not a software library or solver in the traditional sense, and Open3D is mentioned without a specific version number. No other key software components with specific version numbers (e.g., Python, PyTorch, CUDA) are provided.
Experiment Setup	Yes	Altitude is 80 m for HAE and 10 m for LAE. The crowd density estimator in ATE is Generalized Loss (GL) [47], and the detection model in FDC is Grounding DINO (GD) [39]. For hyperparameters, navigation vector degree ζ is 15°, density threshold κ is 0.7, navigation point range η is 8 m, and cluster size ϵ is 40.
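
The two metrics quoted in the Research Type row, MAPE and travel distance (TD), can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `mape` and `travel_distance` are hypothetical, and the MAPE denominator (the ground truth count) follows the standard definition, which is an assumption because the extracted formula is garbled.

```python
import math
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Percentage Error over M test environments, in percent.

    Assumes the standard definition: each absolute error is divided by
    the ground truth count, so every ground truth count must be non-zero.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(y - p) / y for y, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def travel_distance(points: Sequence[Sequence[float]]) -> float:
    """Sum of Euclidean distances between adjacent navigation points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Illustrative values only; these are not results from the paper.
counts_true = [100, 200]
counts_pred = [90, 220]
path = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

print(f"MAPE: {mape(counts_true, counts_pred):.1f}%")
print(f"TD: {travel_distance(path):.1f}")
```

A path with fewer than two navigation points yields a travel distance of 0, since there are no adjacent pairs to sum over.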