Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Proximal Mapping Loss: Understanding Loss Functions in Crowd Counting & Localization
Authors: Wei Lin, Jia Wan, Antoni B. Chan
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, PML significantly improves the performance of crowd counting and localization, and illustrates the robustness against annotation noise. The code is available at https://github.com/Elin24/pml. […] 5 EXPERIMENTS: In this section, we present experiments demonstrating the efficacy of our proposed PML in crowd counting and crowd localization. |
| Researcher Affiliation | Academia | Wei Lin1, Jia Wan2 & Antoni B. Chan1 1Department of Computer Science, City University of Hong Kong, Hong Kong SAR, 2School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods and formulas (e.g., L(A, B) = Σ_{j=1}^{m} L(A_j, b_j), where A_j = {(a_i, x_i)}_{i ∈ X_j} and b_j = (1, y_j)), but does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured steps formatted like code. |
| Open Source Code | Yes | Experimentally, PML significantly improves the performance of crowd counting and localization, and illustrates the robustness against annotation noise. The code is available at https://github.com/Elin24/pml. |
| Open Datasets | Yes | PML attains the best performance on Shanghai Tech Part B (Zhang et al., 2016) with an MAE of 5.4 and MSE of 8.2. On the UCF-QNRF dataset (Idrees et al., 2018), the estimation errors (MAE: 73.2, MSE: 127.5) are superior to the latest records set by STEERER (Han et al., 2023) (MAE: 74.3; MSE: 128.3). Additionally, on the larger NWPU benchmark (Wang et al., 2020c), PML demonstrates outstanding performance with an MAE of 63.8 and MSE of 306.9, showcasing its excellent counting performance on large-scale datasets. Note that due to the small size of Sh Tech A, VGG19 usually overfits and does not achieve good performance, while VGG16-bn does not overfit and obtains MAE of 50.6 and MSE of 80.7, comparable to PET. |
| Dataset Splits | No | The paper mentions using several benchmark datasets (e.g., Shanghai Tech Part B, UCF-QNRF, NWPU benchmark, Sh Tech A), and refers to a "NWPU validation set", but does not explicitly provide specific details about how these datasets were split into training, validation, or test sets (e.g., percentages or absolute counts) for their experiments. |
| Hardware Specification | Yes | In practice, the average loss computation time ratio is T_GL / T_PML = 0.045 s / 0.013 s ≈ 3.46 on a 3090Ti GPU, with a density map resolution of 256 × 256. |
| Software Dependencies | No | The paper mentions using VGG19 and HRNet as backbones for the counting model, but does not provide specific version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used in the implementation. |
| Experiment Setup | Yes | In our experiments, nearest neighbor is employed in the divide stage to associate pixels in the density map with their nearest point in the ground truth. During training, the dynamic L2 loss with L1-norm, as described in (14), is utilized to supervise the counting model, showing superior performance. Additionally, a hyperparameter ϵ is introduced to balance these two terms in PML. Therefore, the final loss is formulated by incorporating (14) into (1), adding ϵ as the weight: L(A, B) = aᵀC𝟙 + ϵ‖𝟙ᵀa − 1‖₁, … The counting model utilizes VGG19 (Simonyan & Zisserman, 2015) and HRNet (Wang et al., 2020b) as the backbone. Details and a related ablation study on the structure are given in the appendix. In (24), a balancing hyperparameter ϵ is introduced to weigh the two components, enabling a focus on either localization or counting. Fig. 10(a) shows the impact of ϵ on the counting performance of the trained model. A value of ϵ = 1 yields a good counting model, but surprisingly, ϵ = 2 results in even lower estimation errors. |
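The Experiment Setup row above describes two concrete mechanisms: a divide stage that associates each density-map pixel with its nearest ground-truth point, and a final loss that balances a counting term and a localization term via the hyperparameter ϵ. The following is a minimal illustrative sketch of those two ideas; the function names and data layout are assumptions for illustration, not taken from the authors' released code at https://github.com/Elin24/pml.

```python
# Hedged sketch of the divide stage and the epsilon-weighted loss described
# in the report above. All names here are illustrative assumptions.

def divide_nearest(pixels, gt_points):
    """Divide stage: map each ground-truth point index j to the list of
    pixel indices whose nearest ground-truth point is j (brute force)."""
    assignment = {}
    for i, (px, py) in enumerate(pixels):
        # Nearest ground-truth point by squared Euclidean distance.
        j_best = min(
            range(len(gt_points)),
            key=lambda j: (px - gt_points[j][0]) ** 2 + (py - gt_points[j][1]) ** 2,
        )
        assignment.setdefault(j_best, []).append(i)
    return assignment

def weighted_loss(count_term, loc_term, eps=1.0):
    """Combine counting and localization terms; eps balances the two,
    mirroring the report's note that eps = 1 (or even 2) works well."""
    return count_term + eps * loc_term
```

For example, with pixels `[(0, 0), (5, 5)]` and ground-truth points `[(0, 1), (5, 4)]`, each pixel is assigned to the nearer point, yielding `{0: [0], 1: [1]}`. In practice the divide stage would run on dense density maps (e.g. 256 × 256), where a KD-tree or GPU-side distance computation replaces this brute-force loop.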