Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
UV-Attack: Physical-World Adversarial Attacks on Person Detection via Dynamic-NeRF-based UV Mapping
Authors: Yanjie Li, Kaisheng Liang, Bin Xiao
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that UV-Attack achieves a 92.7% attack success rate against the FastRCNN model across varied poses in dynamic video settings, significantly outperforming the state-of-the-art AdvCamou attack, which only had a 28.5% ASR. Moreover, we achieve 49.5% ASR on the latest YOLOv8 detector in black-box settings. ... Our experiments demonstrate the superiority of our approach in terms of evasion success rates in free-pose scenarios. As shown in Figure 1, we physically printed the Bohemian-style clothing and successfully fooled person detectors in dynamic video settings, even when the subjects were in continuous and significant movement. |
| Researcher Affiliation | Academia | Yanjie Li, Kaisheng Liang, Bin Xiao, Department of Computer Science, Hong Kong Polytechnic University |
| Pseudocode | Yes | Algorithm 1 The UV-Attack |
| Open Source Code | Yes | The code is available at https://github.com/PolyLiYJ/UV-Attack. |
| Open Datasets | Yes | We evaluate the success rate of digital attacks on the ZJU-Mocap dataset (Peng et al., 2021). For the physical attack, we collected videos of five individuals and used SPIN and DensePose to extract SMPL parameters and pseudo-supervised IUV maps for training the UV-Volumes. ... an annotated pose dataset MPII (Andriluka et al., 2014). ... the COCO dataset ... a new indoor dataset (Quattoni & Torralba, 2009) |
| Dataset Splits | Yes | To evaluate the ASR of different attacks under unseen pose scenarios, we randomly sample 1000 poses from the GMM and combine them with different backgrounds and camera and light conditions to construct an unseen pose dataset. ... We collected 100 different backgrounds from both indoor and outdoor scenarios. ... The camera is sampled from azim ∈ [-180°, 180°] and elev ∈ [0°, 30°]. ... For each pose, we capture 10 images from different view angles. |
| Hardware Specification | Yes | The model training and attack implementation are conducted on a single Nvidia 3090 GPU. |
| Software Dependencies | No | The paper mentions software like "pretrained latent diffusion model", "pre-trained stable diffusion model (Rombach et al., 2022)", "SPIN model (Kolotouros et al., 2019)", "DensePose", "FastRCNN (Girshick, 2015)", "YOLOv3 (Redmon & Farhadi, 2018)", "MMDET toolbox (Chen et al., 2019)", and "Adam optimizer". However, specific version numbers for general software dependencies like Python, PyTorch, or CUDA are not provided. |
| Experiment Setup | Yes | The diffusion process was set to 10 steps. The Particle Swarm Optimization (PSO) ran for 30 epochs with 50 swarms, and the Adam optimizer ran for 300 epochs. We collected 100 different backgrounds from both indoor and outdoor scenarios. For each epoch, we randomly sampled 100 poses, camera and light positions, and backgrounds for training. The camera is sampled from azim ∈ [-180°, 180°] and elev ∈ [0°, 30°]. ... We set the finetuning epoch as 100 in the physical attacks. ... The confidence threshold is set as 0.5. ... The IoU threshold is set as 0.1 and the confidence threshold is set as 0.5. |
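The per-epoch sampling described in the Experiment Setup row (random pose, background, camera azimuth/elevation) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name and returned fields are hypothetical, and only the sampling ranges quoted above are taken from the paper.

```python
import random

def sample_training_condition(num_poses=100, num_backgrounds=100):
    """Draw one randomized training condition per the paper's setup:
    a pose, a background, and a camera pose with azim in [-180, 180]
    degrees and elev in [0, 30] degrees. Names here are illustrative."""
    pose_id = random.randrange(num_poses)             # pose index (paper samples poses from a GMM)
    background_id = random.randrange(num_backgrounds) # one of 100 collected backgrounds
    azim = random.uniform(-180.0, 180.0)              # camera azimuth (degrees)
    elev = random.uniform(0.0, 30.0)                  # camera elevation (degrees)
    return {"pose": pose_id, "background": background_id,
            "azim": azim, "elev": elev}

condition = sample_training_condition()
```

In the actual pipeline, each sampled condition would be fed to the renderer before computing the attack loss; the sketch only shows the randomization step.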