Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

UV-Attack: Physical-World Adversarial Attacks on Person Detection via Dynamic-NeRF-based UV Mapping

Authors: Yanjie Li, Kaisheng Liang, Bin Xiao

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that UV-Attack achieves a 92.7% attack success rate against the Fast RCNN model across varied poses in dynamic video settings, significantly outperforming the state-of-the-art Adv Camou attack, which only had a 28.5% ASR. Moreover, we achieve 49.5% ASR on the latest YOLOv8 detector in black-box settings. ... Our experiments demonstrate the superiority of our approach in terms of evasion success rates in free-pose scenarios. As shown in Figure 1, we physically printed the Bohemian-style clothing and successfully fooled person detectors in dynamic video settings, even when the subjects were in continuous and significant movement.
Researcher Affiliation | Academia | Yanjie Li, Kaisheng Liang, Bin Xiao, Department of Computer Science, Hong Kong Polytechnic University
Pseudocode | Yes | Algorithm 1: The UV-Attack
Open Source Code | Yes | The code is available at https://github.com/PolyLiYJ/UV-Attack.
Open Datasets | Yes | We evaluate the success rate of digital attacks on the ZJU-MoCap dataset (Peng et al., 2021). For the physical attack, we collected videos of five individuals and used SPIN and DensePose to extract SMPL parameters and pseudo-supervised IUV maps for training the UV-Volumes. ... an annotated pose dataset, MPII (Andriluka et al., 2014). ... the COCO dataset ... a new indoor dataset (Quattoni & Torralba, 2009)
Dataset Splits | Yes | To evaluate the ASR of different attacks under unseen pose scenarios, we randomly sample 1000 poses from the GMM and combine them with different backgrounds and camera and light conditions to construct an unseen pose dataset. ... We collected 100 different backgrounds from both indoor and outdoor scenarios. ... The camera is sampled from azim ∈ [−180°, 180°] and elev ∈ [0°, 30°]. ... For each pose, we capture 10 images from different view angles.
Hardware Specification | Yes | The model training and attack implementation are conducted on a single Nvidia 3090 GPU.
Software Dependencies | No | The paper mentions software like "pretrained latent diffusion model φ", "pre-trained stable diffusion model (Rombach et al., 2022)", "SPIN model (Kolotouros et al., 2019)", "DensePose", "Fast RCNN (Girshick, 2015)", "YOLOv3 (Redmon & Farhadi, 2018)", "MMDET toolbox (Chen et al., 2019)", and "Adam optimizer". However, specific version numbers for general software dependencies like Python, PyTorch, or CUDA are not provided.
Experiment Setup | Yes | The diffusion process was set to 10 steps. The Particle Swarm Optimization (PSO) ran for 30 epochs with 50 swarms, and the Adam optimizer ran for 300 epochs. We collected 100 different backgrounds from both indoor and outdoor scenarios. For each epoch, we randomly sampled 100 poses, camera and light positions, and backgrounds for training. The camera is sampled from azim ∈ [−180°, 180°] and elev ∈ [0°, 30°]. ... We set the finetuning epoch as 100 in the physical attacks. ... The confidence threshold τ_conf is set as 0.5. ... The IoU threshold is set as 0.1 and the confidence threshold is set as 0.5.
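The evaluation protocol quoted above (1000 sampled poses, 10 view angles each, cameras drawn from azim ∈ [−180°, 180°] and elev ∈ [0°, 30°], a detection treated as a miss below confidence 0.5) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`sample_camera`, `attack_success_rate`) and the placeholder random detector scores are assumptions standing in for the real rendering and detection pipeline.

```python
import random

CONF_THRESHOLD = 0.5   # detection confidence threshold from the paper
VIEWS_PER_POSE = 10    # images captured per sampled pose
NUM_POSES = 1000       # poses sampled from the GMM

def sample_camera(rng):
    """Sample one camera: azimuth in [-180, 180] deg, elevation in [0, 30] deg."""
    return rng.uniform(-180.0, 180.0), rng.uniform(0.0, 30.0)

def attack_success_rate(confidences, threshold=CONF_THRESHOLD):
    """ASR = fraction of images where the detector's person-class confidence
    falls below the threshold, i.e. the adversarial clothing evades detection."""
    misses = sum(1 for c in confidences if c < threshold)
    return misses / len(confidences)

rng = random.Random(0)
cameras = [sample_camera(rng) for _ in range(NUM_POSES * VIEWS_PER_POSE)]
# Placeholder scores: a real run would render each (pose, camera, background,
# light) combination and query the detector (e.g. Fast RCNN via MMDET).
scores = [rng.random() for _ in cameras]
print(f"ASR over {len(scores)} images: {attack_success_rate(scores):.3f}")
```

The single scalar makes the reported comparison (92.7% for UV-Attack vs. 28.5% for Adv Camou on Fast RCNN) directly reproducible once real detector confidences replace the placeholders.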