FreqMark: Invisible Image Watermarking via Frequency Based Optimization in Latent Space
Authors: Yiyang Guo, Ruizhe Li, Mude Hui, Hanzhong Guo, Chen Zhang, Chuangjian Cai, Le Wan, Shangfei Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that FreqMark offers significant advantages in image quality and robustness, permits flexible selection of the encoding bit number, and achieves a bit accuracy exceeding 90% when encoding a 48-bit hidden message under various attack scenarios. |
| Researcher Affiliation | Collaboration | Yiyang Guo (1,5), Ruizhe Li (2), Mude Hui (3), Hanzhong Guo (4), Chen Zhang (1), Chuangjian Cai (5), Le Wan (5), Shangfei Wang (1); 1: University of Science and Technology of China, 2: Fudan University, 3: University of California, Santa Cruz, 4: The University of Hong Kong, 5: IEG, Tencent |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | We will declutter and release the code in the future. We have provided sufficient details for the replication of the paper in Appendix A.1. |
| Open Datasets | Yes | Datasets: A test dataset is compiled, consisting of 500 images randomly selected from the ImageNet validation set [16], together with 500 images generated using Stable Diffusion [42] from prompts in the DiffusionDB [49] dataset. (A dataset-assembly sketch appears after the table.) |
| Dataset Splits | No | A test dataset is compiled, consisting of 500 images randomly selected from the ImageNet validation set [16], together with 500 images generated using Stable Diffusion [42] from prompts in the DiffusionDB [49] dataset. |
| Hardware Specification | Yes | Compute Resources: All experiments can be conducted on a single A100 GPU with 40 GB of memory. |
| Software Dependencies | No | The paper mentions specific models like 'Stable Diffusion 2-1' and 'DINO v2' and 'Adam optimizer' but does not provide specific version numbers for general software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Hyperparameters: The KL auto-encoder from Stable Diffusion 2-1 [42] is utilized. Due to the significant reconstruction loss associated with low-resolution images, images are upscaled to 512×512 for processing. In the watermark addition stage, the Adam optimizer is used with a learning rate of 2.0, optimizing for 400 steps. The PSNR loss weight λp is set to 0.05 and the LPIPS loss weight λi to 0.25. To encode the watermark, the first 128 dimensions of the output feature generated by the DINOv2-small image encoder [37] are utilized. The directional vectors are a set of 48 vectors, where the i-th vector has a value of 1 in its i-th dimension and 0 in the remaining dimensions. During the training phase, two types of spatial transformations and pixel noise are selected with equal probability. For rotation augmentation, the rotation angle is randomly chosen in 90-degree increments. The crop augmentation uses a crop scale range of [0.2, 1.0] and a crop ratio range of [3/4, 4/3]. (A hedged reconstruction of this optimization loop, and of the matching decoder, appears after the table.) |
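
The Experiment Setup row pins down enough hyperparameters to sketch the watermark-addition loop, even though the code is unreleased (see the Open Source Code row). Below is a minimal PyTorch reconstruction assuming the Stable Diffusion 2-1 VAE from `diffusers` and the `dinov2_vits14` hub model. The helper names (`dino_features`, `random_augment`, `embed_watermark`), the hinge-style message loss, the pixel-noise strength, the image-range conventions, and the exact frequency-domain parameterization are my assumptions, not the authors' method.

```python
import torch
import torch.nn.functional as F
import lpips  # pip install lpips; assumed implementation of the LPIPS term
from diffusers import AutoencoderKL

def dino_features(dino, x):
    # Assumed preprocessing: DINOv2 uses 14x14 patches, so resize to 224x224
    # (divisible by 14) and normalize from [-1, 1] to ImageNet statistics.
    x = F.interpolate((x + 1) / 2, size=(224, 224), mode="bilinear", align_corners=False)
    mean = torch.tensor([0.485, 0.456, 0.406], device=x.device).view(1, 3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225], device=x.device).view(1, 3, 1, 1)
    return dino((x - mean) / std)  # CLS feature, 384-d for ViT-S

def random_augment(x, noise_sigma=0.05):
    # One of {90-degree rotation, random crop, pixel noise}, equal probability,
    # per the Experiment Setup row; noise_sigma is an assumption.
    choice = torch.randint(0, 3, (1,)).item()
    if choice == 0:  # rotation angle chosen in 90-degree increments
        return torch.rot90(x, k=torch.randint(1, 4, (1,)).item(), dims=(-2, -1))
    if choice == 1:  # crop scale [0.2, 1.0], crop ratio [3/4, 4/3]
        h, w = x.shape[-2:]
        scale = 0.2 + 0.8 * torch.rand(1).item()
        ratio = 3 / 4 + (4 / 3 - 3 / 4) * torch.rand(1).item()
        ch = min(h, int((scale * h * w / ratio) ** 0.5))
        cw = min(w, int(ch * ratio))
        top = torch.randint(0, h - ch + 1, (1,)).item()
        left = torch.randint(0, w - cw + 1, (1,)).item()
        crop = x[..., top:top + ch, left:left + cw]
        return F.interpolate(crop, size=(h, w), mode="bilinear", align_corners=False)
    return x + noise_sigma * torch.randn_like(x)

def embed_watermark(image, bits, vae, dino, steps=400, lr=2.0,
                    lambda_p=0.05, lambda_i=0.25, margin=1.0):
    # image: (1, 3, 512, 512) in [-1, 1]; bits: (1, 48) in {0, 1}.
    lpips_fn = lpips.LPIPS(net="alex").to(image.device)
    signs = 2.0 * bits - 1.0  # map {0, 1} -> {-1, +1}
    latent = vae.encode(image).latent_dist.mode().detach()
    freq = torch.fft.fft2(latent)  # perturb the latent's frequency components
    delta = torch.zeros_like(freq, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)  # lr = 2.0, 400 steps per the paper

    for _ in range(steps):
        wm = vae.decode(torch.fft.ifft2(freq + delta).real).sample
        feat = dino_features(dino, random_augment(wm))[:, :128]
        proj = feat[:, :48]  # one-hot directional vectors pick single coordinates
        msg_loss = F.relu(margin - signs * proj).mean()  # hinge form is an assumption
        mse = F.mse_loss(wm, image)
        psnr = 10.0 * torch.log10(4.0 / (mse + 1e-12))  # peak-to-peak 2 for [-1, 1]
        loss = msg_loss - lambda_p * psnr + lambda_i * lpips_fn(wm, image).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return vae.decode(torch.fft.ifft2(freq + delta).real).sample.detach()

# Usage (model IDs are assumptions consistent with the paper's description):
# vae = AutoencoderKL.from_pretrained("stabilityai/stable-diffusion-2-1", subfolder="vae").cuda()
# dino = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").cuda().eval()
```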
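For the 90%-plus bit accuracy quoted in the Research Type row: with one-hot directional vectors, decoding reduces to reading the sign of each of the first 48 DINOv2 feature coordinates. A minimal sketch, reusing `dino_features` from the block above; the hard-threshold decoder is my assumption.

```python
def decode_bits(dino, wm_image, n_bits=48):
    # The i-th bit is the sign of the i-th coordinate within the first 128
    # DINOv2 feature dimensions, since the directional vectors are one-hot.
    feat = dino_features(dino, wm_image)[:, :128]
    return (feat[:, :n_bits] > 0).float()

def bit_accuracy(pred_bits, true_bits):
    # Fraction of matching bits; the paper reports > 90% for 48-bit
    # messages under various attack scenarios.
    return (pred_bits == true_bits).float().mean().item()
```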
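The Open Datasets row describes the test set precisely enough to sketch its assembly: 500 random ImageNet validation images plus 500 Stable Diffusion 2-1 generations from DiffusionDB prompts. The local path, the Hugging Face dataset config name, and the seed below are assumptions; the paper specifies none of them.

```python
import random
from pathlib import Path

import torch
from datasets import load_dataset  # Hugging Face `datasets`
from diffusers import StableDiffusionPipeline

random.seed(0)  # seed is an assumption; the paper does not give one

# 1) 500 images sampled from a local ImageNet validation copy (assumed layout).
val_paths = sorted(Path("/data/imagenet/val").rglob("*.JPEG"))
real_subset = random.sample(val_paths, 500)

# 2) 500 images generated with Stable Diffusion 2-1 from DiffusionDB prompts.
# The "2m_text_only" config is an assumed way to fetch prompts only; older
# `datasets` releases may be needed for this script-based dataset.
prompts = load_dataset("poloclub/diffusiondb", "2m_text_only", split="train")
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16).to("cuda")
out_dir = Path("generated")
out_dir.mkdir(exist_ok=True)
for i, idx in enumerate(random.sample(range(len(prompts)), 500)):
    pipe(prompts[idx]["prompt"]).images[0].save(out_dir / f"{i:03d}.png")
```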