Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

House of Cans: Covert Transmission of Internal Datasets via Capacity-Aware Neuron Steganography

Authors: Xudong Pan, Shengyao Zhang, Mi Zhang, Yifan Yan, Min Yang

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluation shows Cans is the first working scheme which can covertly transmit over 10,000 real-world data samples within a carrier model which has 220× fewer parameters than the total size of the stolen data, and simultaneously transmit multiple heterogeneous datasets within a single carrier model, under a trivial distortion rate (< 10⁻⁵) and with almost no utility loss on the carrier model (< 1%).
Researcher Affiliation | Academia | Xudong Pan (Fudan University), Shengyao Zhang (Fudan University), Mi Zhang (Fudan University), Yifan Yan (Fudan University), Min Yang (Fudan University)
Pseudocode | Yes | We then invoke the primitive Fill(P, f_k, v_k) in Algorithm A.1 in the supplementary material to replace the original parameters in f_k by parameters in P.
Open Source Code | Yes | To facilitate future research, we open-source our code in https://anonymous.4open.science/r/data-hiding-66D0/.
Open Datasets | Yes | CIFAR-10 [24]: This dataset contains 60,000 images of daily objects (e.g., cat, truck and ship). FaceScrub [29]: This dataset contains 107,818 face images of 530 male and female celebrities retrieved from the Internet. Speech Command (i.e., Speech) [44]: This dataset contains 35 different voice commands spoken by multiple subjects, and is composed of over 100,000 audio files of 1 second length with a sampling frequency of 16 kHz.
Dataset Splits | No | The paper mentions training and evaluating on datasets (e.g., CIFAR-10, FaceScrub, Speech Command) but does not explicitly state the specific percentages or methods used for validation dataset splits in the main text.
Hardware Specification | No | The paper does not explicitly specify the hardware components (e.g., GPU models, CPU types, memory) used for running the experiments in the provided text.
Software Dependencies | No | The paper mentions using the Adam optimizer [21] but does not provide specific version numbers for any software dependencies or libraries used in the experimental setup in the provided text.
Experiment Setup | Yes | In each secret task, we set the dimension of the pseudorandom noise vectors as 100 and the secret model as an off-the-shelf generator-like architecture which is detailed in the supplementary materials. We consider a standard ResNet-18 [19] as the carrier model, and the training on the CIFAR-10 [24] dataset as the open task. Subsequently, we invoke the Update primitive and resume the joint training to the next iteration.
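The quoted excerpts above outline the scheme's core moving parts: a generator-like secret model driven by 100-dimensional pseudorandom noise is embedded into the carrier model's parameters via the Fill primitive, and the receiver later extracts it to regenerate the hidden samples. A toy numpy sketch of that pipeline (the linear "generator", the parameter shapes, and the shared-seed convention are illustrative assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Sender side ---
carrier = rng.normal(size=1000)        # flattened carrier-model parameters (toy)
secret_W = rng.normal(size=(100, 8))   # tiny linear "generator" standing in for
                                       # the paper's generator-like secret model
# Positions hosting the secret parameters; in practice a shared secret
# (here derived from the same seeded stream for brevity).
positions = rng.choice(carrier.size, size=secret_W.size, replace=False)

stego = carrier.copy()
stego[positions] = secret_W.ravel()    # the Fill(P, f_k, v_k) idea: overwrite
                                       # selected carrier parameters with P

# --- Receiver side: knows the positions and the noise seed ---
extracted = stego[positions].reshape(100, 8)
noise = np.random.default_rng(7).normal(size=(5, 100))  # 100-dim pseudorandom
samples = noise @ extracted            # each noise vector regenerates one sample
```

In the actual scheme, the hidden parameters are trained jointly with the open task (with an Update primitive re-selecting the hosting positions between iterations), which is how the carrier's utility loss stays below 1%; the substitution here is shown in isolation for clarity.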