Deep Reinforcement Learning Boosted Partial Domain Adaptation
Authors: Keyu Wu, Min Wu, Jianfei Yang, Zhenghua Chen, Zhengguo Li, Xiaoli Li
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on several benchmark datasets demonstrate the superiority of the proposed DRL-based data selector which leads to state-of-the-art performance for various PDA tasks. |
| Researcher Affiliation | Collaboration | Keyu Wu¹, Min Wu¹, Jianfei Yang², Zhenghua Chen¹, Zhengguo Li¹ and Xiaoli Li¹,²; ¹Institute for Infocomm Research, A*STAR, Singapore; ²Nanyang Technological University |
| Pseudocode | No | The paper describes methods using equations and prose but does not include a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | The proposed method is evaluated on three datasets. Office-31 [Saenko et al., 2010] is a widely used domain adaptation dataset... Office-Home [Venkateswara et al., 2017] is a more challenging dataset... VisDA-2017 [Peng et al., 2017] is a large-scale dataset... |
| Dataset Splits | No | The paper states that it follows the settings of previous works for constructing the PDA tasks, but it does not explicitly detail the train/validation/test splits used in the experiments. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions optimizers (SGD, Adam) and pre-trained models (ResNet-50) but does not provide specific version numbers for any software dependencies like programming languages, frameworks, or libraries. |
| Experiment Setup | Yes | The DRL-DS module consists of one common fully connected layer, two noisy fully connected layers for the state-value stream, and two noisy fully connected layers for the advantage stream. During training, the backbone network, ResNet-50, is pre-trained on ImageNet while the other layers are trained from scratch. The DA module is trained via SGD with a batch size of 32 and a learning rate of 1e-3. The DRL-DS module is trained using Adam with a batch size of 32 and a learning rate of 1e-4. The episode length is set to five. The discount factor in Eq. 10 is set to 0.9, and λ1, λ2, λ3, and λ4 in Eq. 6 are set to 0.2, 0.5, 1.8 and 0.5, respectively. |
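The experiment-setup row above describes the DRL-DS data selector as a dueling Q-network with noisy fully connected layers and gives the reported optimizer settings. The following is a minimal PyTorch sketch of such an architecture under those reported settings; the class names (`NoisyLinear`, `DRLDataSelector`), the feature dimension (2048, matching a ResNet-50 backbone), the hidden width (256), and the action count (2) are illustrative assumptions, since the paper does not specify these values and no official code is released.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinear(nn.Module):
    """Noisy fully connected layer with factorised Gaussian noise (assumed variant)."""

    def __init__(self, in_features, out_features, sigma0=0.5):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        bound = 1.0 / math.sqrt(in_features)
        nn.init.uniform_(self.weight_mu, -bound, bound)
        nn.init.uniform_(self.bias_mu, -bound, bound)
        nn.init.constant_(self.weight_sigma, sigma0 * bound)
        nn.init.constant_(self.bias_sigma, sigma0 * bound)

    def forward(self, x):
        # Sample fresh factorised noise on every forward pass.
        eps_in = self._scaled_noise(self.in_features, x.device)
        eps_out = self._scaled_noise(self.out_features, x.device)
        weight = self.weight_mu + self.weight_sigma * eps_out.ger(eps_in)
        bias = self.bias_mu + self.bias_sigma * eps_out
        return F.linear(x, weight, bias)

    @staticmethod
    def _scaled_noise(size, device):
        eps = torch.randn(size, device=device)
        return eps.sign() * eps.abs().sqrt()


class DRLDataSelector(nn.Module):
    """Dueling Q-network: one common FC layer, then two noisy FC layers each for
    the state-value and advantage streams, as the setup row describes.
    Dimensions are placeholders, not values from the paper."""

    def __init__(self, feat_dim=2048, hidden_dim=256, num_actions=2):
        super().__init__()
        self.common = nn.Linear(feat_dim, hidden_dim)
        self.value = nn.Sequential(
            NoisyLinear(hidden_dim, hidden_dim), nn.ReLU(),
            NoisyLinear(hidden_dim, 1),
        )
        self.advantage = nn.Sequential(
            NoisyLinear(hidden_dim, hidden_dim), nn.ReLU(),
            NoisyLinear(hidden_dim, num_actions),
        )

    def forward(self, features):
        h = F.relu(self.common(features))
        v, a = self.value(h), self.advantage(h)
        # Standard dueling aggregation: Q = V + (A - mean(A)).
        return v + a - a.mean(dim=1, keepdim=True)


# Hyperparameters reported in the paper: Adam, batch size 32, lr 1e-4,
# discount factor 0.9, episode length 5.
selector = DRLDataSelector()
optimizer = torch.optim.Adam(selector.parameters(), lr=1e-4)
GAMMA, EPISODE_LENGTH, BATCH_SIZE = 0.9, 5, 32
```

This sketch only covers the data-selector network and its reported optimizer settings; the reward definition (Eq. 6), the return computation (Eq. 10), and the interaction with the domain adaptation module are described in the paper's prose and are not reproduced here.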