Meta-Learning for Relative Density-Ratio Estimation
Authors: Atsutoshi Kumagai, Tomoharu Iwata, Yasuhiro Fujiwara
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we demonstrate the effectiveness of the proposed method with three problems: relative DRE, dataset comparison, and inlier-based outlier detection. All experiments were conducted on a Linux server with an Intel Xeon CPU and an NVIDIA GeForce GTX 1080 GPU. |
| Researcher Affiliation | Industry | Atsutoshi Kumagai (NTT Computer and Data Science Laboratories, atsutoshi.kumagai.ht@hco.ntt.co.jp); Tomoharu Iwata (NTT Communication Science Laboratories, tomoharu.iwata.gy@hco.ntt.co.jp); Yasuhiro Fujiwara (NTT Communication Science Laboratories, yasuhiro.fujiwara.kh@hco.ntt.co.jp) |
| Pseudocode | Yes | Algorithm 1 Training procedure of our model. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a code repository for the described methodology. |
| Open Datasets | Yes | "MNIST-r [1] and ISOLET [2], which have been commonly used in transfer or multi-task learning studies [11, 27, 23]." (footnotes: [1] https://github.com/ghif/mtae, [2] http://archive.ics.uci.edu/ml/datasets/ISOLET) and "Data: We used three real-world benchmark data: IoT [3], Landmine [4], and School [5]." (footnotes: [3] https://archive.ics.uci.edu/ml/datasets/detection_of_IoT_botnet_attacks_N_BaIoT, [4] http://people.ee.duke.edu/~lcarin/Landmine_Data.zip, [5] http://multilevel.ioe.ac.uk/intro/datasets.html) |
| Dataset Splits | Yes | "Hyperparameters were determined based on the empirical squared error for relative DRE on validation data. The squared error on validation data was used for early stopping to avoid over-fitting, where the maximum number of training iterations was 10,000." and "We created 600 source, 3 validation, 20 target datasets and evaluated the mean test squared error of all target dataset pairs when the number of target support instances was NS = 10." (See the training-loop sketch after this table.) |
| Hardware Specification | Yes | All experiments were conducted on a Linux server with an Intel Xeon CPU and an NVIDIA GeForce GTX 1080 GPU. |
| Software Dependencies | No | The paper states "We implemented all methods by PyTorch [32]" but does not provide a specific version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | For all problems, a three (two)-layered feed-forward neural network was used for f (g) in Eq. (1). For f, the number of output and hidden nodes was 100, and ReLU activation was used. For h in Eq. (2), a three-layered feed-forward neural network with 100 hidden nodes with ReLU activation and 100 output nodes (T = 100) with the Softplus function was used. Hyperparameters were determined based on the empirical squared error for relative DRE on validation data. The dimension of latent vectors z was chosen from {4, 8, 16, 32, 64, 128, 256}. Relative parameter α was set to 0.5, a value recommended in a previous study [49]. We used the Adam optimizer [20] with a learning rate of 0.001. The mini-batch size was set to 256 (i.e., NQ = 128 numerator and 128 denominator instances). In training with source datasets, support instances are included in query instances as in [9, 10]. The squared error on validation data was used for early stopping to avoid over-fitting, where the maximum number of training iterations was 10,000. (Hedged sketches of this configuration and of the training loop follow this table.) |
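As a concrete reading of the quoted architecture, the following is a minimal PyTorch sketch. It is an assumption-laden illustration, not the authors' code: the helper `mlp`, the input width `x_dim`, and the choice that `g` consumes the aggregated 100-dimensional outputs of `f` are placeholders inferred from the quoted description.

```python
import torch
import torch.nn as nn

def mlp(in_dim, hidden, out_dim, n_layers, out_act=None):
    """Feed-forward network; n_layers counts the linear layers."""
    layers, d = [], in_dim
    for _ in range(n_layers - 1):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

x_dim = 10   # placeholder input dimensionality (dataset-dependent)
z_dim = 64   # chosen from {4, 8, 16, 32, 64, 128, 256} on validation data

f = mlp(x_dim, 100, 100, n_layers=3)  # three layers, 100 hidden/output nodes, ReLU
g = mlp(100, 100, z_dim, n_layers=2)  # two layers; assumed to map aggregated f-outputs to z
h = mlp(x_dim, 100, 100, n_layers=3, out_act=nn.Softplus())  # T = 100 non-negative basis values

params = list(f.parameters()) + list(g.parameters()) + list(h.parameters())
optimizer = torch.optim.Adam(params, lr=0.001)  # Adam with learning rate 0.001, as quoted
```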
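The quoted splits and early-stopping protocol translate into a training loop along these lines. This sketch assumes the standard RuLSIF squared-error criterion for the relative ratio r_α(x) = p(x) / (α p(x) + (1 − α) q(x)) from [49]; the toy Gaussian batches, evaluation interval, and patience value are illustrative stand-ins for the paper's source-task sampling and its unspecified early-stopping details.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder ratio model with a Softplus head so that r(x) >= 0.
model = nn.Sequential(nn.Linear(2, 100), nn.ReLU(), nn.Linear(100, 1), nn.Softplus())
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

def rulsif_loss(r_num, r_den, alpha=0.5):
    # Standard RuLSIF criterion (up to an additive constant) for
    # r_alpha(x) = p(x) / (alpha p(x) + (1 - alpha) q(x)):
    #   alpha/2 E_p[r^2] + (1 - alpha)/2 E_q[r^2] - E_p[r]
    return ((alpha / 2) * (r_num ** 2).mean()
            + ((1 - alpha) / 2) * (r_den ** 2).mean()
            - r_num.mean())

best_val, bad_evals, patience = float("inf"), 0, 5  # patience is not specified in the paper
for it in range(10_000):  # maximum of 10,000 training iterations, as quoted
    # Toy Gaussian batches stand in for sampled source tasks;
    # NQ = 128 numerator and 128 denominator instances per mini-batch.
    x_num = torch.randn(128, 2)
    x_den = torch.randn(128, 2) + 1.0
    loss = rulsif_loss(model(x_num), model(x_den), alpha=0.5)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if it % 100 == 0:
        with torch.no_grad():  # validation squared error drives early stopping
            val = rulsif_loss(model(torch.randn(256, 2)),
                              model(torch.randn(256, 2) + 1.0)).item()
        if val < best_val:
            best_val, bad_evals = val, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                break  # early stopping on validation squared error
```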