Meta-Learning for Relative Density-Ratio Estimation

Authors: Atsutoshi Kumagai, Tomoharu Iwata, Yasuhiro Fujiwara

NeurIPS 2021

Reproducibility assessment. Each entry gives the variable, the extracted result, and the supporting LLM response:
Research Type: Experimental
"In this section, we demonstrate the effectiveness of the proposed method with three problems: relative DRE, dataset comparison, and inlier-based outlier detection. All experiments were conducted on a Linux server with an Intel Xeon CPU and an NVIDIA GeForce GTX 1080 GPU."
Researcher Affiliation: Industry
"Atsutoshi Kumagai, NTT Computer and Data Science Laboratories, atsutoshi.kumagai.ht@hco.ntt.co.jp; Tomoharu Iwata, NTT Communication Science Laboratories, tomoharu.iwata.gy@hco.ntt.co.jp; Yasuhiro Fujiwara, NTT Communication Science Laboratories, yasuhiro.fujiwara.kh@hco.ntt.co.jp"
Pseudocode: Yes
"Algorithm 1: Training procedure of our model." (A hedged sketch of such a training loop appears after this assessment.)
Open Source Code: No
The paper does not provide an explicit statement about releasing source code, nor a direct link to a code repository for the described method.
Open Datasets: Yes
"Mnist-r (https://github.com/ghif/mtae) and Isolet (http://archive.ics.uci.edu/ml/datasets/ISOLET), which have been commonly used in transfer or multi-task learning studies [11, 27, 23]." and "We used three real-world benchmark data: IoT (https://archive.ics.uci.edu/ml/datasets/detection_of_IoT_botnet_attacks_N_Ba_IoT), Landmine (http://people.ee.duke.edu/~lcarin/Landmine_Data.zip), and School (http://multilevel.ioe.ac.uk/intro/datasets.html)."
Dataset Splits: Yes
"Hyperparameters were determined based on the empirical squared error for relative DRE on validation data. The squared error on validation data was used for early stopping to avoid over-fitting, where the maximum number of training iterations was 10,000." and "We created 600 source, 3 validation, and 20 target datasets and evaluated the mean test squared error of all target dataset pairs when the number of target support instances was NS = 10."
Hardware Specification: Yes
"All experiments were conducted on a Linux server with an Intel Xeon CPU and an NVIDIA GeForce GTX 1080 GPU."
Software Dependencies: No
The paper states "We implemented all methods by PyTorch [32]" but does not give a version number for PyTorch or any other software dependency.
Experiment Setup: Yes
"For all problems, a three(two)-layered feed-forward neural network was used for f (g) in Eq. (1). For f, the number of output and hidden nodes was 100, and ReLU activation was used. For h in Eq. (2), a three-layered feed-forward neural network with 100 hidden nodes with ReLU activation and 100 output nodes (T = 100) with the Softplus function was used. Hyperparameters were determined based on the empirical squared error for relative DRE on validation data. The dimension of latent vectors z was chosen from {4, 8, 16, 32, 64, 128, 256}. Relative parameter α was set to 0.5, which is a value recommended in a previous study [49]. We used the Adam optimizer [20] with a learning rate of 0.001. The mini-batch size was set to 256 (i.e., NQ = 128 for numerator and denominator instances). In training with source datasets, support instances are included in query instances as in [9, 10]. The squared error on validation data was used for early stopping to avoid over-fitting, where the maximum number of training iterations was 10,000." (Minimal sketches of this architecture and the training protocol follow below.)
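
To make the quoted setup concrete, here is a minimal PyTorch sketch of the three networks as described: a three-layered f and a two-layered g (the networks of Eq. (1)) and a three-layered h with T = 100 Softplus outputs (the basis-function network of Eq. (2)). The input dimensionality, the choice z_dim = 64, the mean aggregation in the dataset encoder, and the concatenation of an instance with the dataset embedding z as input to h are illustrative assumptions, not details confirmed by the excerpt.

```python
import torch
import torch.nn as nn

input_dim = 2   # placeholder; the excerpt does not fix the data dimensionality
z_dim = 64      # the paper selects this from {4, 8, 16, 32, 64, 128, 256}
T = 100         # number of basis functions (T = 100 per the quote)

# f: three-layered feed-forward network, 100 hidden and output nodes, ReLU.
f = nn.Sequential(
    nn.Linear(input_dim, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 100),
)

# g: two-layered feed-forward network producing the latent dataset vector z.
g = nn.Sequential(
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, z_dim),
)

def encode_dataset(X):
    """Deep-set style dataset embedding: average f over instances, then apply g.
    The mean aggregation is an assumption; the excerpt only names f and g."""
    return g(f(X).mean(dim=0))

# h: three-layered network, 100 ReLU hidden nodes, T Softplus outputs,
# here conditioned on the concatenation [x, z] (an assumed interface).
h = nn.Sequential(
    nn.Linear(input_dim + z_dim, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, T), nn.Softplus(),
)
```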
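The relative parameter α = 0.5 points at the RuLSIF objective of [49], in which the relative density ratio p(x)/(α p(x) + (1 - α) q(x)) is modeled as a linear combination of basis functions ψ(x) and the weights θ have a closed-form ridge solution. The sketch below follows the standard RuLSIF closed form, using the outputs of h above as ψ; treating this as the paper's exact estimator, and the regularization weight lam, are assumptions.

```python
import torch

def relative_ratio_weights(psi_nu, psi_de, alpha=0.5, lam=1e-3):
    """Standard RuLSIF closed form: theta = (H + lam*I)^{-1} h_vec, where
    H = alpha * E_p[psi psi^T] + (1 - alpha) * E_q[psi psi^T] and
    h_vec = E_p[psi]. psi_nu and psi_de are (N, T) basis matrices for
    samples from the numerator p and the denominator q, respectively."""
    T = psi_nu.shape[1]
    H = (alpha * psi_nu.T @ psi_nu / psi_nu.shape[0]
         + (1 - alpha) * psi_de.T @ psi_de / psi_de.shape[0])
    h_vec = psi_nu.mean(dim=0)
    theta = torch.linalg.solve(H + lam * torch.eye(T), h_vec)
    return theta  # the estimated relative ratio at x is theta @ psi(x)
```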
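Finally, a minimal sketch of the episodic training loop implied by Algorithm 1 and the quoted protocol: Adam with a learning rate of 0.001, at most 10,000 iterations, and early stopping on the validation squared error. The helpers sample_episode, episode_squared_error, and validation_squared_error are hypothetical placeholders for the paper's internals, and the patience-based stopping rule is our assumption; it reuses the f, g, h networks from the first sketch.

```python
import copy
import torch

params = list(f.parameters()) + list(g.parameters()) + list(h.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)  # learning rate 0.001 per the quote

best_val = float("inf")
best_state = None
patience, bad_iters = 100, 0  # assumed patience; the excerpt only gives the cap

for it in range(10_000):  # "the maximum number of training iterations was 10,000"
    # Hypothetical sampler: draws a support/query split from a source dataset
    # pair, with support instances also included among the queries as in [9, 10].
    support, query = sample_episode(source_datasets, n_query=128)
    loss = episode_squared_error(f, g, h, support, query)  # hypothetical objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    val = validation_squared_error(f, g, h, validation_datasets)  # hypothetical
    if val < best_val:
        best_val, bad_iters = val, 0
        best_state = copy.deepcopy([m.state_dict() for m in (f, g, h)])
    else:
        bad_iters += 1
        if bad_iters >= patience:  # early stopping on validation squared error
            break

if best_state is not None:
    for m, state in zip((f, g, h), best_state):
        m.load_state_dict(state)
```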