Adversarial Regression with Doubly Non-negative Weighting Matrices

Authors: Tam Le, Truyen Nguyen, Makoto Yamada, Jose Blanchet, Viet Anh Nguyen

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments show that our reweighting strategy delivers promising results on numerous datasets. We evaluate our adversarial reweighting schemes on the conditional expectation estimation task. To this end, we use the proposed reweighting scheme on the NW estimator of Example 1.1 (see the NW sketch after this table). For each dataset, we randomly split 1200 samples for training, 50 samples for validation to choose the bandwidth h of the Gaussian kernel, and 800 samples for test.
Researcher Affiliation | Collaboration | Tam Le (RIKEN AIP, tam.le@riken.jp); Truyen Nguyen (University of Akron, tnguyen@uakron.edu); Makoto Yamada (Kyoto University and RIKEN AIP, makoto.yamada@riken.jp); Jose Blanchet (Stanford University, jose.blanchet@stanford.edu); Viet Anh Nguyen (Stanford University and VinAI Research, v.anhnv81@vinai.io)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | We have released code for these proposed tools (footnote 2: https://github.com/lttam/Adversarial-Regression).
Open Datasets | Yes | We use 8 real-world datasets: (i) abalone (Abalone), (ii) bank-32fh (Bank), (iii) cpu (CPU), (iv) kin40k (KIN), (v) elevators (Elevators), (vi) pol (POL), (vii) pumadyn32nm (PUMA), and (viii) slice (Slice) from the Delve datasets, the UCI datasets, the KEEL datasets, and datasets in Noh et al. [27].
Dataset Splits | Yes | For each dataset, we randomly split 1200 samples for training, 50 samples for validation to choose the bandwidth h of the Gaussian kernel, and 800 samples for test.
Hardware Specification | No | All our experiments are run on commodity hardware.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | Setup. For each dataset, we randomly split 1200 samples for training, 50 samples for validation to choose the bandwidth h of the Gaussian kernel, and 800 samples for test. More specifically, we choose the squared bandwidth $h^2$ for the Gaussian kernel from the predefined set $\{10^{-2:1:4},\ 2 \cdot 10^{-2:1:4},\ 5 \cdot 10^{-2:1:4}\}$ (MATLAB-style range notation: exponents run from $-2$ to $4$ in steps of $1$). For a tractable estimation, we follow the approach in Brundsdon et al. [7] and Silverman [35] to restrict the relevant samples to the $N$ nearest neighbors of each test sample $z_i$, with $N \in \{10, 20, 30, 50\}$. The radius $\rho$ ranges over 4 values, $\rho \in \{0.01, 0.1, 1, 10\}$. Finally, the prediction error is measured by the root mean square error (RMSE), i.e., $\mathrm{RMSE} = \sqrt{n_t^{-1} \sum_{i=1}^{n_t} (\hat{y}_i - \hat{\beta}_i)^2}$, where $n_t$ is the test sample size (i.e., $n_t = 800$) and $\hat{\beta}_i$ is the conditional expectation estimate at the test sample $z_i$. We repeat the above procedure 10 times to obtain the average RMSE (see the pipeline sketch after this table).
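
For concreteness, here is a minimal sketch of the Nadaraya-Watson (NW) conditional expectation estimator from Example 1.1, with a Gaussian kernel and the nearest-neighbor restriction described in the setup. This is an illustrative reconstruction, not the authors' released code: the function name `nw_estimate` and the kernel normalization are assumptions, and the baseline shown here omits the paper's adversarial reweighting of the kernel weights.

```python
import numpy as np

def nw_estimate(X_train, y_train, z, h, n_neighbors=30):
    """Nadaraya-Watson estimate of E[Y | Z = z] with a Gaussian kernel,
    restricted to the n_neighbors training samples closest to z."""
    # Squared Euclidean distances from the test point z to all training covariates.
    d2 = np.sum((X_train - z) ** 2, axis=1)
    # Keep only the N nearest neighbors for tractability (cf. Brundsdon et al. [7],
    # Silverman [35], as cited in the setup above).
    idx = np.argsort(d2)[:n_neighbors]
    # Gaussian kernel weights; the paper's grid search is over the squared bandwidth h^2.
    w = np.exp(-d2[idx] / (2.0 * h ** 2))
    # Kernel-weighted average of the neighbors' responses.
    return w @ y_train[idx] / w.sum()
```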
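The evaluation protocol itself (1200/50/800 split, validation grid search over $h^2$, test RMSE, averaged over 10 repetitions) can be sketched as below. It reuses `nw_estimate` from the sketch above; the function `run_once` and its organization are assumptions, it requires at least 2050 samples per dataset, and it tunes only $h^2$ (in the paper's reweighted variant the radius $\rho \in \{0.01, 0.1, 1, 10\}$ would be selected analogously).

```python
import numpy as np

def run_once(X, y, rng, n_neighbors=30):
    """One repetition: random 1200/50/800 split, pick h^2 on validation, report test RMSE."""
    perm = rng.permutation(len(X))
    tr, va, te = perm[:1200], perm[1200:1250], perm[1250:2050]
    # Squared-bandwidth grid {1, 2, 5} x 10^{-2..4}, matching the setup's predefined set.
    h2_grid = [c * 10.0 ** k for c in (1, 2, 5) for k in range(-2, 5)]

    def rmse(split, h2):
        h = np.sqrt(h2)
        preds = np.array([nw_estimate(X[tr], y[tr], z, h, n_neighbors) for z in X[split]])
        return np.sqrt(np.mean((y[split] - preds) ** 2))

    best_h2 = min(h2_grid, key=lambda h2: rmse(va, h2))  # validation selects the bandwidth
    return rmse(te, best_h2)                             # RMSE over the 800 test samples

# Average over 10 random splits, as described in the setup:
# scores = [run_once(X, y, np.random.default_rng(seed)) for seed in range(10)]
# print(np.mean(scores))
```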