Automatic Outlier Rectification via Optimal Transport

Authors: Jose Blanchet, Jiajin Li, Markus Pelger, Greg Zanotti

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our approach over conventional approaches in simulations and empirical analyses for mean estimation, least absolute regression, and the fitting of option implied volatility surfaces. In this section, we demonstrate the effectiveness of the proposed statistically robust estimator through various tasks: mean estimation, LAD regression, and two applications to volatility surface modeling.
Researcher Affiliation | Academia | Jose Blanchet, Dept. of Management Science & Engineering, Stanford University, jose.blanchet@stanford.edu; Jiajin Li, Sauder School of Business, University of British Columbia, jiajin.li@sauder.ubc.ca; Markus Pelger, Dept. of Management Science & Engineering, Stanford University, mpelger@stanford.edu; Greg Zanotti, Dept. of Management Science & Engineering, Stanford University, gzanotti@stanford.edu
Pseudocode | Yes | Algorithm 1: Statistically Robust Estimator and Algorithm 2: Statistically Robust Optimization Procedure
Open Source Code | No | The work in this paper was conducted with an industry partner, and the code and data are proprietary.
Open Datasets | Yes | We select the data set from Chataigner et al. [2020] consisting of (options chain, surface) pairs from the German DAX index.
Dataset Splits | Yes | To perform cross-validation, we split the train day into a training and validation sample, which we use to obtain estimates of MAPE and Ŝ as a function of δ. For each day, we sample 80% of the training set without replacement as a cross-validation (CV) training set, and use the remaining 20% of the training set as a CV validation set. (An illustrative split sketch appears after this table.)
Hardware Specification | Yes | Experiments are run on a server with a Xeon E5-2398 v3 processor and 756GB of RAM.
Software Dependencies | No | The paper mentions 'PyTorch', 'TensorFlow', and 'JAX' but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | The hyperparameters for our estimator, namely δ = 0.5 and r = 0.5, remained constant across all corruption levels. For our estimator, we estimate the surface by subgradient descent with learning rate α = 10⁻¹ and r = 0.5, terminating when the relative change in loss reaches 10⁻⁵. In each trial, we run our optimization procedure with a learning rate of 10⁻². We stop when the number of iterations reaches 2000 or the change in the loss function between successive iterations is below a tolerance of 10⁻⁶. We initialize θ to the median of the data set. (An illustrative optimization loop appears after this table.)
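
The per-day 80/20 split described in the Dataset Splits row can be made concrete with a minimal sketch. Since the authors' code is proprietary, the function name, data layout, and seed below are assumptions for illustration only; the sketch simply samples 80% of one day's training set without replacement and keeps the remaining 20% as the CV validation set.

    import numpy as np

    def cv_split_for_day(day_data: np.ndarray, train_frac: float = 0.8, seed: int = 0):
        """Split one day's training sample into CV training/validation subsets."""
        rng = np.random.default_rng(seed)
        n = len(day_data)
        n_cv_train = int(round(train_frac * n))
        perm = rng.permutation(n)                 # sampling without replacement
        cv_train = day_data[perm[:n_cv_train]]    # 80% CV training set
        cv_val = day_data[perm[n_cv_train:]]      # remaining 20% CV validation set
        return cv_train, cv_val

Under this reading, the split would be repeated for each trading day to estimate MAPE and Ŝ as a function of δ during cross-validation.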
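
The optimization procedure in the Experiment Setup row (subgradient descent, learning rate 10⁻², at most 2000 iterations, a 10⁻⁶ tolerance on the change in loss, and θ initialized to the median of the data) can also be sketched generically. The objective below is a plain least-absolute-deviation loss used as a stand-in; it is not the paper's OT-based statistically robust objective (Algorithms 1 and 2), which the proprietary code implements.

    import numpy as np

    def subgradient_descent(data, loss_and_subgrad, lr=1e-2, max_iters=2000, tol=1e-6):
        """Generic subgradient loop with the reported stopping rules."""
        theta = np.median(data)                   # initialize theta at the median
        prev_loss = np.inf
        for _ in range(max_iters):
            loss, grad = loss_and_subgrad(theta, data)
            if abs(prev_loss - loss) < tol:       # stop on small change in loss
                break
            theta -= lr * grad                    # subgradient step
            prev_loss = loss
        return theta

    # Stand-in objective (NOT the paper's robust loss): least absolute deviation.
    def lad_loss_and_subgrad(theta, data):
        residual = data - theta
        loss = np.abs(residual).mean()
        grad = -np.sign(residual).mean()          # a subgradient of the LAD loss
        return loss, grad

    theta_hat = subgradient_descent(np.random.standard_normal(500), lad_loss_and_subgrad)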