Maximum-Variance Total Variation Denoising for Interpretable Spatial Smoothing
Authors: Wesley Tansey, Jesse Thomason, James Scott
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare results on a suite of both synthetic and real-world datasets. We first compare MVTV against two benchmark methods with sharp partitions, CART and CRISP, on a synthetic dataset with varying sample sizes. We also compare against CRISP with q fixed at the maximum variance solution in a method we call MV-CRISP. We show that the MVTV method leads to better Akaike information criterion (AIC) scores. We then demonstrate the advantage of the maximum variance criterion by showing that it chooses grid sizes that offer a good trade-off between average and worst-cell accuracy. Finally, we test all four methods against two real-world datasets of crime reports for Austin and Chicago. A human evaluation on the results for Austin shows that the MV* methods are most interpretable. |
| Researcher Affiliation | Academia | Wesley Tansey* (Columbia University, New York, NY 10027; wt2274@cumc.columbia.edu); Jesse Thomason and James G. Scott (University of Texas at Austin, Austin, TX 78712; jesse@cs.utexas.edu, james.scott@mccombs.utexas.edu) |
| Pseudocode | No | The paper describes the algorithm steps in text but does not include a structured pseudocode block or a clearly labeled algorithm section. |
| Open Source Code | No | The paper does not provide any specific links to source code for the methodology or state that the code is publicly available. |
| Open Datasets | Yes | We applied all four methods to a dataset of publicly-available crime report counts in Austin, Texas in 2014 and Chicago, Illinois in 2015. (Footnote 3: https://www.data.gov/open-gov/) |
| Dataset Splits | Yes | For both CRISP and the MV* methods, we chose λ via 5-fold cross validation across a log-space grid of 50 values. [...] We ran a 20-fold cross-validation to measure RMSE |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the "R package rpart" but does not specify its version number or any other software dependencies with version details. |
| Experiment Setup | Yes | For CRISP, we use q = max(n, 100) as per the suggestions in (Petersen, Simon, and Witten 2016); for the MV* methods, we use the maximum variance criterion to choose from q ∈ [2, 50]. For both CRISP and the MV* methods, we chose λ via 5-fold cross validation across a log-space grid of 50 values. |
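
The λ selection quoted in the Experiment Setup row (5-fold cross-validation over a log-space grid of 50 values) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the grid bounds `1e-3` and `1e3` are assumptions (the quoted text does not give them), the function names are hypothetical, and the estimator is a deliberately trivial shrunken mean standing in for MVTV or CRISP.

```python
import math
import random


def log_space_grid(lo, hi, num=50):
    """Return `num` values spaced evenly on a log scale from lo to hi."""
    step = (math.log(hi) - math.log(lo)) / (num - 1)
    return [math.exp(math.log(lo) + i * step) for i in range(num)]


def k_fold_indices(n, k=5, seed=0):
    """Shuffle range(n) and deal the indices into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def select_lambda(y, lambdas, k=5, seed=0):
    """Pick the lambda with the lowest k-fold mean squared error.

    The estimator here is a stand-in (mean shrunk toward zero by lam),
    NOT the MVTV or CRISP model from the paper.
    """
    folds = k_fold_indices(len(y), k, seed)
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        sq_err = 0.0
        for fold in folds:
            held_out = set(fold)
            train = [y[i] for i in range(len(y)) if i not in held_out]
            pred = sum(train) / (len(train) + lam)  # stand-in estimator
            sq_err += sum((y[i] - pred) ** 2 for i in fold)
        err = sq_err / len(y)
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err


# Assumed bounds for the grid; the source does not state them.
lambdas = log_space_grid(1e-3, 1e3, num=50)
rng = random.Random(1)
y = [rng.gauss(2.0, 1.0) for _ in range(40)]  # toy data, not from the paper
best_lam, best_err = select_lambda(y, lambdas, k=5)
```

Swapping the stand-in estimator for a real fitting routine leaves the grid and fold logic unchanged, which is the part the quoted setup actually specifies.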