Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Improved Shrinkage Prediction under a Spiked Covariance Structure
Authors: Trambak Banerjee, Gourab Mukherjee, Debashis Paul
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present simulation experiments as well as real data examples illustrating the efficacy of the proposed method. |
| Researcher Affiliation | Academia | Trambak Banerjee: Analytics, Information and Operations Management, University of Kansas, Lawrence, KS 66045, USA. Gourab Mukherjee: Data Sciences and Operations, University of Southern California, Los Angeles, CA 90089, USA. Debashis Paul: Department of Statistics, University of California, Davis, Davis, CA 95616, USA. |
| Pseudocode | No | The paper describes its methodology in narrative text and mathematical formulations (e.g., Section 3. Proposed methodology for disaggregated model) but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The R package casp has been developed to implement our proposed CASP methodology in aggregated as well as disaggregated prediction problems. It is publicly available at the following GitHub repository: https://github.com/trambakbanerjee/casp. |
| Open Datasets | Yes | In this section we analyze a part of the dataset published by Bronnenberg et al. (2008). |
| Dataset Splits | Yes | We use 3 weeks from a relatively recent snapshot covering October 31, 2011 to November 20, 2011 as data from the current model... We use the most recent T = 2 weeks, from November 7, 2011 to November 20, 2011 as our prediction period and utilize the sales data of week t − 1 to predict the state aggregated totals for week t where t = 1, ..., T. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'R package casp', 'R package esaBcv', 'R package FACTMLE', 'R-package POET', and 'R-package splines2' but does not specify their version numbers. |
| Experiment Setup | Yes | In the setup of experiment 1 we investigate the prediction performance of the five predictive rules under the disaggregated model (A = I_n) and sample θ from an n = 200 variate Gaussian distribution with mean vector η_0 = 0 and covariance τΣ^β. We impose a spike covariance structure on Σ with K = 10 spikes under the following two scenarios with l_0 fixed at 1. Scenario 1: we consider the generalized absolute loss function in equation (3) with b_i sampled uniformly between (0.9, 0.95), h_i = 1 − b_i with (τ, β) = (0.5, 0.25) and K spikes equi-spaced between 80 and 20. Scenario 2: we consider the Linex loss function in equation (4) with a_i sampled uniformly between ( 2, 1), b_i = 1 with (τ, β) = (1, 1.75) and K spikes equi-spaced between 25 and 5. For our prediction problem, we use a threshold of s_p units for product p and consider only those outlets that have sold at least s_p units in week 0. In particular, we use the function smooth.spline from the R-package splines2 and choose k = 3 knots corresponding to the 25, 50 and 95 percentiles of the sales distribution across the n_p stores at each of the m weeks. |
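
The Scenario 1 settings quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code (their implementation is the R package casp) and it rests on several assumptions that the table does not confirm: Σ is taken to be diagonal, so its eigenvectors are the standard basis; l_0 is read as the common value of the n − K non-spike eigenvalues; and h_i is taken to be 1 − b_i. The helper names `spiked_eigenvalues` and `sample_theta` are hypothetical.

```python
import random


def spiked_eigenvalues(n, k, top, bottom, l0=1.0):
    """Return n eigenvalues: k spikes equi-spaced from `top` down to
    `bottom`, followed by n - k eigenvalues equal to l0 (assumed bulk)."""
    step = (top - bottom) / (k - 1)
    spikes = [top - j * step for j in range(k)]
    return spikes + [l0] * (n - k)


def sample_theta(eigs, tau, beta, rng):
    """Draw theta ~ N(0, tau * Sigma^beta) for a DIAGONAL Sigma (assumption),
    so coordinate i is Gaussian with variance tau * eigs[i] ** beta."""
    return [rng.gauss(0.0, (tau * lam ** beta) ** 0.5) for lam in eigs]


rng = random.Random(0)  # fixed seed so the draw is repeatable

# Scenario 1 as quoted: n = 200, K = 10 spikes between 80 and 20, l_0 = 1,
# (tau, beta) = (0.5, 0.25).
eigs = spiked_eigenvalues(n=200, k=10, top=80.0, bottom=20.0, l0=1.0)
theta = sample_theta(eigs, tau=0.5, beta=0.25, rng=rng)

# Loss weights: b_i uniform on (0.9, 0.95), h_i = 1 - b_i (assumed reading).
b = [rng.uniform(0.9, 0.95) for _ in eigs]
h = [1.0 - bi for bi in b]
```

A non-diagonal Σ with the same spectrum would need a random orthogonal basis, which this sketch omits to stay within the standard library.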