Representer Point Selection for Explaining Regularized High-dimensional Models
Authors: Che-Ping Tsai, Jiong Zhang, Hsiang-Fu Yu, Eli Chien, Cho-Jui Hsieh, Pradeep Kumar Ravikumar
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we study the empirical performance of our proposed methods on three real-world binary classification datasets and two recommender system datasets. We also showcase the utility of high-dimensional representers in explaining model recommendations. |
| Researcher Affiliation | Collaboration | Carnegie Mellon University; Amazon, USA; University of Illinois Urbana-Champaign; University of California, Los Angeles. |
| Pseudocode | Yes | Algorithm 1 Computation of high-dimensional representers for Collaborative Filtering (a background representer sketch follows the table) |
| Open Source Code | No | The paper does not include a direct link to open-source code for the described methodology or an explicit statement that the code is being released publicly. |
| Open Datasets | Yes | We use the following three datasets on binary classification. (1) 20 Newsgroups [...] (2) Gisette (Guyon et al., 2004) [...] (3) Rcv1 (Lewis et al., 2004) [...]. Datasets: (1) Movielens-1M (Harper & Konstan, 2015): [...] (2) Amazon review (2018) (Ni et al., 2019): (a loading sketch follows the table) |
| Dataset Splits | Yes | We randomly split 10% of the data for the test set. [...] It contains 6,000/1,000 samples, each with 5,000 features, for training/testing. [...] For every user, we randomly held out two item ratings to construct the validation and test sets. (a split sketch follows the table) |
| Hardware Specification | No | The paper mentions runtime 'on a single CPU' but does not provide specific details such as the CPU model, number of cores, or other hardware specifications used for the experiments. |
| Software Dependencies | No | The paper mentions 'LIBLINEAR (Fan et al., 2008)' but does not provide specific version numbers for this or any other software dependency, which are necessary for full reproducibility. (a usage sketch follows the table) |
| Experiment Setup | Yes | We set max iterations to 20 and embedding dimension to 12 on the MovieLens-1M dataset. [...] We use SGD optimizer with learning rate 2.0/15.0 with batch size 3000/3000 to train MF model for 10/10 epochs [...] For MovieLens-1M/Amazon reviews 2018, we use Adam optimizer with learning rate 0.001/0.001 with batch size 3000/3000 to train YouTube Net for 20/10 epochs. We use an embedding of 64/16 trainable parameters to model user and item information. The user feature encoder consists of 4/3 layers of size 64/16 with 0.2/0.2 dropout probabilities. (a training sketch follows the table) |
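The Pseudocode row cites the paper's Algorithm 1, which is not reproduced here. For orientation, below is a minimal NumPy sketch of the classical representer values of Yeh et al. (2018) for an ℓ2-regularized linear model, the construction the paper generalizes to ℓ1 and nuclear-norm regularization. The toy data and the stand-in weight vector are hypothetical, and this is not the paper's high-dimensional algorithm.

```python
import numpy as np

# Minimal sketch of classical representer values (Yeh et al., 2018) for an
# l2-regularized linear model with logistic loss; the paper's Algorithm 1
# generalizes this idea to l1- and nuclear-norm regularization, and its
# exact form differs. All names and toy data below are hypothetical.

rng = np.random.default_rng(0)
n, d = 100, 20
X = rng.normal(size=(n, d))          # training inputs
y = rng.choice([-1.0, 1.0], size=n)  # labels in {-1, +1}
lam = 0.01                           # l2 regularization strength
theta = rng.normal(size=d) * 0.1     # stand-in for a trained weight vector

# Representer weight of training point i:
#   alpha_i = -(1 / (2 * lam * n)) * dL/df  evaluated at f_i = theta^T x_i,
# where L is the logistic loss. At an exact minimizer, theta equals
# sum_i alpha_i * x_i, so predictions decompose over training points.
margins = X @ theta
dloss = -y / (1.0 + np.exp(y * margins))   # logistic loss derivative wrt f
alpha = -dloss / (2.0 * lam * n)

# Contribution of each training point to a test prediction:
#   f(x_t) = sum_i alpha_i * <x_i, x_t>
# (exact only at the optimum; theta here is random, so this merely
# illustrates the computation).
x_test = rng.normal(size=d)
contributions = alpha * (X @ x_test)
top5 = np.argsort(-np.abs(contributions))[:5]
print("most influential training points:", top5)
```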
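The Open Datasets and Software Dependencies rows can be exercised together: scikit-learn ships 20 Newsgroups and wraps LIBLINEAR as a solver. The sketch below is an assumed pipeline, not the paper's code; the category choice, vectorizer settings, and C value are placeholders, and no library versions are pinned, which is exactly the gap the Software Dependencies row flags.

```python
# Hedged sketch: fit an l1-regularized logistic regression on 20 Newsgroups
# via scikit-learn's LIBLINEAR backend. The paper cites LIBLINEAR but pins
# no versions; hyperparameters below are placeholders.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Two-class subset, since the paper studies binary classification;
# this particular pair of categories is a hypothetical choice.
cats = ["alt.atheism", "sci.space"]
train = fetch_20newsgroups(subset="train", categories=cats)
test = fetch_20newsgroups(subset="test", categories=cats)

vec = TfidfVectorizer()
X_train = vec.fit_transform(train.data)
X_test = vec.transform(test.data)

# penalty="l1" with solver="liblinear" matches the l1-regularized,
# LIBLINEAR-backed setting the paper describes.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
clf.fit(X_train, train.target)
print("test accuracy:", clf.score(X_test, test.target))
```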
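For the Dataset Splits row, the recommender-system protocol quoted above holds out two ratings per user for validation and test. A minimal pandas sketch of that protocol follows; the (user, item, rating) column names and DataFrame layout are assumptions, not the paper's code.

```python
# Hedged sketch of the per-user holdout described in the paper: for every
# user, two randomly chosen ratings go to validation and test, the rest to
# training. Filtering users with too few ratings is out of scope here.
import pandas as pd

def per_user_holdout(ratings: pd.DataFrame, seed: int = 0):
    """Split a (user, item, rating) table into train/val/test."""
    shuffled = ratings.sample(frac=1.0, random_state=seed)
    # Rank each user's ratings after shuffling; the first two become
    # validation and test, respectively.
    rank = shuffled.groupby("user").cumcount()
    val = shuffled[rank == 0]
    test = shuffled[rank == 1]
    train = shuffled[rank >= 2]
    return train, val, test

# Toy usage with hypothetical data:
ratings = pd.DataFrame({
    "user": [1, 1, 1, 2, 2, 2, 2],
    "item": [10, 11, 12, 10, 13, 14, 15],
    "rating": [4, 5, 3, 2, 5, 4, 1],
})
train, val, test = per_user_holdout(ratings)
print(len(train), len(val), len(test))
```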
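For the Experiment Setup row, the quoted MovieLens-1M matrix-factorization settings (SGD, learning rate 2.0, batch size 3000, 10 epochs, embedding dimension 12 per the quote) can be wired together as below. This PyTorch sketch assumes a squared-error loss, a synthetic data pipeline, and approximate dataset sizes; none of those details appear in the excerpt.

```python
# Hedged PyTorch sketch of the MF training setup quoted above for
# MovieLens-1M: SGD, learning rate 2.0, batch size 3000, 10 epochs. The
# loss, initialization, and data pipeline are assumptions about details
# the excerpt leaves open.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

N_USERS, N_ITEMS, DIM = 6040, 3706, 12  # MovieLens-1M sizes, approximate

class MF(nn.Module):
    def __init__(self):
        super().__init__()
        self.user = nn.Embedding(N_USERS, DIM)
        self.item = nn.Embedding(N_ITEMS, DIM)

    def forward(self, u, i):
        # Predicted rating as a dot product of user and item embeddings.
        return (self.user(u) * self.item(i)).sum(dim=-1)

# Hypothetical ratings tensors; in practice these come from the dataset.
users = torch.randint(0, N_USERS, (10_000,))
items = torch.randint(0, N_ITEMS, (10_000,))
ratings = torch.rand(10_000) * 4 + 1
loader = DataLoader(TensorDataset(users, items, ratings),
                    batch_size=3000, shuffle=True)

model = MF()
opt = torch.optim.SGD(model.parameters(), lr=2.0)  # lr 2.0 per the quote
loss_fn = nn.MSELoss()                             # squared loss is assumed

for epoch in range(10):                            # 10 epochs per the quote
    for u, i, r in loader:
        opt.zero_grad()
        loss_fn(model(u, i), r).backward()
        opt.step()
```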