Rankmax: An Adaptive Projection Alternative to the Softmax Function
Authors: Weiwei Kong, Walid Krichene, Nicolas Mayoraz, Steffen Rendle, Li Zhang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we studied how well Rankmax performs as a multilabel classification loss, and compared it to both Softmax and Sparsemax [26]. For evaluation, we chose a recommender system task where the goal is to learn which movies (=labels) to recommend to a user (=example). We experimented with Movielens datasets [15], namely the datasets of 100K, 20M, and 1B ratings, the latter being artificially generated from the 20M dataset [4]. |
| Researcher Affiliation | Collaboration | Weiwei Kong (Georgia Institute of Technology, wwkong@gatech.edu); Walid Krichene (Google Research, walidk@google.com); Nicolas Mayoraz (Google Research, nmayoraz@google.com); Steffen Rendle (Google Research, srendle@google.com); Li Zhang (Google Research, liqzhang@google.com) |
| Pseudocode | Yes | Finally, the index t can be computed in O(n log k), as detailed in Algorithm 1 in the supplement. (An illustrative projection sketch follows below the table.) |
| Open Source Code | No | The paper provides no concrete access to source code for the described methodology (no repository link, explicit code-release statement, or code in the supplementary materials). |
| Open Datasets | Yes | We experimented with Movielens datasets [15], namely the datasets of 100K, 20M, and 1B ratings, the latter being artificially generated from the 20M dataset [4]. Basic statistics about the datasets are summarized in Table 1. |
| Dataset Splits | Yes | The datasets were partitioned into 80% training, 10% cross-validation and 10% test. (A minimal split sketch follows below the table.) |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types, memory amounts, or other detailed machine specifications) were provided for running the experiments. |
| Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers) were provided. |
| Experiment Setup | Yes | Hyper-parameters were tuned based on the cross-validation set. Figure 2 illustrates the evolution of AP@10 and R@100 over the course of training, for different learning rates. |
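
The Pseudocode row above cites Algorithm 1 in the paper's supplement for computing the index t in O(n log k); that algorithm is not reproduced here. As a rough illustration only, the sketch below computes a Euclidean projection onto a capped simplex {p : sum(p) = 1, 0 <= p_i <= 1/k}, which reduces to the sparsemax (simplex) projection at k = 1. The capped-simplex formulation, the bisection search, and the function name are assumptions made for illustration, not the authors' Algorithm 1.

```python
# Hedged sketch: Euclidean projection onto a capped simplex.
# Assumption: the projection has the form p_i = clip(a_i - t, 0, 1/k)
# for a threshold t chosen so the entries sum to 1; t is found here by
# bisection rather than by the paper's O(n log k) procedure.
import numpy as np

def capped_simplex_projection(a, k, n_iter=100):
    """Project `a` onto {p : sum(p) = 1, 0 <= p_i <= 1/k}.

    With k = 1 this is the ordinary simplex projection used by sparsemax;
    larger k caps each coordinate at 1/k.
    """
    a = np.asarray(a, dtype=float)
    cap = 1.0 / k
    # sum(clip(a - t, 0, cap)) is non-increasing in t; bracket the root.
    lo, hi = a.min() - cap, a.max()
    for _ in range(n_iter):
        t = 0.5 * (lo + hi)
        if np.clip(a - t, 0.0, cap).sum() > 1.0:
            lo = t   # threshold too small: total mass exceeds 1
        else:
            hi = t   # threshold large enough: total mass is at most 1
    return np.clip(a - 0.5 * (lo + hi), 0.0, cap)

# Example: with k = 2, no single label can receive more than half the mass.
scores = np.array([3.0, 1.0, 0.2, -0.5])
print(capped_simplex_projection(scores, k=2))  # approximately [0.5, 0.5, 0.0, 0.0]
```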
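
The Dataset Splits row reports an 80/10/10 partition into training, cross-validation, and test sets. The quoted excerpt does not specify how the partition was drawn; the sketch below assumes a simple per-rating random split of a MovieLens-style ratings table, with the function name and pandas-based representation chosen only for illustration.

```python
# Hedged sketch: random 80/10/10 split of a ratings table.
# Assumption: each rating row is assigned independently at random;
# the paper does not state whether the split was per rating or per user.
import numpy as np
import pandas as pd

def split_ratings(ratings: pd.DataFrame, seed: int = 0):
    """Assign each rating to train (80%), validation (10%), or test (10%)."""
    rng = np.random.default_rng(seed)
    u = rng.random(len(ratings))
    train = ratings[u < 0.8]
    valid = ratings[(u >= 0.8) & (u < 0.9)]
    test = ratings[u >= 0.9]
    return train, valid, test
```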