Fast Algorithm for Generalized Multinomial Models with Ranking Data
Authors: Jiaqi Gu, Guosheng Yin
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on synthetic data and real data demonstrate the advantages of our Markov chain based algorithm over existing ones. Our algorithm converges to the MLE with fewer iterations and at a faster convergence rate. |
| Researcher Affiliation | Academia | Jiaqi Gu 1 Guosheng Yin 1 1Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong SAR, China. Correspondence to: Jiaqi Gu <u3005743@hku.hk>, Guosheng Yin <gyin@hku.hk>. |
| Pseudocode | Yes | Algorithm 1 (Markov chain based algorithm). Input: observations {(A_j, c_j) : j = 1, ..., n}; compute {W_i, L_i} for each c_i. Initialize p = (1/d, ..., 1/d)^T and Σ(p) = 0_{d×d}. Repeat: for i ∈ {1, ..., d}, for i′ ∈ {1, ..., d}\{i}, update σ_{i′i}(p); compute σ_{ii}(p) (i = 1, ..., d) and normalize Σ(p) so that for each i, Σ_{i′=1}^{d} σ_{ii′}(p) = 1; set p ← T(p) under the transition matrix Σ(p). Until convergence. |
| Open Source Code | No | The paper does not explicitly provide a link or statement confirming the availability of its source code. |
| Open Datasets | Yes | We first study the performance of these algorithms on the NASCAR data (Hunter, 2004). This dataset contains the full rankings of drivers in 36 NASCAR racing rounds in 2002, where 43 out of a total of 87 drivers competed in each round. (...) To compare the empirical performance and scalability of Algorithm 1 with existing methods, we apply them to the well-known sushi datasets (Kamishima, 2003) and a Hong Kong Jockey Club (HKJC) horse racing dataset. |
| Dataset Splits | No | The paper describes the total sample sizes and replication counts for experiments but does not specify train, validation, or test data splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | We carry out experiments with d = 8 basic cells, 4 cases of experimental setups, and 1,000 samples per case. The sample size is set as n = 280 × 2^k, where k takes values from 0 to 9 to allow n to increase. (...) The average number of iterations needed for different algorithms to converge under different levels of small union incompleteness is exhibited in Figure 4. It shows that the lower the ratio \|C\|/d, the larger the number of iterations other algorithms need to converge, while the number of iterations for our algorithm to converge is stable as the ratio \|C\|/d changes. Even in the most severe situation where \|C\|/d is far smaller than 1, the number of iterations is less than 10. Thus, our algorithm can resist the negative impact of small union incompleteness in data. |
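The pseudocode entry above describes an iteration of the form p ← T(p) under a transition matrix Σ(p) built from win/loss counts. The paper's exact update for σ_{i′i}(p) is not recoverable from the extracted text, but the general idea can be illustrated with a Rank-Centrality-style sketch: build a Markov chain whose jump probabilities reflect pairwise win counts, then take its stationary distribution as the score vector. Everything below (the function name, the `wins` matrix convention, the 1/d damping) is an illustrative assumption, not the authors' Algorithm 1.

```python
import numpy as np

def markov_chain_rank(wins, tol=1e-10, max_iter=10_000):
    """Sketch of a Markov-chain ranking estimator.

    wins[a, b] = number of times item a beat item b.  Scores are the
    stationary distribution of a chain that jumps from i to j in
    proportion to how often j beat i (Rank-Centrality-style; NOT the
    exact update of the paper's Algorithm 1).
    """
    d = wins.shape[0]
    P = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            if i != j:
                total = wins[i, j] + wins[j, i]
                if total > 0:
                    # the 1/d factor keeps every row sum at most 1
                    P[i, j] = wins[j, i] / (d * total)
        P[i, i] = 1.0 - P[i].sum()  # lazy self-loop closes the row

    p = np.full(d, 1.0 / d)        # uniform start, as in the pseudocode
    for _ in range(max_iter):
        p_next = p @ P             # one transition step, p <- T(p)
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p
```

On a small win matrix the iteration converges in a handful of steps and orders items by their stationary mass, which matches the report's observation that the Markov-chain approach needs few iterations even under heavy incompleteness.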