Generalization Error Bound for Hyperbolic Ordinal Embedding
Authors: Atsushi Suzuki, Atsushi Nitanda, Jing Wang, Linchuan Xu, Kenji Yamanishi, Marc Cavazza
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compared EOE and HOE on two types of ordinal datasets (R2 and Star). The ordinal data are generated according to (15) and (11), where f is given by (28) with α = 0.25 and the dissimilarity is defined by a metric. The metric for R2 is the distance matrix of 20 points in R^2, where we expect both HOE and EOE to have small R_Z((z_n)_{n=1}^N). The metric for Star is the distance matrix of a star graph with 20 nodes and random edge weights, where we expect HOE to have smaller R_Z((z_n)_{n=1}^N) than EOE, since a star graph is a special tree. We set ψ(x) = x^2 for EOE and ψ(x) = cosh x for HOE. The ordinal data size is S = 100, 200, 400, 800, 1600. We set both the batch size and the number of epochs in stochastic gradient descent to 1000. The learning rate was selected from {0.1, 1.0, 10.0} by grid search, following (Suzuki et al., 2019). We ran each dataset 10 times and report the average error in Table 1 (triplet classification error, %). EOE obtains smaller errors than HOE on R2, owing to EOE's smaller excess risk. On Star, HOE shows larger errors than EOE for small S (100, 200, 400) but smaller errors than EOE for large S (800, 1600). This is in line with the analysis in Section 3.4. A runnable sketch of this setup appears after the table. |
| Researcher Affiliation | Academia | ¹School of Computing and Mathematical Sciences, Faculty of Liberal Arts and Sciences, University of Greenwich, United Kingdom. ²Department of Artificial Intelligence, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, Japan. ³Department of Computing, The Hong Kong Polytechnic University, Hong Kong. ⁴Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Japan. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | The paper describes generating its own ordinal data sets based on specific metrics ('R2' and 'Star' datasets), but does not provide access information (link, DOI, formal citation with author/year for public release) for these generated datasets or any other publicly available datasets used in the experiments. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages or counts for training, validation, and test sets) needed to reproduce the data partitioning. It only mentions the total ordinal data size S for experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | The ordinal data size is S = 100, 200, 400, 800, 1600. We set both the batch size and the number of epochs in stochastic gradient descent to 1000. The learning rate was selected from {0.1, 1.0, 10.0} by grid search, following (Suzuki et al., 2019). A sketch of this training loop follows below. |
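To make the reported protocol concrete, here is a minimal Python sketch of the data generation and the two comparison functions described above. All function names (`r2_metric`, `star_metric`, `make_triplets`, `eoe_score`, `hoe_score`) are hypothetical, and the paper's exact sampling scheme via (15), (11), and (28) is not reproduced; we simply sample correctly ordered triplets uniformly. The HOE score assumes the standard hyperboloid model, where cosh of the hyperbolic distance between points u, v equals u₀v₀ − Σ_{i≥1} uᵢvᵢ; since ψ is monotone increasing in both cases, comparing ψ(d) is equivalent to comparing d.

```python
import numpy as np

rng = np.random.default_rng(0)

def r2_metric(n=20, d=2):
    """Distance matrix of n random points in R^d (the 'R2' dataset uses d = 2)."""
    X = rng.normal(size=(n, d))
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

def star_metric(n_leaves=19):
    """Shortest-path distance matrix of a star graph with random edge weights
    (node 0 is the hub, nodes 1..n_leaves are leaves)."""
    w = rng.uniform(0.5, 1.5, size=n_leaves)
    D = np.zeros((n_leaves + 1, n_leaves + 1))
    D[0, 1:] = D[1:, 0] = w
    D[1:, 1:] = w[:, None] + w[None, :]   # leaf-to-leaf path goes through the hub
    np.fill_diagonal(D, 0.0)
    return D

def make_triplets(D, S):
    """Sample S ordinal triplets (i, j, k) satisfying D[i, j] < D[i, k]."""
    n = D.shape[0]
    out = []
    while len(out) < S:
        i, j, k = rng.choice(n, size=3, replace=False)
        if D[i, j] < D[i, k]:
            out.append((i, j, k))
        elif D[i, k] < D[i, j]:
            out.append((i, k, j))
    return np.array(out)

def lift(z):
    """Lift R^d points onto the hyperboloid {x : -x_0^2 + sum_i x_i^2 = -1, x_0 > 0}."""
    x0 = np.sqrt(1.0 + np.sum(z * z, axis=-1, keepdims=True))
    return np.concatenate([x0, z], axis=-1)

def eoe_score(Z, a, b):
    """EOE comparison value: psi(x) = x^2, i.e. squared Euclidean distance."""
    diff = Z[a] - Z[b]
    return np.sum(diff * diff, axis=-1)

def hoe_score(Z, a, b):
    """HOE comparison value: psi(x) = cosh x, computed on the hyperboloid as
    cosh d(u, v) = u_0 v_0 - sum_{i>=1} u_i v_i."""
    u, v = lift(Z[a]), lift(Z[b])
    return u[..., 0] * v[..., 0] - np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def triplet_error(Z, triplets, score):
    """Triplet classification error: fraction of triplets the embedding misorders."""
    i, j, k = triplets.T
    return np.mean(score(Z, i, j) >= score(Z, i, k))

if __name__ == "__main__":
    D = star_metric()
    triplets = make_triplets(D, S=400)
    Z = 0.1 * rng.normal(size=(D.shape[0], 2))   # untrained 2-D embedding
    print("HOE error of a random embedding:", triplet_error(Z, triplets, hoe_score))
```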
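The training and grid-search driver below reuses the helpers from the sketch above and mirrors only the reported hyperparameters: batch size and epoch count of 1000, learning rates {0.1, 1.0, 10.0}, data sizes S = 100 to 1600, and averaging over 10 runs. The logistic triplet loss, plain SGD update, and evaluation on freshly sampled held-out triplets are assumptions for illustration; the excerpt does not specify the paper's loss or evaluation split. Only the EOE gradient is shown, since the hyperbolic gradient requires Riemannian machinery beyond a short sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

def eoe_grad(Z, batch):
    """Gradient of the mean logistic triplet loss (an assumed surrogate)
    log(1 + exp(||z_i - z_j||^2 - ||z_i - z_k||^2)) with respect to Z."""
    i, j, k = batch.T
    dij, dik = Z[i] - Z[j], Z[i] - Z[k]
    margin = np.sum(dij * dij, axis=1) - np.sum(dik * dik, axis=1)
    margin = np.clip(margin, -30.0, 30.0)          # avoid overflow in exp
    coef = (1.0 / (1.0 + np.exp(-margin)) / len(batch))[:, None]  # sigmoid / batch size
    g = np.zeros_like(Z)
    np.add.at(g, i,  2.0 * coef * (dij - dik))
    np.add.at(g, j, -2.0 * coef * dij)
    np.add.at(g, k,  2.0 * coef * dik)
    return g

def train_eoe(n, triplets, lr, dim=2, epochs=1000, batch_size=1000):
    """Plain SGD, mirroring the reported batch size / epoch count of 1000."""
    Z = 0.1 * rng.normal(size=(n, dim))
    for _ in range(epochs):
        batch = triplets[rng.choice(len(triplets), size=min(batch_size, len(triplets)))]
        Z -= lr * eoe_grad(Z, batch)
    return Z

# Grid search over the reported learning rates, 10 runs each, averaged test error.
D = r2_metric()                                    # from the previous sketch
for S in (100, 200, 400, 800, 1600):
    train, test = make_triplets(D, S), make_triplets(D, 1000)
    best = min(
        np.mean([triplet_error(train_eoe(D.shape[0], train, lr), test, eoe_score)
                 for _ in range(10)])
        for lr in (0.1, 1.0, 10.0)
    )
    print(f"S={S}: best average test error {best:.3f}")
```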