Learning Multiple Maps from Conditional Ordinal Triplets

Authors: Dung D. Le, Hady W. Lauw

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on public datasets showcase the utility of collaborative learning over baselines that learn multiple maps independently. Our objective is primarily to investigate the effectiveness of multiple maps for conditional ordinal embedding. We experiment with three public datasets that could model varying perceptions of similarity.
Researcher Affiliation | Academia | Dung D. Le and Hady W. Lauw, School of Information Systems, Singapore Management University, Singapore. {ddle.2015, hadywlauw}@smu.edu.sg
Pseudocode | Yes | Algorithm 1 SCORE:
1: Initialize x_t for t ∈ T and y_i for i ∈ I.
2: While not converged:
3:   Draw a triplet ⟨i, j, k⟩_t randomly from N.
4:   Compute the likelihood:
5:     l^t_ijk = δ_t · σ^t_ijk + (1 − δ_t) · σ_ijk.
6:   Compute the partial derivatives:
7:     ∇_z = ∂L/∂z for each z ∈ {x_t, y_i, y_j, y_k}.
8:   Update the model parameters:
9:     z ← R_z(ϵ · Proj_z(∇_z)), for z ∈ {x_t, y_i, y_j, y_k};
10:    δ_t ← δ_t + ϵ · (σ^t_ijk − σ_ijk); δ_t ← arg min_{δ ∈ [0,1]} |δ_t − δ|.
11: Return {x_t} for t ∈ T and {y_i} for i ∈ I.
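The mixture likelihood (line 5 of Algorithm 1) and the mixing-weight update with its projection onto [0, 1] (line 10) can be sketched in plain Python. This is a schematic illustration only: it assumes Euclidean coordinates and a generic logistic triplet probability, and it omits the paper's vMF model, the coordinate gradients, and the retraction/projection operators R_z and Proj_z. All function and variable names are illustrative, not the authors' implementation.

```python
import numpy as np

def sigma(yi, yj, yk):
    """Logistic probability that j is closer to i than to k
    (a stand-in for the paper's sigma terms)."""
    d_ij = np.sum((yj - yi) ** 2)
    d_jk = np.sum((yj - yk) ** 2)
    return 1.0 / (1.0 + np.exp(d_ij - d_jk))

def score_step(Y_t, Y, delta_t, triplet, eps=0.05):
    """One schematic SCORE-style update for a triplet <i, j, k>_t.
    Y_t: aspect-specific coordinates; Y: shared coordinates;
    delta_t: per-aspect mixing weight in [0, 1].
    Coordinate gradient steps are omitted here."""
    i, j, k = triplet
    s_t = sigma(Y_t[i], Y_t[j], Y_t[k])   # aspect-map likelihood term
    s = sigma(Y[i], Y[j], Y[k])           # shared-map likelihood term
    # Mixture likelihood, as in line 5:
    l = delta_t * s_t + (1.0 - delta_t) * s
    # Gradient step on the mixing weight (line 10), then clip to [0, 1]:
    delta_t = float(np.clip(delta_t + eps * (s_t - s), 0.0, 1.0))
    return l, delta_t
```

The clip plays the role of arg min_{δ ∈ [0,1]} |δ_t − δ|, i.e. the nearest point of the feasible interval.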
Open Source Code | No | The paper does not provide an explicit statement about releasing the source code of their method (SCORE), nor a link to a code repository; it only provides links for the comparative methods.
Open Datasets | Yes | We experiment with three public datasets that could model varying perceptions of similarity. Zoo contains 17 attributes of 101 animals (excluding animal name). We model each attribute as a similarity aspect. For attribute t, we form the triplet ⟨i, j, k⟩_t if i and j have the same attribute value, which is different from k's. There are 3.24 × 10^6 triplets. Congressional Voting Records (or House Vote) contains 435 instances (congressmen) and 16 attributes (voting issues). After excluding instances with missing values, we get 232 fully observed instances of 16 attributes. We generate triplets in the same way as for the Zoo dataset, inducing 2.4 × 10^7 triplets in total. Paris Attractions contains 237 users organizing 250 Paris attractions into clusters. With each user as an aspect, we induce 3.48 × 10^5 triplets, each involving two attractions i and j that the user puts into the same cluster and another attraction k in a different cluster. As in [Yue et al., 2014], we exclude attractions uninteresting to users.
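The triplet construction described for Zoo and House Vote (i and j share the attribute value, k differs) can be sketched as follows. This is a reconstruction from the quoted description, not the authors' code:

```python
from itertools import combinations

def triplets_for_attribute(values):
    """Enumerate <i, j, k>_t triplets for one attribute t:
    objects i and j share the attribute value, k has a different one."""
    out = []
    for i, j in combinations(range(len(values)), 2):
        if values[i] != values[j]:
            continue  # i and j must agree on attribute t
        for k in range(len(values)):
            if values[k] != values[i]:
                out.append((i, j, k))
    return out
```

For example, four animals with attribute values [0, 0, 1, 1] yield the four triplets (0,1,2), (0,1,3), (2,3,0), (2,3,1).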
Dataset Splits | Yes | The preservation accuracy for an aspect t is the fraction of its ordinal triplets N_t for which t's coordinates reflect the correct direction; the fewer the violated triplets, the higher the accuracy. The overall accuracy is the average of the aspects' preservation accuracies (Eq. 10):

|{⟨i, j, k⟩_t ∈ N_t : ||y^t_j − y^t_i|| < ||y^t_j − y^t_k||}| / |N_t|,   (10)

where y^t_i, y^t_j, y^t_k are t's embedding coordinates of objects i, j, k. Since in practice we may not observe all triplets, or even all objects, beforehand, we sample a fraction r (split ratio) of objects for each aspect, then evaluate the coordinates against the full set of triplets. As the default for this study, we set r = 0.5, which strikes a relative balance between the information an aspect sees and the information it could learn from others. Later we also investigate the effects of different r values. We average the results across 30 random samples.
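Eq. (10) amounts to counting, per aspect, the triplets whose distance ordering is preserved. A minimal sketch (assumed names; the coordinates Y_t are whatever the method produced for aspect t):

```python
import numpy as np

def preservation_accuracy(Y_t, triplets):
    """Fraction of ordinal triplets <i, j, k>_t satisfying Eq. (10):
    ||y^t_j - y^t_i|| < ||y^t_j - y^t_k||, given aspect t's
    coordinate matrix Y_t (one row per object)."""
    if not triplets:
        return 0.0
    preserved = sum(
        np.linalg.norm(Y_t[j] - Y_t[i]) < np.linalg.norm(Y_t[j] - Y_t[k])
        for i, j, k in triplets
    )
    return preserved / len(triplets)
```

The overall accuracy in the paper is then the mean of this quantity over all aspects t.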
Hardware Specification | Yes | For the Paris Attractions dataset, including all aspects, SCORE takes 5 minutes on a PC with an Intel Core i5 3.2 GHz CPU and 12 GB RAM.
Software Dependencies | No | The paper mentions tools used for the comparative methods, such as SOE (R package 'loe') and t-STE, but does not provide specific version numbers for these or for any other software dependencies relevant to its own implementation (SCORE).
Experiment Setup | Yes | We tune the parameters of all methods for their best performance on the training data. For SCORE, the setting is κ = 10^−3 for Paris Attractions and 0 for Zoo and House Vote, vMF mean vector µ = (0, 0, 1), learning rate ϵ = 0.05, and scaling factor α = 30. For SOE, the scaling factor is 0.1 for all datasets. For t-STE, the learning rate and regularization parameter are 2 and 0, respectively, for all datasets. For MVTE, the learning rate is 1 for all datasets. For MVMDS, γ = 5 for all datasets.
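For reference, the SCORE settings quoted above could be collected into a single configuration object, e.g. as below. The layout and key names are purely illustrative; only the values come from the paper's setup description.

```python
# Illustrative config layout for the reported SCORE hyperparameters.
SCORE_PARAMS = {
    # kappa is dataset-specific: 10^-3 for Paris Attractions, 0 otherwise.
    "kappa": {"Paris Attractions": 1e-3, "Zoo": 0.0, "House Vote": 0.0},
    "vmf_mean": (0.0, 0.0, 1.0),  # vMF mean vector mu
    "learning_rate": 0.05,        # epsilon
    "alpha": 30,                  # scaling factor
}
```

Keeping the per-dataset exception (kappa) in a nested mapping avoids scattering dataset checks through the training code.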