Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees

Authors: Sen Na, Yuwei Luo, Zhuoran Yang, Zhaoran Wang, Mladen Kolar

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on synthetic and real data corroborate our theoretical results and illustrate flexibility of the proposed representation learning model." The paper shows experimental results on synthetic and real-world data.
Researcher Affiliation | Academia | "(1) Department of Statistics, University of Chicago, Chicago IL, USA; (2) Department of Operations Research and Financial Engineering, Princeton University, Princeton NJ, USA; (3) Department of Industrial Engineering and Management Sciences, Northwestern University, Chicago IL, USA; (4) Booth School of Business, University of Chicago, Chicago IL, USA."
Pseudocode | No | The paper describes the gradient descent iteration but does not present it in a formally structured pseudocode or algorithm block.
Open Source Code | No | The paper does not provide concrete access to source code, nor does it state that the code for the methodology is released or available.
Open Datasets | Yes | "We consider three datasets: Mushroom, Segment and Covtype (Dua & Graff, 2017)."
Dataset Splits | No | The paper describes how samples are generated for synthetic data and how observation sets (Ω and Ω′) are used, but does not specify explicit train/validation/test splits with percentages, absolute counts, or references to predefined standard splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments, only general statements about computations.
Software Dependencies | No | The paper does not provide specific software dependencies or their version numbers that would be necessary to replicate the experiments.
Experiment Setup | Yes | "We fix d = d1 = d2 = 50, r = 3, n1 = n2 = 400, and use ReLU as the activation function for NSMC and NIMC. For NSMC and SMC, we randomly generate two independent sample sets with m = 1000 observations... We fix d = d1 = d2 = 30, r = 2, n1 = n2 = 400, and choose tanh as the activation function. We generate features x and z independently from a Gaussian mixture model with four components... We sample y from a binomial model with NB = 20. We fix observed sample size m = 1000... We set the activation function φ to be tanh for all data sets. For NSMC, we first uniformly sample two independent sets of items with n1 = n2 = 1000. Then we generate independent observation sets Ω and Ω′ with size m = 5000."
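To make the quoted synthetic setup concrete, the sketch below reproduces its first configuration (d = d1 = d2 = 50, r = 3, n1 = n2 = 400, ReLU activation, two independent observation sets of m = 1000 entries). The paper's exact generative model is not quoted above, so the low-rank nonlinear link A = ReLU(X U Vᵀ Zᵀ) and the uniform entry sampling are illustrative assumptions, not the authors' specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions quoted in the paper's synthetic NSMC/NIMC setup.
d = 50        # feature dimension (d = d1 = d2)
r = 3         # latent rank
n1 = n2 = 400 # items on each side of the bipartite graph
m = 1000      # observations per sample set

# Hypothetical generative sketch (assumed, not from the paper):
# a generic low-rank nonlinear link A = relu(X @ U @ V.T @ Z.T).
X = rng.standard_normal((n1, d))
Z = rng.standard_normal((n2, d))
U = rng.standard_normal((d, r)) / np.sqrt(d)
V = rng.standard_normal((d, r)) / np.sqrt(d)

relu = lambda t: np.maximum(t, 0.0)  # ReLU activation, as quoted
A = relu(X @ U @ V.T @ Z.T)          # full n1 x n2 interaction matrix


def sample_omega(m, n1, n2, rng):
    """Draw m observed (row, col) index pairs uniformly at random."""
    rows = rng.integers(0, n1, size=m)
    cols = rng.integers(0, n2, size=m)
    return rows, cols


# Two independent observation sets, as in "two independent sample sets
# with m = 1000 observations".
omega = sample_omega(m, n1, n2, rng)
omega_prime = sample_omega(m, n1, n2, rng)
y_obs = A[omega]  # observed entries on the first set
```

Under this sketch, `A` is the dense ground-truth matrix and only the `m` entries indexed by each observation set would be visible to an estimator.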