Federated Neuro-Symbolic Learning

Authors: Pengwei Xing, Songtao Lu, Han Yu

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments based on both synthetic and real-world data demonstrate significant advantages of FedNSL compared to five state-of-the-art methods. It outperforms the best baseline by 17% and 29% in terms of unbalanced average training accuracy and unseen average testing accuracy, respectively.
Researcher Affiliation | Collaboration | College of Computing and Data Science, Nanyang Technological University, Singapore; IBM Thomas J. Watson Research Center, Yorktown Heights, USA. Correspondence to: Pengwei Xing <pengwei001@e.ntu.edu.sg>, Songtao Lu <songtao@ibm.com>, Han Yu <han.yu@ntu.edu.sg>.
Pseudocode | Yes | Algorithm 1: FedNSL
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a code repository.
Open Datasets | Yes | For the real-data experiment, the paper uses the document-level DWIE (Deutsche Welle Information Extraction) dataset (Zaporojets et al., 2021), pre-processed following the methods outlined in (Ru et al., 2021).
Dataset Splits | No | The paper specifies training and testing splits ('within each client, the documents are further divided into training and testing subsets, maintaining a 3:1 ratio') but does not explicitly mention a separate validation split or how it is handled. (See the split sketch after the table.)
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU or GPU models) used to run its experiments.
Software Dependencies | No | The paper mentions 'a transformer model' and 'neural network classifiers' but does not specify any software dependencies with version numbers (e.g., PyTorch 1.x, Python 3.x, CUDA).
Experiment Setup | Yes | For the numeric experiment, the paper uses an integrated model setup combining deep learning classifiers with a GMM tailored to a federated learning context. The setup features two neural network classifiers, each with an input dimension of 2 to accommodate the two-dimensional features of the synthetic dataset, a hidden layer of 64 units to capture complex data patterns without overfitting, and an output layer of 3 units with a softmax function for 3-class classification. In parallel, the GMM is configured with 3 components to match the dataset's 3 classes; the means are initialized from preliminary data analysis or classifier outputs, and the covariance matrices are set to reflect the initial data variance, facilitating adaptive learning through the EM procedure. (See the model sketch after the table.)