A Semantic Loss Function for Deep Learning with Symbolic Knowledge
Authors: Jingyi Xu, Zilu Zhang, Tal Friedman, Yitao Liang, Guy Van den Broeck
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An experimental evaluation shows that it effectively guides the learner to achieve (near-)state-of-the-art results on semi-supervised multi-class classification. Moreover, it significantly increases the ability of the neural network to predict structured objects, such as rankings and paths. These discrete concepts are tremendously difficult to learn, and benefit from a tight integration of deep learning and symbolic reasoning methods. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA 2Peking University, Beijing, China. |
| Pseudocode | No | The paper includes diagrams of circuits (Figure 3, Figure 4) but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code to reproduce all the experiments in this paper can be found at https://github.com/UCLA-StarAI/Semantic-Loss/. |
| Open Datasets | Yes | Specifically, the MNIST base model is a fully-connected multilayer perceptron (MLP), with layers of size 784-1000-500-250-250-250-10. On CIFAR-10, it is a 10-layer convolutional neural network (CNN) with 3-by-3 padded filters. ... MNIST, FASHION, and CIFAR-10 datasets. |
| Dataset Splits | Yes | For all semi-supervised experiments, we use the standard 10,000 held-out test examples provided in the original datasets and randomly pick 10,000 from the standard 60,000 training examples (50,000 for CIFAR-10) as validation set. ... Grids ... with a 60/20/20 train/validation/test split. ... Preference Learning ... We again split the data 60/20/20 into train/validation/test splits. (A sketch of the semi-supervised holdout appears below the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions techniques and algorithms like "ReLU (Nair & Hinton, 2010), batch normalization (Ioffe & Szegedy, 2015), and Adam optimization (Kingma & Ba, 2015)", but it does not specify any software libraries or frameworks with their version numbers that would be necessary for replication. |
| Experiment Setup | Yes | Specifically, the MNIST base model is a fully-connected multilayer perceptron (MLP), with layers of size 784-1000-500-250-250-250-10. On CIFAR-10, it is a 10-layer convolutional neural network (CNN) with 3-by-3 padded filters. After every 3 layers, features are subject to a 2-by-2 max-pool layer with strides of 2. Furthermore, we use ReLU (Nair & Hinton, 2010), batch normalization (Ioffe & Szegedy, 2015), and Adam optimization (Kingma & Ba, 2015) with a learning rate of 0.002. ... When evaluating on MNIST, we run experiments for 20 epochs, with a batch size of 10. ... We use a batch size of 100 samples of which half are unlabeled. Experiments are run for 100 epochs. (The MNIST model and optimizer settings are sketched below the table.) |
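The Dataset Splits row quotes a simple holdout procedure: keep the standard 10,000-example test set and randomly move 10,000 of the 60,000 training examples (50,000 for CIFAR-10) into a validation set. The sketch below illustrates that step only; it is not the authors' released code, and the function name, seed, and dummy arrays are placeholders.

```python
# Minimal sketch of the quoted semi-supervised holdout (assumed procedure):
# randomly hold out n_valid examples from the training set as validation data.
import numpy as np

def split_train_valid(x_train, y_train, n_valid=10_000, seed=0):
    """Return (train, valid) tuples after randomly holding out n_valid examples."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x_train))
    valid_idx, train_idx = perm[:n_valid], perm[n_valid:]
    return ((x_train[train_idx], y_train[train_idx]),
            (x_train[valid_idx], y_train[valid_idx]))

# Example with MNIST-sized dummy arrays: 50,000 examples remain for training,
# 10,000 go to validation; the standard 10,000-example test set stays untouched.
x = np.zeros((60_000, 784), dtype=np.float32)
y = np.zeros(60_000, dtype=np.int64)
train, valid = split_train_valid(x, y)
```

The Experiment Setup row specifies the MNIST base model (an MLP with layer sizes 784-1000-500-250-250-250-10, ReLU, batch normalization) trained with Adam at a learning rate of 0.002. The sketch below assembles those pieces; the paper does not name a framework, so the choice of PyTorch and the exact ordering of batch normalization relative to the activation are assumptions, not the authors' implementation.

```python
# Sketch of the MNIST base model described in the Experiment Setup row.
# Layer sizes and the Adam learning rate come from the paper; framework and
# layer ordering (Linear -> BatchNorm -> ReLU) are assumptions.
import torch
import torch.nn as nn

def make_mnist_mlp(sizes=(784, 1000, 500, 250, 250, 250, 10)):
    layers = []
    for i in range(len(sizes) - 2):
        layers += [nn.Linear(sizes[i], sizes[i + 1]),
                   nn.BatchNorm1d(sizes[i + 1]),
                   nn.ReLU()]
    layers.append(nn.Linear(sizes[-2], sizes[-1]))  # final 10-way logits
    return nn.Sequential(*layers)

model = make_mnist_mlp()
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)  # learning rate from the paper
```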
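The semantic loss term described in the paper would be added to the usual cross-entropy objective during training; the snippets above cover only the split and base-model configuration that the reproducibility table documents.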