Semi-Supervised Learning with Declaratively Specified Entropy Constraints
Authors: Haitian Sun, William W. Cohen, Lidong Bing
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared to prior frameworks for specifying SSL techniques, our technique achieves consistent improvements on a suite of well-studied SSL benchmarks, and obtains a new state-of-the-art result on a difficult relation extraction task. In the next section, we first introduce several SSL constraints in a uniform notation. Then, in Section 3, we experiment with some benchmark text categorization tasks to illustrate the effectiveness of the constraints. Finally, in Section 4, we generalize our model to a difficult relation extraction task in the drug and disease domains, where we obtain state-of-the-art results using this framework. |
| Researcher Affiliation | Collaboration | Haitian Sun, Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, haitians@cs.cmu.edu; Lidong Bing, R&D Center Singapore, Machine Intelligence Technology, Alibaba DAMO Academy, l.bing@alibaba-inc.com; William W. Cohen, Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, wcohen@cs.cmu.edu |
| Pseudocode | No | The paper describes rules in a declarative language (TensorLog) and provides diagrams of computational graphs, but it does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides links to an open-source Bayesian optimization tool used by the authors, data splits, and code for baseline models, but it does not provide an explicit statement or link to the open-source code for the DCE-Learner developed in this paper. |
| Open Datasets | Yes | Following [3], we consider SSL performance on three widely-used benchmark datasets for classification of hyperlinked text: Citeseer, Cora, and PubMed [21]. We ran experiments on two datasets in the drug and disease domains, respectively: DailyMed with 28,590 articles and WikiDisease with 8,596 articles. These datasets are described in [5]. We directly employ the preprocessed corpora from [5], which contain shallow features such as tokens from the sentence containing the noun phrase and unigrams/bigrams from a window around it, as well as features derived from dependency parses. (Footnote 4: Data available at http://www.cs.cmu.edu/~lbing/#data_aaai-2016_diebolds) |
| Dataset Splits | No | The paper states 'We take 20 labeled examples for each class as training data, and reserve 1,000 examples as test data. Other examples are treated as unlabeled.' and 'the w_i are hyper-parameters that will be tuned with Bayesian Optimization [22]'. While Bayesian Optimization implies the use of a validation set, the paper does not explicitly describe the specific data split used for validation; a hypothetical sketch of the stated split appears after the table. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components like TensorLog, TensorFlow, and Spearmint, but does not provide specific version numbers for any of them, which is required for a reproducible description. |
| Experiment Setup | Yes | During training, we have five losses: supervised classification, ER, NBER, LPER, and COLPER. These are combined with different weights of importance: `l_total = l_predict + w1 * l_ER + w2 * l_NBER + w3 * l_LPER + w4 * l_COLPER`, where the w_i are hyper-parameters that will be tuned with Bayesian Optimization [22] (see the sketch after the table). In the drug domain, we take 500 relation and type examples for each class as labeled data and randomly select 2,000 unlabeled examples for each constraint. In the disease domain, we take 2,000 labeled relation and type examples for each class and 4,000 unlabeled examples for each constraint. |
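
The Dataset Splits row quotes a split of 20 labeled examples per class, 1,000 reserved test examples, and the remainder unlabeled. The sketch below reproduces only those stated numbers; the `make_ssl_split` helper and its arguments are hypothetical, and no validation set is shown because the paper does not describe how one was carved out for Bayesian Optimization.

```python
import random
from collections import defaultdict

def make_ssl_split(examples, labels, n_per_class=20, n_test=1000, seed=0):
    """Hypothetical split matching the numbers quoted above: 20 labeled
    examples per class for training, 1,000 reserved for test, and the
    rest treated as unlabeled. Not the paper's actual code."""
    rng = random.Random(seed)
    idx = list(range(len(examples)))
    rng.shuffle(idx)

    # Group shuffled indices by class label.
    by_class = defaultdict(list)
    for i in idx:
        by_class[labels[i]].append(i)

    # Take the first n_per_class shuffled indices of each class as labeled data.
    train = [i for ids in by_class.values() for i in ids[:n_per_class]]
    train_set = set(train)

    # Reserve n_test of the remaining examples for test; the rest are unlabeled.
    rest = [i for i in idx if i not in train_set]
    return train, rest[:n_test], rest[n_test:]
```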
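
To make the Experiment Setup row concrete, here is a minimal sketch of the weighted loss combination it quotes. Only the formula comes from the paper; the function name, argument names, and the search-space bounds are assumptions made for illustration.

```python
def total_loss(l_predict, l_er, l_nber, l_lper, l_colper, weights):
    """Combine the supervised classification loss with the four
    constraint losses (ER, NBER, LPER, COLPER) via scalar weights.
    Hypothetical helper; only the formula is from the paper."""
    return (l_predict
            + weights["w1"] * l_er
            + weights["w2"] * l_nber
            + weights["w3"] * l_lper
            + weights["w4"] * l_colper)

# The w_i are hyper-parameters tuned with Bayesian Optimization (the paper
# mentions Spearmint); the (0, 1) bounds below are an assumed search space.
search_space = {f"w{i}": (0.0, 1.0) for i in range(1, 5)}
```

Because the combination is plain arithmetic, the same function works whether the loss inputs are Python floats or TensorFlow tensors.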