Learning with Explanation Constraints

Authors: Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina F. Balcan, Pradeep Ravikumar

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments."
Researcher Affiliation | Collaboration | Rattana Pukdee (Carnegie Mellon University, rpukdee@cs.cmu.edu); Dylan Sam (Carnegie Mellon University, dylansam@andrew.cmu.edu); J. Zico Kolter (Carnegie Mellon University and Bosch Center for AI, zkolter@cs.cmu.edu); Maria-Florina Balcan (Carnegie Mellon University, ninamf@cs.cmu.edu); Pradeep Ravikumar (Carnegie Mellon University, pkr@cs.cmu.edu)
Pseudocode | Yes | "Algorithm 1: Algorithm for identifying parameters of a two-layer neural network, given exact gradient constraints."
Open Source Code | No | "[C]ode to replicate our experiments will be released with the full paper."
Open Datasets | Yes | "We present classification tasks (Figure 5) from a weak supervision benchmark [46]." [46] J. Zhang, Y. Yu, Y. Li, Y. Wang, Y. Yang, M. Yang, and A. Ratner. WRENCH: A comprehensive benchmark for weak supervision. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 2021.
Dataset Splits | No | The paper mentions using "Test splits" from the WRENCH benchmark but does not itself specify the exact percentages, sample counts, or split methodology.
Hardware Specification | No | The paper does not report the hardware used for its experiments (GPU/CPU models, processor speeds, or memory amounts).
Software Dependencies | No | The paper does not list the software dependencies (e.g., library or solver names with versions, such as Python 3.8 or CPLEX 12.4) needed to replicate the experiments.
Experiment Setup | Yes | "For all of our synthetic and real-world experiments, we use values of m = 1000, k = 20, T = 3, τ = 0, λ = 1, unless otherwise noted. For our synthetic experiments, we use d = 100, σ² = 5. Our two-layer neural networks have hidden dimensions of size 10. They are trained with a learning rate of 0.01 for 50 epochs. For our real-world data, our two-layer neural networks have a hidden dimension of size 10 and are trained with a learning rate of 0.1 (YouTube) and 0.1 (Yelp) for 10 epochs. λ = 0.01, and gradient values are computed by the smoothed approximation in [?] with c = 1."
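
For concreteness, the hyperparameters quoted in the Experiment Setup row (two-layer network with hidden dimension 10, d = 100, σ² = 5, learning rate 0.01, 50 epochs, λ = 1) could be wired up roughly as follows. This is a minimal sketch assuming a PyTorch implementation; the data-generating process, the constraint penalty, and all variable names are illustrative placeholders, not the paper's actual code (which had not been released at submission time).

```python
import torch
import torch.nn as nn

# Hyperparameter values as reported for the synthetic experiments.
# Variable names here are illustrative, not taken from the paper.
d, hidden_dim = 100, 10   # input dimension and hidden-layer size
lr, epochs = 0.01, 50     # learning rate and training epochs
lam = 1.0                 # weight on the explanation-constraint penalty (lambda)
sigma2 = 5.0              # noise variance reported for the synthetic data

# Placeholder synthetic data; the paper's actual generating process
# and explanation constraints are not reproduced here.
n = 1000
X = torch.randn(n, d)
y = X[:, :1] + sigma2 ** 0.5 * torch.randn(n, 1)

# Two-layer network matching the reported architecture.
model = nn.Sequential(nn.Linear(d, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1))
opt = torch.optim.SGD(model.parameters(), lr=lr)
mse = nn.MSELoss()

def constraint_penalty(model: nn.Module, X: torch.Tensor) -> torch.Tensor:
    """Stand-in for the explanation-constraint term (e.g., a penalty on
    model gradients); the paper's actual penalty is not reconstructed here."""
    return torch.zeros(())

for _ in range(epochs):
    opt.zero_grad()
    # Standard regression loss plus the lambda-weighted constraint term.
    loss = mse(model(X), y) + lam * constraint_penalty(model, X)
    loss.backward()
    opt.step()
```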