Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Deep Reasoning Networks for Unsupervised Pattern De-mixing with Constraint Reasoning
Authors: Di Chen, Yiwei Bai, Wenting Zhao, Sebastian Ament, John Gregoire, Carla Gomes
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the power of DRNets on two pattern de-mixing tasks: disentangling two overlapping hand-written Sudokus (Multi-MNIST-Sudoku) and inferring crystal structures of materials from X-ray diffraction data (Crystal-Structure Phase-Mapping). All the experiments are performed on one NVIDIA Tesla V100 GPU with 16GB memory. We demonstrate the potential of DRNets on two de-mixing tasks with detailed experimental results. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Cornell University, Ithaca, New York, USA; ²California Institute of Technology, Pasadena, California, USA. Correspondence to: Di Chen <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Constraint-aware stochastic gradient descent optimization of deep reasoning networks. Input: (i) Data points $\{x_i\}_{i=1}^{N}$. (ii) Constraint graph. (iii) Penalty functions $\psi^l(\cdot)$ and $\psi^g_j(\cdot)$ for the local and the global constraints. (iv) Pre-trained or parametric generative decoder $G(\cdot)$. 1: Initialize the penalty weights $\lambda^l$, $\lambda^g_j$ and thresholds for all constraints. 2: for number of optimization iterations do 3: Batch data points $\{x_1, \ldots, x_m\}$ from the randomly sampled (maximal) connected components. 4: Collect the global penalty functions $\{\psi^g_j(\cdot)\}_{j=1}^{M}$ concerning those data points. 5: Compute the latent space $\{\phi_\theta(x_1), \ldots, \phi_\theta(x_m)\}$ from the encoder. 6: Adjust the penalty weights $\lambda^l$, $\lambda^g_j$ and thresholds accordingly. 7: minimize $\frac{1}{m}\sum_{i=1}^{m} \mathcal{L}(G(\phi_\theta(x_i)), x_i) + \lambda^l \psi^l(\phi_\theta(x_i)) + \sum_{j=1}^{M} \lambda^g_j \psi^g_j(\{\phi_\theta(x_k) \mid k \in S_j\})$ using any standard gradient-based optimization method and update the parameters $\theta$. 8: end for |
| Open Source Code | No | The paper does not provide an explicit statement or a link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | Multi-MNIST-Sudoku: We generated 160,000 input data points for each training set, validation set and test set, where each data point corresponds to a 32x32 image of overlapping digits coming from MNIST (LeCun et al., 1998) and every 16 data points form a 4-by-4 overlapping Sudokus. |
| Dataset Splits | Yes | We generated 160,000 input data points for each training set, validation set and test set, where each data point corresponds to a 32x32 image of overlapping digits coming from MNIST (LeCun et al., 1998) and every 16 data points form a 4-by-4 overlapping Sudokus. |
| Hardware Specification | Yes | All the experiments are performed on one NVIDIA Tesla V100 GPU with 16GB memory. |
| Software Dependencies | No | The paper mentions using Adam optimizer but does not provide specific version numbers for software dependencies like deep learning frameworks or libraries. |
| Experiment Setup | Yes | For the training process of our DRNets, we select a learning rate in {0.0001, 0.0005, 0.001} with Adam optimizer (Kingma & Ba, 2014), for all the experiments. [...] The reasoning loss enforces the Sudoku rules and includes the continuous relaxation of the cardinality (2 × 16 cells) and All-Different (2 × (4 rows + 4 columns + 4 boxes)) constraints for every 16 data points, with initial weights of 0.01 and 1.0, respectively. [...] In this task, we used the Jensen-Shannon distance (JS distance) with a weight of 20.0 plus the L2-distance with a weight of 0.05 as the reconstruction loss. We use the JS distance since the location of peaks are the most important characteristics of a phase pattern and mismatching peaks would cause a large JS distance. [...] Due to the different noise level, we use different weights for Gibbs Rule (1.0 and 30.0) and Phase Field Connectivity (0.01 and 3.0) for Al-Li-Fe oxide system and Bi-Cu-V oxide system respectively. |
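
The Pseudocode and Experiment Setup rows quote an objective (step 7 of Algorithm 1) and a weighted reconstruction loss (JS distance with weight 20.0 plus L2-distance with weight 0.05). The following is a minimal NumPy sketch of how those two quantities could be computed, not the authors' implementation. Several details are assumptions, because the quoted text does not specify them: the function names are hypothetical, the JS distance is taken as the square root of the Jensen-Shannon divergence with natural logarithm, patterns are normalized to sum to one before comparison, the L2-distance is the Euclidean norm, and the global penalty sum is placed outside the batch average.

```python
import numpy as np


def js_distance(p, q, eps=1e-12):
    """Jensen-Shannon distance: sqrt of the JS divergence (natural log).

    Assumption: inputs are non-negative patterns that are normalized
    here so they can be treated as probability distributions.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    m = 0.5 * (p + q)

    def kl(a, b):
        # Only sum where a > 0; there b = m >= a / 2 > 0, so the ratio is safe.
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return float(np.sqrt(max(divergence, 0.0)))


def reconstruction_loss(recon, target, w_js=20.0, w_l2=0.05):
    """Weighted JS distance plus L2-distance, using the quoted weights.

    Assumption: "L2-distance" means the Euclidean norm of the difference.
    """
    recon = np.asarray(recon, dtype=float)
    target = np.asarray(target, dtype=float)
    l2 = float(np.linalg.norm(recon - target))
    return w_js * js_distance(recon, target) + w_l2 * l2


def batch_objective(recon_losses, local_penalties, global_penalties,
                    lam_local, lam_global):
    """Step-7 style objective for one batch.

    Mean over data points of (reconstruction loss + lam_local * local penalty),
    plus the weighted sum of global penalties. The grouping of the global
    sum outside the batch average is an assumption.
    """
    recon_losses = np.asarray(recon_losses, dtype=float)
    local_penalties = np.asarray(local_penalties, dtype=float)
    per_point = recon_losses + lam_local * local_penalties
    global_term = float(np.dot(np.asarray(lam_global, dtype=float),
                               np.asarray(global_penalties, dtype=float)))
    return float(per_point.mean() + global_term)
```

In this sketch, a reconstruction whose single peak sits at the wrong position, for example `reconstruction_loss([1, 0, 0], [0, 1, 0])`, is dominated by the JS term, which matches the quoted motivation that mismatching peaks cause a large JS distance.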