Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Geometric Mixture Models for Electrolyte Conductivity Prediction

Authors: Anyi Li, Jiacheng Cen, Songyou Li, Mingze Li, Yang Yu, Wenbing Huang

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments demonstrate that GeoMix consistently outperforms diverse baselines (including MLPs, GNNs, and geometric GNNs) across both datasets, validating the importance of cross-molecular geometric interactions and equivariant message passing for accurate property prediction. This work not only establishes new benchmarks for electrolyte research but also provides a general geometric learning framework that advances modeling of mixture systems in energy materials, pharmaceutical development, and beyond.
Researcher Affiliation | Collaboration | Anyi Li (1, 2, 3), Jiacheng Cen (1, 2, 3), Songyou Li (1, 2, 3), Mingze Li (1, 2, 3), Yang Yu (4), Wenbing Huang (1, 2, 3). Affiliations: (1) Gaoling School of Artificial Intelligence, Renmin University of China; (2) Beijing Key Laboratory of Research on Large Models and Intelligent Governance; (3) Engineering Research Center of Next-Generation Intelligent Search and Recommendation, MOE; (4) Hisun Pharm.
Pseudocode | No | The paper describes the model architecture and its components (GIN) in text and through a diagram (Figure 1), but it does not include a distinct pseudocode block or algorithm section.
Open Source Code | Yes | Our code and dataset are available at GitHub (footnote 2: https://github.com/GLAD-RUC/GeoMix).
Open Datasets | Yes | We curate and standardize two public datasets for electrolyte conductivity prediction: CALiSol [19] and DiffMix [11]. The datasets are carefully processed with geometric graph construction, forming a benchmark for electrolyte conductivity prediction.
Dataset Splits | Yes | For both CALiSol and DiffMix datasets, we adopt a standard random split of 70%/20%/10% into train, validation, and test sets, respectively. Splitting is performed independently for CALiSol and DiffMix with a fixed seed to ensure reproducibility. No electrolyte formulation is shared across different subsets.
Hardware Specification | Yes | We run our methods mainly on A100 80GB GPUs.
Software Dependencies | No | The paper describes the models used (e.g., EGNN, TFN, Adam optimizer) but does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries. For example, it states "All models are trained using the Adam optimizer with a learning rate of 5 × 10⁻⁵" and "We employ either EGNN [54] or TFN [55] as equivariant encoders", but gives no versions for these implementations.
Experiment Setup | Yes | All models are trained using the Adam optimizer with a learning rate of 5 × 10⁻⁵, weight decay of 1 × 10⁻¹², and a maximum of 500 training epochs. The default batch size is 1024, except for GeoMix models, which use a reduced batch size of 128 due to GPU memory constraints. We fix the random seed to 7 for all experiments to ensure reproducibility. EGNN-att: This variant employs a 4-layer EGNN backbone with 64 hidden dimensions per layer. The attention module takes 8 molecular embeddings as input, with 4 attention heads and 3 stacked attention layers. The resulting representation is concatenated with temperature T and salt concentration c, and then passed through a 3-layer MLP for prediction. GeoMix-EGNN: GeoMix-EGNN adopts a multi-channel EGNN as the backbone, with 8 equivariant channels. Other hyperparameters follow the EGNN-att setting. The GIN consists of 3 layers. The Noisy Nodes loss [57] is applied with its hyperparameter set to 128.
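The split described under Dataset Splits (random 70%/20%/10%, fixed seed, no formulation shared across subsets) could be reproduced along the following lines. This is a minimal sketch, not the authors' code: the function name `split_formulations`, the use of hashable formulation identifiers, and the default seed of 7 (borrowed from the Experiment Setup entry, since the seed used for splitting is not stated separately) are all assumptions made for illustration.

```python
import random
from typing import Hashable, Sequence


def split_formulations(
    formulation_ids: Sequence[Hashable],
    fractions: tuple = (0.7, 0.2, 0.1),
    seed: int = 7,  # assumption: the report only says "a fixed seed"
) -> tuple:
    """Randomly split unique formulation ids into train/validation/test lists."""
    # Deduplicate (keeping first-seen order) so that one formulation
    # can never end up in more than one subset.
    unique_ids = list(dict.fromkeys(formulation_ids))

    # A private Random instance makes the shuffle depend only on the seed,
    # not on any global random state.
    rng = random.Random(seed)
    rng.shuffle(unique_ids)

    n_total = len(unique_ids)
    n_train = round(fractions[0] * n_total)
    n_val = round(fractions[1] * n_total)

    train = unique_ids[:n_train]
    val = unique_ids[n_train:n_train + n_val]
    test = unique_ids[n_train + n_val:]  # the remainder forms the test set
    return train, val, test


if __name__ == "__main__":
    # Hypothetical identifiers standing in for one dataset's formulations.
    ids = [f"formulation-{i}" for i in range(100)]
    train, val, test = split_formulations(ids)
    print(len(train), len(val), len(test))
```

Because splitting is said to be performed independently per dataset, the function would be called once for CALiSol and once for DiffMix. Splitting on formulation identifiers rather than on individual measurements is what keeps a formulation from appearing in two subsets.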
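The hyperparameters quoted under Experiment Setup can be collected into a small configuration object, which makes the difference between the EGNN-att and GeoMix-EGNN settings easier to see. The sketch below is illustrative only: the class and field names are hypothetical and do not come from the paper or its repository, and the learning rate and weight decay are read as 5 × 10⁻⁵ and 1 × 10⁻¹², which is an interpretation of the extracted text. The Noisy Nodes hyperparameter is left out because its symbol is not recoverable from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Shared training settings (field names are hypothetical)."""
    optimizer: str = "adam"
    learning_rate: float = 5e-5   # interpreted as 5 x 10^-5
    weight_decay: float = 1e-12   # interpreted as 1 x 10^-12
    max_epochs: int = 500
    batch_size: int = 1024        # default for non-GeoMix models
    seed: int = 7


@dataclass(frozen=True)
class EGNNAttConfig(TrainConfig):
    """EGNN-att variant as described in the report."""
    egnn_layers: int = 4
    hidden_dim: int = 64
    num_molecule_embeddings: int = 8  # inputs to the attention module
    attention_heads: int = 4
    attention_layers: int = 3
    mlp_layers: int = 3               # prediction head after adding T and c


@dataclass(frozen=True)
class GeoMixEGNNConfig(EGNNAttConfig):
    """GeoMix-EGNN: other hyperparameters follow the EGNN-att setting."""
    batch_size: int = 128             # reduced due to GPU memory constraints
    equivariant_channels: int = 8
    gin_layers: int = 3


if __name__ == "__main__":
    print(EGNNAttConfig())
    print(GeoMixEGNNConfig())
```

Modelling GeoMix-EGNN as a subclass mirrors the statement that its remaining hyperparameters follow the EGNN-att setting: only the batch size is overridden, and the channel count and GIN depth are added.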