Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ZIN: When and How to Learn Invariance Without Environment Partition?

Authors: Yong Lin, Shengyu Zhu, Lu Tan, Peng Cui

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on both synthetic and real world datasets validate our analysis and demonstrate an improved performance of the proposed framework. |
| Researcher Affiliation | Collaboration | Yong Lin (HKUST), Shengyu Zhu (Huawei Noah's Ark Lab), Lu Tan (Tsinghua University), Peng Cui (Tsinghua University) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Codes are available at https://github.com/linyongver/ZIN_official. |
| Open Datasets | Yes | House Price Prediction. This experiment considers a real world regression dataset of house sales prices from Kaggle (https://www.kaggle.com/c/house-prices-advanced-regression-techniques). |
| Dataset Splits | Yes | The dataset is split according to the built year, with training dataset in period [1900, 1950] and test dataset in period (1950, 2000]. We normalize the prices of the houses with the same built year, and our target is to predict the normalized price. We choose the built year as auxiliary information for ZIN. As there is no well-defined ground-truth partition for IRM, we manually split the training dataset equally into 5 segments with 10-year range in each segment. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments were provided. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer [23]' and implementing models as 'MLP' or 'ResNet-18 [16]', but does not provide specific version numbers for software libraries or frameworks such as PyTorch, TensorFlow, or Python. |
| Experiment Setup | Yes | The number of inferred environments K is set to be 2 as default. ... For the hyperparameters, we use the Adam optimizer [23] with default parameters and a learning rate of 1e-4. The batch size is set to 128 for all datasets. |
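The "Dataset Splits" entry describes a concrete procedure: split the house-price data by built year into a [1900, 1950] training period and a (1950, 2000] test period, normalize prices within each built year, and manually partition the training years into 5 ten-year segments as IRM environments. A minimal sketch of that split, assuming a pandas DataFrame with hypothetical `year_built` and `price` columns (names not given in the report), could look like:

```python
import numpy as np
import pandas as pd

def split_house_data(df):
    """Sketch of the split described in the paper's experiment section.

    Assumes columns `year_built` and `price` (illustrative names).
    """
    df = df.copy()
    # Normalize price within each built-year group, so the target is the
    # normalized price of houses sharing a built year.
    df["norm_price"] = df.groupby("year_built")["price"].transform(
        lambda p: (p - p.mean()) / (p.std() + 1e-8)
    )
    # Train on built years in [1900, 1950], test on (1950, 2000].
    train = df[(df["year_built"] >= 1900) & (df["year_built"] <= 1950)].copy()
    test = df[(df["year_built"] > 1950) & (df["year_built"] <= 2000)].copy()
    # Manual 5-way partition of the training period into 10-year segments,
    # used as the environment labels for the IRM baseline.
    train["env"] = ((train["year_built"] - 1900) // 10).clip(0, 4).astype(int)
    return train, test
```

How year 1950 is binned (here it falls into the last segment) is an assumption; the report only states five equal 10-year segments.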