Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Deep Tree Tensor Networks

Authors: Chang Nie

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The proposed model is evaluated across multiple benchmarks and domains, demonstrating superior performance compared to both peer methods and state-of-the-art architectures. Our code is publicly available at https://github.com/NieCha/deep_tree_tensor_network. ... In this section, we provide a comprehensive assessment of the DTTN's effectiveness. Specifically, in Section 4.1, we perform experiments on a series of image classification benchmarks to validate the model's superiority over other multilinear and TN architectures. In Section 4.2, we show the broader impact of the DTTN across other domains, including recommendation system and partial differential equation (PDE) solving. Section 4.3 presents ablation studies aimed at examining the impact of various design choices.
Researcher Affiliation	Academia	Chang Nie Nanjing University of Science and Technology Nanjing, China EMAIL
Pseudocode	Yes	In Algorithm 1, we provide a PyTorch-style implementation of AIM.
Open Source Code	Yes	Our code is publicly available at https://github.com/NieCha/deep_tree_tensor_network.
Open Datasets	Yes	A series of benchmarks with different types, scales, and resolutions are employed for the experiments, including CIFAR-10 [30], Tiny ImageNet [31], ImageNet-100 [58], ImageNet-1K [46], MNIST, and Fashion-MNIST [56]. A detailed description of the benchmarks and training configurations is included in the supplement.
Dataset Splits	Yes	CIFAR-10 [30] consists of 60K color images of 32×32 resolution across 10 classes, with 50K images for training and 10K for testing. ... MNIST and Fashion-MNIST [56] both contain 60,000 grayscale images for training and 10,000 for validation, with each image sized at 28×28.
Hardware Specification	Yes	All experiments were conducted on 8 GeForce RTX 3090 GPUs using native PyTorch. ... Table 10 presents a detailed comparison of inference latency and memory usage, benchmarked on an NVIDIA A6000 GPU.
Software Dependencies	No	All experiments were conducted on 8 GeForce RTX 3090 GPUs using native PyTorch. ... We utilize the timm library and ensure all settings are aligned with the comparison method MONet [7].
Experiment Setup	Yes	Table 11: Training settings for ImageNet-1K in Section 4.1. Item: Setting. Optimizer: AdamW; Base learning rate: 1e-3; Warmup-lr: 1e-6; Learning rate schedule: cosine; Weight decay: 0.01 & 0.02; Batch size: 320 × 4 GPUs; Label smoothing: 0.1; Auto augmentation; Random erase: 0.1; Cutmix: 0.5; Mixup: 0.5; Dropout: 0.0. ... We train our model using the SGD optimizer with a batch size of 128 for 160 epochs. The MultiStepLR strategy is applied to adjust the learning rate, and data augmentation settings are in accordance with [10].
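
To make the ImageNet-1K settings quoted in the Experiment Setup row easier to read, the following is a minimal sketch of a warmup-plus-cosine learning-rate schedule using the quoted base learning rate (1e-3) and warmup learning rate (1e-6). It is not the authors' implementation. The warmup length, total epoch count, and minimum learning rate are not given in the quoted text, so `WARMUP_EPOCHS`, `TOTAL_EPOCHS`, and `MIN_LR` are hypothetical placeholders, and the batch-size entry is read as 320 samples per GPU on 4 GPUs, which is an interpretation.

```python
import math

# Values quoted from Table 11 of the paper.
BASE_LR = 1e-3
WARMUP_LR = 1e-6
PER_GPU_BATCH = 320  # assumption: "320" is the per-GPU batch size
NUM_GPUS = 4

# Hypothetical placeholders; the quoted text does not state these.
WARMUP_EPOCHS = 5
TOTAL_EPOCHS = 300
MIN_LR = 0.0


def lr_at_epoch(epoch: int) -> float:
    """Linear warmup from WARMUP_LR to BASE_LR, then cosine decay to MIN_LR."""
    if epoch < WARMUP_EPOCHS:
        return WARMUP_LR + (BASE_LR - WARMUP_LR) * epoch / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return MIN_LR + 0.5 * (BASE_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))


# Total samples per optimizer step under the per-GPU reading above.
effective_batch_size = PER_GPU_BATCH * NUM_GPUS
```

Under these assumptions, the schedule starts at the warmup learning rate, reaches the base learning rate once warmup ends, and decays smoothly afterwards. In practice the timm library mentioned in the Software Dependencies row provides its own schedulers, which the paper's code may use instead.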