Efficient Compact Bilinear Pooling via Kronecker Product

Authors: Tan Yu, Yunfeng Cai, Ping Li (pp. 3170-3178)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Systematic experiments on four public benchmarks using two backbones demonstrate the efficiency and effectiveness of the proposed method in fine-grained recognition.
Researcher Affiliation | Industry | Tan Yu, Yunfeng Cai, Ping Li; Cognitive Computing Lab, Baidu Research; 10900 NE 8th St., Bellevue, Washington 98004, USA; No.10 Xibeiwang East Road, Beijing 100193, China; {tanyu01, caiyunfeng, liping11}@baidu.com
Pseudocode | Yes | Algorithm 1: Tensor Modal Product. 1: Input: r, X ∈ R^(d×N), Â ∈ R^((a/r)×(d/r)). 2: Output: T = [I_r ⊗ Â]X. 3: Reshape X into a tensor 𝒳 ∈ R^(N×(d/r)×r). 4: Perform the modal product 𝒯 = 𝒳 ×_2 Â ×_3 I_r. 5: Unfold the tensor 𝒯 along mode 1, and set T = 𝒯_(1). (A hedged NumPy sketch of this computation is given after the table.)
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | We conduct experiments on four public benchmarks for fine-grained recognition including FGVC-Aircraft (AIR) (Maji et al. 2013), CUB-200-2011 (CUB) (Wah et al. 2011), MIT scene dataset (Quattoni and Torralba 2009), and Describable Texture Dataset (DTD) (Cimpoi et al. 2014).
Dataset Splits | No | The paper mentions using 'public benchmarks' but does not explicitly specify the training/validation/test dataset splits (e.g., percentages or sample counts) within the text.
Hardware Specification | Yes | The experiments are conducted on a single NVIDIA Titan X (Pascal) GPU card.
Software Dependencies | No | The paper states that the method is 'implemented in the PaddlePaddle platform' but does not provide specific version numbers for PaddlePaddle or any other software dependencies.
Experiment Setup | Yes | We adopt a two-phase training scheme. In the first phase, we only update the parameters of the TKPF and classifier layers. In the second phase, we fine-tune the parameters of all layers. Each image is resized to 448 × 448... By default, we set a = b = 96, that is, D = 96^2. We set r = 32 by default when using the VGG16 backbone. Considering both effectiveness and efficiency, we set Q = 2 by default. (A hypothetical configuration sketch summarizing these defaults follows the table.)
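
For reference, the following is a minimal NumPy sketch of the tensor modal product quoted in the Pseudocode row, assuming X ∈ R^(d×N) with d divisible by r and Â ∈ R^((a/r)×(d/r)). The function name and variable names are illustrative; the paper releases no code, so this is only an assumed reading of Algorithm 1, not the authors' implementation.

```python
import numpy as np

def tensor_modal_product(r, X, A_hat):
    """Sketch of Algorithm 1: compute T = (I_r kron A_hat) @ X
    without materializing the a x d Kronecker factor."""
    d, N = X.shape
    assert d % r == 0, "d must be divisible by r"
    # Split every column of X into r consecutive blocks of length d/r
    # (the reshape step of Algorithm 1, written block-wise).
    blocks = X.T.reshape(N, r, d // r)                 # (N, r, d/r)
    # Apply A_hat to each block; this is the modal product with A_hat,
    # while the product with I_r leaves the block index untouched.
    mapped = np.einsum('nrd,ad->nra', blocks, A_hat)   # (N, r, a/r)
    # Unfold back to a matrix of shape (a, N), with a = r * (a/r).
    a = r * A_hat.shape[0]
    return mapped.reshape(N, a).T

# Quick self-check against the explicit Kronecker product.
r, d, N = 4, 8, 3
A_hat = np.random.randn(3, d // r)                     # a/r = 3, so a = 12
X = np.random.randn(d, N)
T_fast = tensor_modal_product(r, X, A_hat)
T_ref = np.kron(np.eye(r), A_hat) @ X
assert np.allclose(T_fast, T_ref)
```

The self-check at the end illustrates why the reshaping trick is attractive: the block-wise product touches only an (a/r) × (d/r) matrix, whereas the explicit Kronecker factor is a × d.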
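The defaults quoted in the Experiment Setup row can be collected into a small configuration dictionary for anyone attempting a reproduction. This is a hypothetical sketch; none of the key names come from the authors' code, since no code is released.

```python
# Hypothetical reproduction defaults distilled from the Experiment Setup row;
# key names are illustrative, not from any official release.
TKPF_DEFAULTS = {
    "input_size": (448, 448),       # each image is resized to 448 x 448
    "a": 96,
    "b": 96,                        # D = a * b = 96^2 pooled dimensions
    "r": 32,                        # default r for the VGG16 backbone
    "Q": 2,                         # default Q (effectiveness/efficiency trade-off)
    "training_phases": [
        {"phase": 1, "update": ["TKPF", "classifier"]},  # backbone frozen
        {"phase": 2, "update": ["all layers"]},          # full fine-tuning
    ],
}
```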