IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs

Authors: Yuzhen Mao, Martin Ester, Ke Li

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on CPUs on the LRA (Tay et al., 2020), ZeroSCROLLS (Shaham et al., 2023), and LongEval (Li et al., 2023) benchmarks. Across all three benchmarks, IceFormer demonstrates substantially faster inference speeds than existing methods while attaining almost no accuracy loss compared to the Transformer.
Researcher Affiliation | Academia | Yuzhen Mao, Martin Ester, Ke Li; School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada; {yuzhenm,ester,keli}@sfu.ca
Pseudocode | Yes | Appendix A: PSEUDOCODE FOR ICEFORMER
Open Source Code | Yes | The code is available on our project website at https://yuzhenmao.github.io/IceFormer/.
Open Datasets | Yes | We conduct experiments on CPUs on the LRA (Tay et al., 2020), ZeroSCROLLS (Shaham et al., 2023), and LongEval (Li et al., 2023) benchmarks.
Dataset Splits | Yes | In this experiment, we follow the train/test splits from Tay et al. (2020) and report the test dataset classification accuracy, average running time of the attention module, and CPU memory usage during inference for each task.
Hardware Specification | Yes | To ensure robustness of results, we used a variety of CPUs for our experiments: we used Intel(R) Core(TM) i7-6850K 6-Core for the LRA experiments, AMD Ryzen 9 5950X 16-Core for the ZeroSCROLLS experiments, and AMD Ryzen 9 5900X 12-Core for the LongEval experiments.
Software Dependencies | No | The paper mentions "PyTorch" but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | For LARA and Nyströmformer, we tuned the parameter num_landmarks by optimizing over the range {64, 128, 256, 512, 1024}. For H-Transformer-1D, we tuned the parameter block_size by optimizing over the range {64, 128, 256, 512, 1024}. For Reformer, we tuned the parameters num_hash and bucket_size: we considered the values of num_hash in range {1, 2, 4} and the values of bucket_size in range {64, 128, 256, 512, 1024}. For IceFormer, we tuned the parameter top_k over the range {3, 5, 8, 10, 15, 20}.
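
The tuning ranges quoted in the Experiment Setup row map naturally onto an exhaustive grid search. Below is a minimal sketch: the grids are taken verbatim from the excerpt, while the `evaluate` callable (returning, say, LRA test accuracy for one configuration) is hypothetical and only illustrates how such a search could be wired up; it is not the paper's tuning code.

```python
from itertools import product

# Hyperparameter grids quoted from the paper's experiment setup.
GRIDS = {
    "LARA":             {"num_landmarks": [64, 128, 256, 512, 1024]},
    "Nystromformer":    {"num_landmarks": [64, 128, 256, 512, 1024]},
    "H-Transformer-1D": {"block_size":    [64, 128, 256, 512, 1024]},
    "Reformer":         {"num_hash": [1, 2, 4], "bucket_size": [64, 128, 256, 512, 1024]},
    "IceFormer":        {"top_k": [3, 5, 8, 10, 15, 20]},
}

def grid_search(evaluate, grid):
    """Evaluate every combination in `grid` with the (hypothetical) `evaluate`
    callable and return the best (score, config) pair."""
    best = None
    for values in product(*grid.values()):
        config = dict(zip(grid.keys(), values))
        score = evaluate(**config)
        if best is None or score > best[0]:
            best = (score, config)
    return best
```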
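For context on the top_k parameter, the sketch below shows generic top-k sparse attention, where each query attends only to its top_k highest-scoring keys. This is an illustration of the general idea only, not IceFormer's actual algorithm or its CPU-efficient key retrieval, which are specified in the paper's Appendix A pseudocode.

```python
import torch

def topk_attention(q, k, v, top_k=10):
    """Each query attends only to its top_k highest-scoring keys (illustrative)."""
    scores = (q @ k.transpose(-1, -2)) / (q.shape[-1] ** 0.5)       # (n_q, n_k) scaled dot products
    vals, idx = scores.topk(min(top_k, scores.shape[-1]), dim=-1)   # keep top_k keys per query
    weights = torch.softmax(vals, dim=-1)                           # softmax over retained keys only
    return torch.einsum("qk,qkd->qd", weights, v[idx])              # weighted sum of gathered values

# Toy usage: 4096-token sequence, 64-dim heads, each query attends to 10 keys.
q, k, v = (torch.randn(4096, 64) for _ in range(3))
out = topk_attention(q, k, v, top_k=10)   # shape (4096, 64)
```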