IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs

Authors: Yuzhen Mao, Martin Ester, Ke Li

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on CPUs on the LRA (Tay et al., 2020), ZeroSCROLLS (Shaham et al., 2023), and LongEval (Li et al., 2023) benchmarks. Across all three benchmarks, IceFormer demonstrates substantially faster inference speeds than existing methods while attaining almost no accuracy loss compared to the Transformer.
Researcher Affiliation | Academia | Yuzhen Mao, Martin Ester, Ke Li; School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada; {yuzhenm,ester,keli}@sfu.ca
Pseudocode | Yes | Appendix A: PSEUDOCODE FOR ICEFORMER
Open Source Code | Yes | The code is available on our project website at https://yuzhenmao.github.io/IceFormer/.
Open Datasets | Yes | We conduct experiments on CPUs on the LRA (Tay et al., 2020), ZeroSCROLLS (Shaham et al., 2023), and LongEval (Li et al., 2023) benchmarks.
Dataset Splits | Yes | In this experiment, we follow the train/test splits from Tay et al. (2020) and report the test dataset classification accuracy, average running time of the attention module, and CPU memory usage during inference for each task.
Hardware Specification | Yes | To ensure robustness of results, we used a variety of CPUs for our experiments: we used Intel(R) Core(TM) i7-6850K 6-Core for the LRA experiments, AMD Ryzen 9 5950X 16-Core for the ZeroSCROLLS experiments, and AMD Ryzen 9 5900X 12-Core for the LongEval experiments.
Software Dependencies | No | The paper mentions "PyTorch" but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | For LARA and Nyströmformer, we tuned the parameter num_landmarks by optimizing over the range {64, 128, 256, 512, 1024}. For H-Transformer-1D, we tuned the parameter block_size by optimizing over the range {64, 128, 256, 512, 1024}. For Reformer, we tuned the parameters num_hash and bucket_size: we considered the values of num_hash in range {1, 2, 4} and the values of bucket_size in range {64, 128, 256, 512, 1024}. For IceFormer, we tuned the parameter top_k over the range {3, 5, 8, 10, 15, 20}.
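
The tuning ranges quoted in the Experiment Setup row map naturally onto an exhaustive grid search. Below is a minimal sketch: the grids are taken verbatim from the excerpt, while the `evaluate` callable (returning, say, LRA test accuracy for one configuration) is hypothetical and only illustrates how such a search could be wired up; it is not the paper's tuning code.

```python
from itertools import product

# Hyperparameter grids quoted from the paper's experiment setup.
GRIDS = {
    "LARA":             {"num_landmarks": [64, 128, 256, 512, 1024]},
    "Nystromformer":    {"num_landmarks": [64, 128, 256, 512, 1024]},
    "H-Transformer-1D": {"block_size":    [64, 128, 256, 512, 1024]},
    "Reformer":         {"num_hash": [1, 2, 4], "bucket_size": [64, 128, 256, 512, 1024]},
    "IceFormer":        {"top_k": [3, 5, 8, 10, 15, 20]},
}

def grid_search(evaluate, grid):
    """Evaluate every combination in `grid` with the (hypothetical) `evaluate`
    callable and return the best (score, config) pair."""
    best = None
    for values in product(*grid.values()):
        config = dict(zip(grid.keys(), values))
        score = evaluate(**config)
        if best is None or score > best[0]:
            best = (score, config)
    return best
```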
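For context on the top_k parameter, the sketch below shows generic top-k sparse attention, where each query attends only to its top_k highest-scoring keys. This is an illustration of the general idea only, not IceFormer's actual algorithm or its CPU-efficient key retrieval, which are specified in the paper's Appendix A pseudocode.

```python
import torch

def topk_attention(q, k, v, top_k=10):
    """Each query attends only to its top_k highest-scoring keys (illustrative)."""
    scores = (q @ k.transpose(-1, -2)) / (q.shape[-1] ** 0.5)       # (n_q, n_k) scaled dot products
    vals, idx = scores.topk(min(top_k, scores.shape[-1]), dim=-1)   # keep top_k keys per query
    weights = torch.softmax(vals, dim=-1)                           # softmax over retained keys only
    return torch.einsum("qk,qkd->qd", weights, v[idx])              # weighted sum of gathered values

# Toy usage: 4096-token sequence, 64-dim heads, each query attends to 10 keys.
q, k, v = (torch.randn(4096, 64) for _ in range(3))
out = topk_attention(q, k, v, top_k=10)   # shape (4096, 64)
```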