LLaNA: Large Language and NeRF Assistant

Authors: Andrea Amaduzzi, Pierluigi Zama Ramirez, Giuseppe Lisanti, Samuele Salti, Luigi Di Stefano

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results show that processing NeRF weights performs favourably against extracting 2D or 3D representations from NeRFs. |
| Researcher Affiliation | Academia | CVLAB, University of Bologna |
| Pseudocode | No | The paper describes the methodology but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | Our newly introduced benchmark, the source code and the weights for all our models will be publicly released in case of acceptance. |
| Open Datasets | Yes | We also introduce a new NeRF language dataset, that we will make publicly available, to train LLaNA and test the capabilities of our assistant. ... It features paired NeRFs and language annotations for ShapeNet objects [8], in particular for all the 40K NeRFs available in the nf2vec dataset [61]. |
| Dataset Splits | Yes | ShapeNeRF-Text provides 30939, 3846 and 3859 objects for the train, validation and test sets, respectively. |
| Hardware Specification | Yes | Our model is implemented in PyTorch and trained on 4 NVIDIA A100 with 64GB of VRAM each. |
| Software Dependencies | No | The paper mentions software like PyTorch and NerfAcc, and pre-trained models like LLaMA 2 and LLaVA2-13b, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We optimize the projector weights and the embeddings for 3 epochs with a learning rate of 0.002 and batch size of 64. ... For this phase, we employ a learning rate of 0.0002 and a batch size of 16. (See the sketches below.) |
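As a quick sanity check on the Dataset Splits row, the quoted counts sum to 38,644 objects, consistent with the roughly 40K NeRFs of the nf2vec dataset, and correspond to an approximately 80/10/10 partition:

```python
# Split sizes quoted from the paper (ShapeNeRF-Text).
splits = {"train": 30939, "val": 3846, "test": 3859}

total = sum(splits.values())  # 38644, i.e. the ~40K NeRFs from nf2vec
for name, size in splits.items():
    print(f"{name}: {size} objects ({size / total:.1%})")
# train: 30939 objects (80.1%)
# val: 3846 objects (10.0%)
# test: 3859 objects (10.0%)
```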
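The Experiment Setup row quotes a two-stage schedule: projector and embedding training, followed by a lower-learning-rate phase. Since the code is not released, the PyTorch sketch below only illustrates how those quoted hyperparameters might be wired up; the projector, its 1024-to-4096 dimensions, the dummy data, and the MSE objective are all illustrative assumptions, not the paper's implementation.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins: the paper releases no code, so the projector and
# data below are dummies; only the schedule (3 epochs, lr = 2e-3 then 2e-4,
# batch sizes 64 then 16) is taken from the quoted Experiment Setup.
projector = nn.Linear(1024, 4096)           # assumed nf2vec-to-LLM-token mapping
dummy_nerf_feats = torch.randn(256, 1024)   # stand-in for nf2vec NeRF embeddings
dummy_targets = torch.randn(256, 4096)      # stand-in for the supervision signal

# Stage 1: optimize the projector (and, in the paper, the new token
# embeddings) for 3 epochs with lr = 0.002 and batch size 64.
loader = DataLoader(TensorDataset(dummy_nerf_feats, dummy_targets),
                    batch_size=64, shuffle=True)
opt = torch.optim.AdamW(projector.parameters(), lr=2e-3)
for epoch in range(3):
    for feats, target in loader:
        # MSE is a placeholder for the actual language-modeling loss.
        loss = nn.functional.mse_loss(projector(feats), target)
        opt.zero_grad()
        loss.backward()
        opt.step()

# Stage 2 (instruction tuning) reuses the same loop shape with
# lr = 0.0002 and batch size 16, per the quoted setup; the paper does
# not quote an epoch count for this phase.
loader = DataLoader(TensorDataset(dummy_nerf_feats, dummy_targets),
                    batch_size=16, shuffle=True)
opt = torch.optim.AdamW(projector.parameters(), lr=2e-4)
```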