Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation
Authors: Ruihan Gao, Kangle Deng, Gengshan Yang, Wenzhen Yuan, Jun-Yan Zhu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method in both text-to-3D and image-to-3D settings. Our experiments demonstrate that our method provides customized and realistic fine geometric textures while maintaining accurate alignment between two modalities of vision and touch. We present comprehensive experiments to verify the efficacy of our method. We perform qualitative and quantitative comparisons with existing baselines and ablation studies on the major components. |
| Researcher Affiliation | Academia | Ruihan Gao1 Kangle Deng1 Gengshan Yang1 Wenzhen Yuan2 Jun-Yan Zhu1 1Carnegie Mellon University 2University of Illinois Urbana-Champaign |
| Pseudocode | No | The paper describes the method in prose and mathematical equations but does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Our code and datasets are available on our project website. https://ruihangao.github.io/TactileDreamFusion/ |
| Open Datasets | Yes | Figure 2 shows our dataset TouchTexture collected from 18 daily objects with diverse tactile textures. Our dataset is available on our project website. https://ruihangao.github.io/TactileDreamFusion/ |
| Dataset Splits | No | The paper mentions training, validation, and testing phases but does not explicitly provide specific percentages, sample counts, or detailed methodologies for dataset splits for reproduction. |
| Hardware Specification | Yes | We train all models on A6000 GPUs and each experiment takes about 10 mins and 20G RAM to run. |
| Software Dependencies | Yes | We follow DreamBooth [85] to train texture LoRAs with Stable Diffusion (SD) V1.4 for the tactile guidance loss. ... We use ControlNet (v1.1 normalbae version) with SD V1.5 for the diffusion loss. |
| Experiment Setup | Yes | To initiate our texture grid, we start by only optimizing the visual matching loss $L_{\text{VM}}$ and tactile matching loss $L_{\text{TM}}$ for 150 iterations with $\lambda_{\text{VM}} = 500$ and $\lambda_{\text{TM}} = 1$. After that, we run optimization for another 50 iterations to refine the output guided by diffusion priors. We reduce $\lambda_{\text{TM}}$ from 1 to 0.05, change $L_{\text{VM}}$ from per-pixel error to mean-color error to allow more flexibility in texture refinement, and add the visual guidance loss $L_{\text{VG}}$ and tactile guidance loss $L_{\text{TG}}$ with $\lambda_{\text{VG}} = 5$ and $\lambda_{\text{TG}} = 0.05$. ... We train the texture field network using Adam optimizer with lr = 0.01. |
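
The Software Dependencies row names specific diffusion backbones (a DreamBooth-trained texture LoRA on SD V1.4, and ControlNet v1.1 normalbae with SD V1.5) but not a concrete software stack. Below is a minimal loading sketch using the Hugging Face `diffusers` library; the library choice, the Hub model IDs, and the LoRA weights path are assumptions for illustration, not the authors' configuration.

```python
import torch
from diffusers import (
    StableDiffusionPipeline,
    ControlNetModel,
    StableDiffusionControlNetPipeline,
)

# Assumption: Hugging Face `diffusers` with public Hub checkpoints. The paper only
# names the model versions, not this library or these repository IDs.

# SD V1.4 backbone carrying the DreamBooth-trained texture LoRA (tactile guidance).
tactile_pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
)
tactile_pipe.load_lora_weights("path/to/texture_lora")  # hypothetical LoRA weights path

# ControlNet v1.1 (normalbae) with SD V1.5 for the diffusion (visual guidance) loss.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_normalbae", torch_dtype=torch.float16
)
visual_pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)
```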
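
The Experiment Setup row above fully specifies a two-stage optimization schedule (iteration counts, loss weights, optimizer, and learning rate). The sketch below lays that schedule out in code. Only the numbers quoted from the paper are grounded; the texture field and the four loss functions are hypothetical placeholders standing in for the authors' implementation.

```python
import torch
import torch.nn as nn

# Placeholder texture field: a tiny MLP standing in for the paper's texture network.
texture_field = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 6))
optimizer = torch.optim.Adam(texture_field.parameters(), lr=0.01)  # lr from the paper

# Placeholder losses: each returns a scalar so the loop runs; the real losses compare
# rendered albedo/normal maps against reference views, tactile patches, and diffusion priors.
def visual_matching(net):   # L_VM (per-pixel in stage 1, mean-color in stage 2)
    return net(torch.rand(8, 3)).pow(2).mean()

def tactile_matching(net):  # L_TM against tactile normal patches
    return net(torch.rand(8, 3)).abs().mean()

def visual_guidance(net):   # L_VG: ControlNet (SD V1.5) diffusion prior
    return net(torch.rand(8, 3)).pow(2).mean()

def tactile_guidance(net):  # L_TG: texture LoRA (SD V1.4) diffusion prior
    return net(torch.rand(8, 3)).pow(2).mean()

# Stage 1: 150 iterations with matching losses only, lambda_VM = 500, lambda_TM = 1.
for _ in range(150):
    loss = 500 * visual_matching(texture_field) + 1.0 * tactile_matching(texture_field)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Stage 2: 50 refinement iterations guided by diffusion priors. lambda_TM drops to 0.05,
# L_VM switches to a mean-color error (stage-2 weight for L_VM is not stated in the
# excerpt), and the guidance losses are added with lambda_VG = 5 and lambda_TG = 0.05.
for _ in range(50):
    loss = (visual_matching(texture_field)
            + 0.05 * tactile_matching(texture_field)
            + 5.0 * visual_guidance(texture_field)
            + 0.05 * tactile_guidance(texture_field))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```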