3D Object Captioning

5 papers with code • 1 benchmark • 1 dataset

3D object captioning involves generating a natural language description of an object, given its point cloud representation.
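The input/output contract of the task can be illustrated with a minimal sketch. It assumes a point cloud stored as a list of (x, y, z) tuples and applies unit-sphere normalization, a common preprocessing step that is an assumption here, not something taken from the listed papers. The names `normalize` and `toy_caption` are hypothetical, and `toy_caption` is a rule-based stand-in that only inspects the bounding box; it is not a learned captioning model.

```python
from typing import List, Tuple

# One point is (x, y, z). Real datasets often append color or normals (assumption).
Point = Tuple[float, float, float]


def normalize(points: List[Point]) -> List[Point]:
    """Center the cloud at its centroid and scale it into the unit sphere."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    centered = [(x - cx, y - cy, z - cz) for x, y, z in points]
    radius = max((x * x + y * y + z * z) ** 0.5 for x, y, z in centered)
    if radius == 0.0:  # degenerate cloud: all points coincide
        return centered
    return [(x / radius, y / radius, z / radius) for x, y, z in centered]


def toy_caption(points: List[Point]) -> str:
    """Hypothetical stand-in for a captioner: point cloud in, text out."""
    extents = [
        max(p[i] for p in points) - min(p[i] for p in points) for i in range(3)
    ]
    axis = "xyz"[extents.index(max(extents))]
    return f"an object of {len(points)} points, longest along the {axis} axis"


cloud = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 1.0, 0.0), (0.0, 1.0, 0.5)]
print(toy_caption(normalize(cloud)))
```

In a real system the rule-based function would be replaced by a point cloud encoder feeding a language model, as in the papers listed on this page; the sketch only fixes the shape of the input and the output.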

Most implemented papers

3D-LLM: Injecting the 3D World into Large Language Models

umass-foundation-model/3d-llm NeurIPS 2023

Furthermore, experiments on our held-in datasets for 3D captioning, task composition, and 3D-assisted dialogue show that our model outperforms 2D VLMs.

PointLLM: Empowering Large Language Models to Understand Point Clouds

openrobotlab/pointllm 31 Aug 2023

The unprecedented advancements in Large Language Models (LLMs) have shown a profound impact on natural language processing but are yet to fully embrace the realm of 3D understanding.

ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

qizekun/ShapeLLM 27 Feb 2024

This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM) designed for embodied interaction, exploring a universal 3D object understanding with 3D point clouds and languages.

View Selection for 3D Captioning via Diffusion Ranking

crockwell/cap3d 11 Apr 2024

Scalable annotation approaches are crucial for constructing extensive 3D-text datasets, facilitating a broader range of applications.

MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors

tangyuan96/minigpt-3d 2 May 2024

Notably, MiniGPT-3D gains an 8.12 increase on GPT-4 evaluation score for the challenging object captioning task compared to ShapeLLM-13B, while the latter costs 160 total GPU-hours on 8 A800.