M3DRef-CLIP - Detail - Visual Language Perception

Method Detail: M3DRef-CLIP

Benchmark:	VLMOD
Short name:	M3DRef-CLIP
Long name:	M3DRef-CLIP
Description:	@inproceedings{zhang2023multi3drefer, title={Multi3DRefer: Grounding Text Description to Multiple 3D Objects}, author={Zhang, Yiming and Gong, ZeMing and Chang, Angel X.}, booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision}, pages={15225--15236}, year={2023}}
Reference:	Zhang, Yiming; Gong, ZeMing; Chang, Angel X. Multi3DRefer: Grounding Text Description to Multiple 3D Objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 15225-15236, 2023.
Last submitted:	August 31, 2025
Published:	August 31, 2025 at 11:26:22
Submissions:	1
Project page / code:	N/A
Open source:	No

Submission Date	F1 (↑)	Precision (↑)	Recall (↑)	TP (↑)	FP (↓)	FN (↓)
2025-08-31 11:26	63.7000	49.8200	88.3000	55954	56364	7416
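
The percentage columns in the row above can be cross-checked against the raw counts. The sketch below assumes the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = 2·TP / (2·TP + FP + FN)); the page does not state how the benchmark matches predictions to ground truth, and the function name `detection_scores` is hypothetical, used only for illustration.

```python
def detection_scores(tp, fp, fn):
    """Return precision, recall, and F1 as percentages from raw counts.

    Assumes the standard definitions; this is not confirmed to be the
    benchmark's own evaluation code.
    """
    predicted = tp + fp          # everything the method reported
    actual = tp + fn             # everything in the ground truth
    f1_denominator = 2 * tp + fp + fn

    # Guard against empty denominators so the function never divides by zero.
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    return {
        "precision": 100.0 * precision,
        "recall": 100.0 * recall,
        "f1": 100.0 * f1,
    }


# Counts taken from the submission row of 2025-08-31.
scores = detection_scores(tp=55954, fp=56364, fn=7416)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
# precision: 49.82
# recall: 88.30
# f1: 63.70
```

Under these assumptions the recomputed values agree with the reported F1, precision, and recall to two decimal places. The counts also show that false positives (56364) slightly outnumber true positives (55954), which is why precision sits just under 50% while recall stays high.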