Qualitative analysis showcases O3D-SIM's open-set capability, identifying objects like wheelchairs and mannequins unseen by closed-set methods.

O3D-SIM Visualization: Accurately Identifying Unseen Objects and Instances

Abstract and 1 Introduction

  2. Related Works

    2.1. Vision-and-Language Navigation

    2.2. Semantic Scene Understanding and Instance Segmentation

    2.3. 3D Scene Reconstruction

  3. Methodology

    3.1. Data Collection

    3.2. Open-set Semantic Information from Images

    3.3. Creating the Open-set 3D Representation

    3.4. Language-Guided Navigation

  4. Experiments

    4.1. Quantitative Evaluation

    4.2. Qualitative Results

  5. Conclusion and Future Work, Disclosure statement, and References

4.2. Qualitative Results

To qualitatively demonstrate the effectiveness of the proposed O3D-SIM, this section visualizes the model’s performance on selected examples. Figure 5 displays these visualizations, illustrating the outcomes for two mapping sequences.

Notably, the open-set capability of O3D-SIM enables the identification of objects, such as wheelchairs, that conventional pipelines relying on closed sets or predefined datasets typically cannot detect. The figure compares the number of objects our pipeline identifies and segments, including mannequins and mobile robots, against their actual counts. This comparison highlights situations where traditional methods, constrained by a limited set of recognizable objects, fall short.

Our approach excels at recognizing instance-level semantics: it accurately identifies 5 out of 6 table instances (with one false positive), underscoring its precision across both simulated and real-world data. The clarity of the semantic map and the ease with which instance-level segmentation results can be visualized further demonstrate the robustness of our pipeline. While methods like VLMaps can identify a broader range of objects thanks to their open-set nature, and SI-Maps can detect multiple instances of the same object, O3D-SIM combines both capabilities, offering a comprehensive solution.
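As a rough illustration of what the table count implies, the sketch below computes instance-level precision and recall. It assumes one particular reading of the numbers (six real tables, five detected correctly, one missed, one spurious detection); the paper does not spell out this breakdown, and the function name `instance_precision_recall` is hypothetical rather than part of the authors' code.

```python
from fractions import Fraction


def instance_precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) as exact fractions for instance counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        raise ValueError("need at least one predicted and one actual instance")
    return Fraction(true_positives, predicted), Fraction(true_positives, actual)


# Assumed reading of the table example: 6 real tables, 5 found, 1 spurious.
precision, recall = instance_precision_recall(
    true_positives=5, false_positives=1, false_negatives=1
)
print(f"precision = {float(precision):.3f}, recall = {float(recall):.3f}")
```

Under a different reading of the counts (for example, if the false positive is one of the five reported detections), the same function applies with different arguments.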


:::info Authors:

(1) Laksh Nanwani, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(2) Kumaraditya Gupta, International Institute of Information Technology, Hyderabad, India;

(3) Aditya Mathur, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(4) Swayam Agrawal, International Institute of Information Technology, Hyderabad, India;

(5) A.H. Abdul Hafez, Hasan Kalyoncu University, Sahinbey, Gaziantep, Turkey;

(6) K. Madhava Krishna, International Institute of Information Technology, Hyderabad, India.

:::


:::info This paper is available on arXiv under the CC BY-SA 4.0 Deed (Attribution-ShareAlike 4.0 International) license.

:::
