The Language-Guided Navigation module leverages an LLM (like ChatGPT) and the open-set O3D-SIM.

VLN: LLM and CLIP for Instance-Specific Navigation on 3D Maps

Abstract and 1 Introduction

  2. Related Works

    2.1. Vision-and-Language Navigation

    2.2. Semantic Scene Understanding and Instance Segmentation

    2.3. 3D Scene Reconstruction

  3. Methodology

    3.1. Data Collection

    3.2. Open-set Semantic Information from Images

    3.3. Creating the Open-set 3D Representation

    3.4. Language-Guided Navigation

  4. Experiments

    4.1. Quantitative Evaluation

    4.2. Qualitative Results

  5. Conclusion and Future Work, Disclosure statement, and References

3.4. Language-Guided Navigation

In this section, we leverage the LLM-based approach from [1], which uses ChatGPT [35] to understand language commands and map them to pre-defined function primitives that the robot can understand and execute. However, our current approach differs from the one in [1] in two ways: how the LLM is used, and how the function primitives are implemented. The previous approach relied on the LLM to provide open-set understanding by mapping general queries to the already-known closed-set class labels obtained via Mask2Former [7].

However, because our new representation, O3D-SIM, is itself open-set, the LLM no longer needs to perform this mapping. Figure 4 shows how the code output differs between the two approaches. The function primitives work similarly to those of the older approach, requiring the desired object type and its instance as inputs. But the desired object is now specified by a short query describing it rather than by a label from a pre-defined set of classes, so the implementation that finds the desired location changes. We use the text-and-image-aligned nature of CLIP embeddings to find the desired object: the input description is passed to the model, and the resulting embedding is used to find the object in O3D-SIM.
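
As a rough illustration of this interface, the sketch below parses LLM-generated primitive calls into a free-text query plus an instance index. It is not the authors' implementation: the primitive name `go_to_object`, the `PrimitiveCall` type, the 0-based instance index, and the idea of validating the generated code with Python's `ast` module instead of executing it are all assumptions made for this example.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimitiveCall:
    """One parsed call to a navigation primitive (hypothetical type)."""
    name: str      # primitive name, e.g. "go_to_object" (assumed)
    query: str     # free-text description of the object (open-set)
    instance: int  # which matching instance to visit; 0-based (assumed)


# Hypothetical whitelist; the actual primitive set comes from [1].
ALLOWED_PRIMITIVES = {"go_to_object"}


def parse_llm_output(code: str) -> list[PrimitiveCall]:
    """Parse lines such as go_to_object("blue sofa", 1).

    Only whitelisted calls with literal (str, int) arguments are accepted,
    so the generated text is inspected but never executed.
    """
    calls = []
    for stmt in ast.parse(code).body:
        if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
            raise ValueError("only bare function calls are allowed")
        call = stmt.value
        if not isinstance(call.func, ast.Name) or call.func.id not in ALLOWED_PRIMITIVES:
            raise ValueError("unknown primitive")
        if call.keywords or len(call.args) != 2:
            raise ValueError("expected exactly two positional arguments")
        # literal_eval raises ValueError for anything that is not a literal.
        query, instance = (ast.literal_eval(arg) for arg in call.args)
        if not isinstance(query, str) or type(instance) is not int:
            raise ValueError("expected (str, int) arguments")
        calls.append(PrimitiveCall(call.func.id, query, instance))
    return calls


# Example LLM output for "go to the second wooden chair, then the plant".
llm_output = (
    'go_to_object("wooden chair next to the window", 1)\n'
    'go_to_object("potted plant", 0)'
)
parsed = parse_llm_output(llm_output)
```

The point of the example is the argument shape: in a closed-set setting the first argument would have to be one of the known class labels, whereas here it can be any short description, which is then resolved against O3D-SIM.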

A cosine similarity is calculated between the embedding of the description and every instance embedding in our representation. The instances are ranked in decreasing order of similarity, and the desired instance is selected from this ranking. Once the instance is finalized, a goal corresponding to it is generated and passed to the navigation stack, which navigates the robot there autonomously, thereby achieving Language-Guided Navigation.
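
The ranking step can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code: the embeddings are small toy vectors standing in for CLIP embeddings, the requested instance is taken to be a 0-based position in the ranked list, and the goal is taken to be the selected instance's centroid, none of which the text specifies. The names `rank_instances` and `select_goal` are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("zero-length embedding")
    return dot / norm


def rank_instances(query_emb: Vector, instance_embs: Sequence[Vector]) -> list[int]:
    """Return instance indices sorted by decreasing similarity to the query."""
    sims = [cosine_similarity(query_emb, emb) for emb in instance_embs]
    return sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)


def select_goal(query_emb, instance_embs, centroids, instance=0):
    """Pick the `instance`-th best match; return its index and goal point.

    Using the centroid as the goal is an assumption made for this sketch.
    """
    order = rank_instances(query_emb, instance_embs)
    if not 0 <= instance < len(order):
        raise IndexError("requested instance is out of range")
    idx = order[instance]
    return idx, centroids[idx]


# Toy stand-ins for the per-instance embeddings and 3D centroids of O3D-SIM.
instance_embs = [(1.0, 0.0, 0.0), (0.6, 0.8, 0.0), (0.0, 0.0, 1.0)]
centroids = [(2.0, 1.0, 0.0), (4.5, 0.5, 0.0), (-1.0, 3.0, 0.0)]
query_emb = (0.9, 0.1, 0.0)  # stand-in for the embedded text description

best_idx, goal = select_goal(query_emb, instance_embs, centroids, instance=0)
```

In a real system the query vector would come from the CLIP text encoder and the returned point would be handed to the navigation stack as the target pose.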


:::info Authors:

(1) Laksh Nanwani, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(2) Kumaraditya Gupta, International Institute of Information Technology, Hyderabad, India;

(3) Aditya Mathur, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(4) Swayam Agrawal, International Institute of Information Technology, Hyderabad, India;

(5) A.H. Abdul Hafez, Hasan Kalyoncu University, Sahinbey, Gaziantep, Turkey;

(6) K. Madhava Krishna, International Institute of Information Technology, Hyderabad, India.

:::


:::info This paper is available on arXiv under the CC BY-SA 4.0 Deed (Attribution-ShareAlike 4.0 International) license.

:::
