GeoLLM: Comparative Study and Enhancement of Large Language Models for Geospatial Knowledge Using Retrieval Augmented Generation (RAG)
Serah Akojenu, Olubayo Adekanmbi, Anthony Soronnadi
Proceedings of the Second IJCAI AI for Good Symposium in Africa
hosted by Deep Learning Indaba
Pages 15-18.
https://doi.org/10.24963/ijcai.aai4g.2024/2
In recent years, large language models (LLMs) have demonstrated exceptional ability in natural language processing tasks, yet their potential to enhance geospatial intelligence remains largely untapped. Geospatial data, which appears not only in maps but also in ordinary text, presents unique opportunities for the application of LLMs. This research investigates whether LLMs can serve as accurate sources of geospatial information.
To achieve this, we assessed the geospatial capability of six LLMs: Mistral, GPT-3.5 and GPT-4o, LLaMA 2 and 3, and Gemini, using 30 curated geospatial prompts grouped into three categories: geographic coordinates, basic spatial calculations, and descriptions of places. These prompts were also translated into Yoruba to ensure contextual relevance. Subsequently, we employed retrieval-augmented generation (RAG) to enhance the models' geospatial knowledge base. Our findings reveal that GPT-4o and Mistral consistently outperformed the other models across all categories and languages, while RAG substantially improved the LLMs' geospatial knowledge. This study demonstrates the potential of LLMs to serve as accurate sources of geospatial information when combined with RAG.
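To illustrate the RAG enhancement step the abstract describes, the sketch below shows a minimal retrieval-then-augment loop for a geospatial question. Everything here is an assumption for illustration, not the paper's implementation: the tiny in-memory fact list stands in for a real geospatial knowledge base, and word-overlap scoring stands in for embedding similarity; the augmented prompt would then be sent to an LLM.

```python
# Minimal RAG sketch (illustrative only): retrieve the most relevant
# geospatial facts for a query, then prepend them to the prompt.

# Toy stand-in for a geospatial document store (hypothetical data).
GEO_FACTS = [
    "Lagos is a coastal city in southwestern Nigeria.",
    "Abuja, the capital of Nigeria, lies at roughly 9.08 N, 7.40 E.",
    "The Niger River is the principal river of West Africa.",
]

def _tokens(text: str) -> set[str]:
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(".,?!") for w in text.lower().split()}

def retrieve(query: str, facts: list[str], k: int = 1) -> list[str]:
    """Return the k facts sharing the most words with the query.

    A real system would use embedding similarity over a vector store;
    simple word overlap keeps this sketch self-contained.
    """
    q = _tokens(query)
    ranked = sorted(facts, key=lambda f: len(q & _tokens(f)), reverse=True)
    return ranked[:k]

def augment_prompt(query: str) -> str:
    """Prepend retrieved context to the user's geospatial question."""
    context = "\n".join(retrieve(query, GEO_FACTS))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(augment_prompt("What are the coordinates of the capital of Nigeria?"))
```

With the toy fact list above, the query about Nigeria's capital retrieves the Abuja entry, so the model answers from supplied context rather than relying solely on its parametric knowledge.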
