Visual Localisation Using Deep Learning and Graph Neural Networks: Approaches and Evaluation
DOI: https://doi.org/10.31224/5143
Keywords: Visual Localisation, Feature Engineering, Deep Learning, Graph Neural Network, Super Glue, CNN, autoencoder, 3D Construction
Abstract
This paper addresses the visual localization problem: estimating camera position and orientation from images of a known scene. Traditional localization methods that rely on local feature matching often generalize poorly to new scenarios. In contrast, this study explores state-of-the-art techniques, including deep learning models and graph neural networks, to improve feature extraction and matching. We implemented five models: SIFT, a CNN-based baseline, Hierarchical Localisation, an ImageSimilarity-Autoencoder, and the SuperGlue feature matching model. Evaluated on a Kaggle competition dataset of images from the Getty Center in Los Angeles, the SuperGlue model significantly outperformed the others, achieving a mean absolute error (MAE) of 6.37266. The findings suggest that advanced architectures and attention mechanisms can substantially improve visual localization performance, even under challenging conditions. This research highlights the potential of integrating deep learning and graph neural networks in practical localization tasks.
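The MAE reported above can be illustrated with a minimal sketch. The abstract does not state which quantities the error is averaged over, so this example assumes it is the mean of absolute differences between predicted and ground-truth camera coordinates; the function name and the sample values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mean_absolute_error(
    predicted: Sequence[Sequence[float]],
    actual: Sequence[Sequence[float]],
) -> float:
    """Average absolute difference over every coordinate of every sample.

    Assumption: each sample is a tuple of camera coordinates (for example
    x and y); the paper may define its MAE differently.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    count = 0
    for pred_row, true_row in zip(predicted, actual):
        if len(pred_row) != len(true_row):
            raise ValueError("each sample must have matching dimensions")
        for p, t in zip(pred_row, true_row):
            total += abs(p - t)
            count += 1
    return total / count


# Hypothetical predictions and ground truth for two images.
predicted_positions = [(1.0, 2.0), (4.0, 6.0)]
true_positions = [(0.0, 2.0), (4.0, 4.0)]
mae = mean_absolute_error(predicted_positions, true_positions)
print(mae)  # absolute errors are 1, 0, 0, 2, so the mean is 0.75
```

Under this definition a lower value means predicted positions lie closer to the ground truth, which is the sense in which the five models are compared.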
License
Copyright (c) 2025 Dinesh Kumar Koilada

This work is licensed under a Creative Commons Attribution 4.0 International License.