Meta-Learning Enhanced Sketch-Based Image Retrieval for Improved Generalization across Diverse Domains
Abstract
Introduction: Sketch-Based Image Retrieval (SBIR) is a task that involves retrieving natural images using freehand sketches as queries. This task presents significant challenges due to the substantial visual domain gap between abstract sketches and detailed photos, as well as the high variability among different sketches of the same object. Although deep learning techniques have advanced SBIR performance, they often rely heavily on large amounts of category-specific paired data and show limited ability to generalise to unseen categories.
Objectives: This study aims to develop a robust and generalisable SBIR framework that performs effectively in low-data regimes, such as few-shot and zero-shot scenarios, where traditional deep learning models often fail.
Methods: We propose Meta-SBIR, a novel meta-learning-based SBIR framework. The approach focuses on learning either a model initialisation that quickly adapts to new tasks or a generalisable metric space that performs well across various SBIR tasks. By leveraging meta-learning principles, our method is trained to transfer knowledge across different sketch-photo retrieval scenarios, thereby improving its ability to handle novel and diverse categories with minimal data.
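To make the metric-space variant concrete, the following is a minimal sketch of one training episode in the style of prototypical networks. It is an illustrative assumption, not the Meta-SBIR implementation: the abstract does not specify the encoder, the distance, or the loss. The function names `class_prototypes`, `squared_distance`, and `episode_loss` are hypothetical, and the embeddings are assumed to come from some sketch and photo encoder that is not shown.

```python
import math
from collections import defaultdict


def class_prototypes(photo_embeddings, photo_labels):
    """Average the support photo embeddings of each category into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(photo_embeddings, photo_labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def episode_loss(prototypes, sketch_embeddings, sketch_labels):
    """Mean cross-entropy of query sketches against the class prototypes.

    Logits are negative squared distances (an assumed choice), so a sketch
    that lies close to its own category's prototype incurs a low loss.
    """
    total = 0.0
    for vec, label in zip(sketch_embeddings, sketch_labels):
        logits = {c: -squared_distance(vec, p) for c, p in prototypes.items()}
        top = max(logits.values())  # subtract the max for numerical stability
        log_norm = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        total += log_norm - logits[label]
    return total / len(sketch_embeddings)


# Toy 2-way, 2-shot episode with 2-dimensional embeddings.
protos = class_prototypes(
    [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]],
    ["cat", "cat", "dog", "dog"],
)
loss = episode_loss(protos, [[0.0, 1.0], [10.0, 11.0]], ["cat", "dog"])
```

In a meta-learning setup of this kind, many such episodes would be sampled over different category subsets, and the encoder would be updated to reduce the episode loss, so that the learned metric space transfers to categories not seen during training.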
Results: We evaluate the proposed Meta-SBIR framework on two challenging benchmark datasets: the fine-grained Sketchy dataset and the more abstract TU-Berlin dataset, along with a potential generic 'Abstract' category. Experimental results show that Meta-SBIR significantly outperforms traditional deep learning baselines and conventional fine-tuning strategies. In particular, it achieves higher retrieval accuracy, as measured by mean Average Precision (mAP) and Precision@k, with the largest gains in few-shot settings.
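For reference, the two reported metrics can be computed as in the sketch below. It assumes each query sketch yields a ranked list of binary relevance flags (1 if the retrieved photo shares the sketch's category, 0 otherwise) and that the list covers the whole gallery, so every relevant photo appears in it. The function names are illustrative and do not come from the paper.

```python
def precision_at_k(ranked_relevance, k):
    """Fraction of the top-k retrieved photos that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_relevance[:k]) / k


def average_precision(ranked_relevance):
    """Average of the precision values at each rank where a relevant photo appears."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    # A query with no relevant photos in the gallery contributes 0 by convention.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_rankings):
    """Mean of the per-query average precision over all query sketches."""
    return sum(average_precision(r) for r in all_rankings) / len(all_rankings)


# Two toy queries: relevance flags in ranked order.
rankings = [[1, 0, 1, 0], [0, 1]]
map_score = mean_average_precision(rankings)
p_at_2 = precision_at_k(rankings[0], 2)
```

For the first toy query, relevant photos appear at ranks 1 and 3, so its average precision is (1/1 + 2/3) / 2; the second query has its only relevant photo at rank 2, giving 1/2.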
Conclusions: The Meta-SBIR framework effectively addresses the limitations of existing SBIR methods by enhancing generalisation and retrieval performance in low-data regimes. This approach shows strong potential for real-world applications where sketch queries vary greatly and annotated data is scarce.