Hybrid Semantic Retrieval: Augmenting Weighted TF-IDF with BERT for Enhanced Question Answering
DOI:
https://doi.org/10.31224/5341
Keywords:
Hybrid Semantic Retrieval, Weighted TF-IDF, BERT Embeddings, Question Answering, Information Retrieval, Semantic Search, NLP
Abstract
This paper introduces a semantic search approach that improves the precision and relevance of information retrieval, particularly in question-answering systems. The approach combines a weighted TF-IDF scheme with the contextual representations of the BERT language model. The weighted TF-IDF scheme emphasizes "questionable spans" in documents, while BERT captures semantic meaning that purely lexical methods miss. Experiments on question-answering datasets show that this hybrid strategy substantially outperforms existing semantic search techniques. The proposed model is designed to scale efficiently to large datasets, supporting the development of performant, semantically aware search engines for complex information landscapes.
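The abstract does not specify how the lexical and semantic signals are combined, so the following is only a minimal sketch of one plausible reading, not the author's method. It assumes a linear interpolation between a weighted TF-IDF cosine score and an embedding cosine score. The names `span_weights`, `alpha`, `hybrid_score`, and `rank` are hypothetical, and the two-dimensional vectors are toy stand-ins for BERT embeddings, which in practice would come from a pretrained model.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a real system would use a proper tokenizer."""
    return text.lower().split()


def weighted_tfidf(docs, span_weights):
    """Build sparse TF-IDF vectors, scaling selected terms by a per-term weight.

    `span_weights` is a hypothetical stand-in for the paper's emphasis on
    "questionable spans"; terms not listed keep a weight of 1.0.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF, so every term in the corpus gets a positive value.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        size = max(len(doc), 1)
        vectors.append(
            {t: (f / size) * idf[t] * span_weights.get(t, 1.0) for t, f in tf.items()}
        )
    return vectors, idf


def cosine_sparse(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_dense(u, v):
    """Cosine similarity between two dense vectors stored as lists."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hybrid_score(lexical, semantic, alpha=0.5):
    """Assumed fusion rule: linear interpolation of the two similarity scores."""
    return alpha * lexical + (1.0 - alpha) * semantic


def rank(query, docs, query_emb, doc_embs, span_weights, alpha=0.5):
    """Return document indices sorted from best to worst hybrid score."""
    doc_vecs, idf = weighted_tfidf([tokenize(d) for d in docs], span_weights)
    q_tokens = tokenize(query)
    # Query terms unseen in the corpus carry no lexical signal and are dropped.
    q_vec = {
        t: (f / len(q_tokens)) * idf[t]
        for t, f in Counter(q_tokens).items()
        if t in idf
    }
    scores = [
        hybrid_score(cosine_sparse(q_vec, dv), cosine_dense(query_emb, de), alpha)
        for dv, de in zip(doc_vecs, doc_embs)
    ]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


docs = [
    "the capital of france is paris",
    "paris hilton is a celebrity",
    "berlin is the capital of germany",
]
# Toy 2-d vectors standing in for BERT sentence embeddings (assumption).
doc_embs = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.3]]
query_emb = [1.0, 0.0]
# Hypothetical weights that boost answer-bearing terms.
span_weights = {"france": 2.0, "paris": 2.0, "germany": 2.0, "berlin": 2.0}

ranking = rank("what is the capital of france", docs, query_emb, doc_embs, span_weights)
print(ranking)
```

In this sketch, `alpha` controls the balance between the two signals: `alpha=1.0` reduces to purely lexical ranking and `alpha=0.0` to purely embedding-based ranking.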
License
Copyright (c) 2025 Dinesh Kumar Koilada

This work is licensed under a Creative Commons Attribution 4.0 International License.