Preprint / Version 1

Hybrid Semantic Retrieval: Augmenting Weighted TF-IDF with BERT for Enhanced Question Answering

Authors

  • Dinesh Kumar Koilada, JNTU Hyderabad

DOI:

https://doi.org/10.31224/5341

Keywords:

Hybrid Semantic Retrieval, Weighted TF-IDF, BERT Embeddings, Question Answering, Information Retrieval, Semantic Search, NLP

Abstract

This paper presents a hybrid semantic search approach that improves the precision and relevance of information retrieval, particularly in question-answering systems. The approach combines a weighted TF-IDF scheme with the contextual representations of the BERT language model. The weighted TF-IDF component emphasizes "questionable spans" in documents, while BERT captures semantic meaning that purely lexical matching misses, so the combined model bridges the gap left by traditional lexical methods. Experiments on question-answering datasets show that this hybrid strategy substantially outperforms existing semantic search techniques. The model is designed to scale efficiently to large datasets, supporting the development of semantically aware search engines for complex document collections.
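The abstract does not specify how the two signals are combined, so the following is only a minimal sketch of one common way to fuse a weighted lexical score with a dense semantic score. Everything here is an assumption for illustration: the function names, the per-token `span_weights` boosts (a stand-in for emphasizing "questionable spans"), the linear interpolation parameter `alpha`, and the use of precomputed embedding vectors in place of actual BERT outputs. It should not be read as the paper's method.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper tokenizer.
    return text.lower().split()


def weighted_tfidf_vectors(docs, span_weights=None):
    """Build sparse TF-IDF vectors, optionally boosting selected tokens.

    docs: list of token lists.
    span_weights: hypothetical list of {token: multiplier} dicts, one per
    document, standing in for emphasis on "questionable spans".
    """
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for i, doc in enumerate(docs):
        tf = Counter(doc)
        boosts = span_weights[i] if span_weights else {}
        vectors.append({t: tf[t] * idf[t] * boosts.get(t, 1.0) for t in tf})
    return vectors, idf


def sparse_cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def dense_cosine(u, v):
    # Cosine similarity between two dense vectors (lists of floats).
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def hybrid_scores(query_tokens, query_emb, docs, doc_embs,
                  span_weights=None, alpha=0.5):
    """Score each document as alpha * lexical + (1 - alpha) * semantic.

    query_emb and doc_embs are assumed to be precomputed sentence
    embeddings (e.g. from a BERT encoder); none are computed here.
    """
    vectors, idf = weighted_tfidf_vectors(docs, span_weights)
    q_tf = Counter(query_tokens)
    q_vec = {t: q_tf[t] * idf[t] for t in q_tf if t in idf}
    return [
        alpha * sparse_cosine(q_vec, d_vec)
        + (1 - alpha) * dense_cosine(query_emb, d_emb)
        for d_vec, d_emb in zip(vectors, doc_embs)
    ]
```

In this sketch, `alpha=1.0` reduces to purely lexical ranking and `alpha=0.0` to purely embedding-based ranking; how the paper balances the two components is not stated in the abstract.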

Posted

2025-09-10