Interpreting BERT Using LIME and SHAP
DOI: https://doi.org/10.31224/5078

Keywords:
Artificial Intelligence, BERT, Explainable AI, LIME (Local Interpretable Model-Agnostic Explanations), SHAP (Shapley Additive Explanations), Interpretability, Classification

Abstract
Transformer-based language models such as BERT have achieved state-of-the-art performance on diverse natural language processing tasks, yet their decision processes remain opaque. This paper presents a comprehensive framework for interpreting BERT’s predictions in multi-label text classification using two leading model-agnostic explainability techniques—Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). An end-to-end pipeline for fine-tuning BERT and producing token-level attributions is introduced. We systematically compare the explainers with respect to local fidelity, global consistency, stability and computational cost. Experimental results suggest that LIME generates intuitive, case-specific explanations while SHAP provides theoretically grounded and globally consistent attributions. By integrating the complementary strengths of both methods, we propose a hybrid interpretation strategy that balances interpretability, scalability and accuracy. The methodology is illustrated through a case study on multi-label genre classification from movie plot summaries. Detailed guidelines and synthetic visualisations are provided to enable practitioners to apply these techniques effectively and responsibly.
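The two attribution methods named in the abstract can be illustrated in miniature. The sketch below is a hypothetical, self-contained toy: a keyword-based scoring function stands in for a fine-tuned BERT classifier (any probability function over token subsets plugs in the same way), a LIME-style weighted linear surrogate is fit over random token masks, and exact Shapley values are enumerated, which is feasible only because the example has four tokens. All names and the kernel width are illustrative assumptions, not the paper's implementation.

```python
import itertools
import math
import numpy as np

# Toy stand-in for a fine-tuned genre classifier (hypothetical):
# returns P(horror) from two keyword cues via a sigmoid.
def predict_proba(tokens):
    score = 2.0 * ("vampire" in tokens) + 0.5 * ("castle" in tokens)
    return 1.0 / (1.0 + np.exp(-(score - 1.0)))

tokens = ["a", "vampire", "haunts", "castle"]

# --- LIME-style local surrogate ---------------------------------------
# Sample binary masks over tokens, query the model on each perturbed
# input, and fit a distance-weighted linear model; the coefficients are
# the token attributions for this one instance.
rng = np.random.default_rng(0)
n_samples = 500
masks = rng.integers(0, 2, size=(n_samples, len(tokens)))
masks[0] = 1  # keep the unperturbed instance in the sample
y = np.array([predict_proba([t for t, m in zip(tokens, row) if m])
              for row in masks])
# Exponential kernel: perturbations closer to the full instance weigh more.
sim = masks.sum(axis=1) / len(tokens)
w = np.exp(-((1.0 - sim) ** 2) / 0.25)
X = np.hstack([masks, np.ones((n_samples, 1))])  # intercept column
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
lime_attr = dict(zip(tokens, coef[:-1]))

# --- Exact Shapley values ---------------------------------------------
# Average each token's marginal contribution over all coalitions of the
# remaining tokens (2^(n-1) subsets per token -- toy-sized inputs only).
def shapley_values(tokens, f):
    n = len(tokens)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for S in itertools.combinations(others, r):
                weight = (math.factorial(len(S))
                          * math.factorial(n - len(S) - 1)
                          / math.factorial(n))
                with_i = f([tokens[j] for j in sorted(S + (i,))])
                without = f([tokens[j] for j in S])
                phi[i] += weight * (with_i - without)
    return phi

shap_attr = shapley_values(tokens, predict_proba)
for t, l, s in zip(tokens, coef[:-1], shap_attr):
    print(f"{t:>8}  LIME={l:+.3f}  Shapley={s:+.3f}")
```

Both methods agree that "vampire" dominates the prediction, and the Shapley values additionally satisfy the efficiency property: they sum exactly to the gap between the model's output on the full instance and on the empty input, which is the theoretical-consistency advantage the abstract attributes to SHAP.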
License
Copyright (c) 2025 Manish Shukla

This work is licensed under a Creative Commons Attribution 4.0 International License.