Preprint / Version 1

Explainable Multimodal Deep Learning Framework for Dental Disease Diagnosis

DOI:

https://doi.org/10.31224/6984

Keywords:

Deep Learning, Multi-modal Learning, Explainable AI, ResNet, BERT, Attention Mechanism, Dental Disease Diagnosis

Abstract

Early and accurate diagnosis of dental diseases is essential for preventing disease progression and improving patient outcomes. This paper proposes an explainable multimodal deep learning framework that integrates intraoral RGB images and patient-reported symptom descriptions for automated dental disease diagnosis. The framework combines a convolutional neural network (ResNet) for visual feature extraction and a transformer-based model (BERT) for contextual understanding of symptoms. A cross-modal attention-based fusion mechanism is employed to effectively integrate image and text representations, enabling more robust and reliable predictions.

To enhance clinical interpretability, the system incorporates Grad-CAM for visual explanations and attention-based textual attribution for symptom-level reasoning. Experimental results demonstrate that the proposed multimodal model achieves an accuracy of 97%, outperforming both image-only and text-only approaches. Overall, the proposed framework provides a scalable, low-cost, and explainable solution for clinical decision support and early dental disease screening.
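The abstract describes a cross-modal attention step in which one modality's representation attends over the other's before fusion. The paper's implementation is not published here, so the following is only a minimal pure-Python sketch of a plausible scaled dot-product cross-attention: a single image embedding (the query) attends over text token embeddings (keys/values) to produce a fused vector. All names, dimensions, and values are illustrative assumptions, not the authors' code.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_modal_attention(query, keys, values):
    """Scaled dot-product attention: the image feature (query)
    attends over text token features (keys/values)."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Fused vector: attention-weighted sum of the value vectors.
    fused = [sum(w * v[i] for w, v in zip(weights, values))
             for i in range(len(values[0]))]
    return fused, weights

# Toy example: a 4-d image embedding attends over three 4-d token embeddings.
img = [0.5, -0.2, 0.1, 0.9]
tokens = [[0.4, 0.0, 0.2, 0.8],
          [-0.3, 0.7, 0.1, 0.0],
          [0.1, 0.1, 0.1, 0.1]]
fused, weights = cross_modal_attention(img, tokens, tokens)
```

In a full model this step would operate on batched ResNet and BERT feature maps with learned query/key/value projections; the sketch keeps only the attention arithmetic itself.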

Author Biography

Jayani Malsha Katugampala Kankanamalage, Student

Undergraduate Student, Department of Computer Science, Informatics Institute of Technology (IIT), affiliated with the University of Westminster

Posted

2026-05-04