Developing an Artificial Intelligence Software to Aid in Dental Caries Detection

Tabatabaei Tabrizi, Alireza; Ghasemi, Hadi; Panahandeh, Narges

Journal of Research in Dental and Maxillofacial Sciences

Volume 11, Issue 1 (3-2026) J Res Dent Maxillofac Sci 2026, 11(1): 66-74

Ethics code: IR.SBMU.DRC.REC.1402.085

Tabatabaei Tabrizi A, Ghasemi H, Panahandeh N. Developing an Artificial Intelligence Software to Aid in Dental Caries Detection. J Res Dent Maxillofac Sci 2026; 11 (1) :66-74
URL: http://jrdms.dentaliau.ac.ir/article-1-907-en.html

Alireza Tabatabaei Tabrizi¹, Hadi Ghasemi², Narges Panahandeh*³

1- School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
2- Department of Community Oral Health, School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran
3- Dental Research Center, Research Institute of Dental Sciences, School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran , nargespanahandeh@yahoo.com

Keywords: Dental Caries, Deep Learning, Convolutional Neural Networks, Machine Learning

Abstract

Background and Aim: This study aimed to develop an artificial intelligence (AI) software with high diagnostic accuracy for detection of dental caries by evaluating different algorithms and machine learning models.
Materials and Methods: A total of 1400 bitewing radiographs were retrieved from the archives of the Oral Radiology Department of Shahid Beheshti Dental School from April 2023 to March 2024 for machine learning using deep convolutional neural networks (CNNs). After all images were resized to 100 x 100 pixels and represented as arrays of pixel values, 1120 radiographs (80%) were randomly selected for training, and 280 radiographs (20%) were used for testing of the selected models. Two learning methods were employed, namely supervised learning (labeling by specialists) by classifying the bitewing radiographs into two classes (0: sound, 1: carious), and self-supervised learning by using VGG19, ResNet50, GoogleNet, and EfficientNet models. After implementation of the caries detection algorithms, the best model in terms of accuracy, sensitivity, and specificity was selected.
Results: In addition to reviewing and testing popular machine learning models for caries detection on a total of 1400 bitewing radiographs of patients, 242 bitewing radiographs were used for supervised learning after labeling, of which 210 were labeled as carious. The EfficientNet model yielded the best results, with a caries detection accuracy of 94.7%, precision of 93.4%, sensitivity of 96.7%, specificity of 96.2%, and F1-score of 93.5%.
Conclusion: The present results suggest that refining EfficientNet and its new versions can improve its performance in caries detection.

Introduction

Community oral health relates to the third of the 17 United Nations Sustainable Development Goals for 2030, which is to “ensure healthy lives and promote well-being for all at all ages”, and is therefore highly important [1]. Thus, the World Health Organization’s global strategy for oral health calls for prioritizing oral health care, implementing minimally invasive procedures, and establishing mobile oral health services [2]. The high prevalence of dental caries is a major public health concern worldwide, particularly in developing countries [3]. Dental caries is among the most common chronic infectious diseases worldwide. Many diagnostic tests have been developed for detection of dental caries, enabling faster intervention [4]. Dental clinicians usually need to perform several diagnostic tests for precise and efficient treatment planning. Visual-tactile examination is usually performed as the initial diagnostic test, which is later supplemented and confirmed with additional tools. Dental radiography is often the first additional diagnostic modality requested after clinical examination [5]. Newer diagnostic tools, such as laser fluorescence and light fluorescence, can provide more accurate information about carious lesions. Light fluorescence can detect the location and extent of lesions and bacterial activity [6]. Laser fluorescence diagnostic tools can detect demineralization and remineralization processes, and may be employed for detection of incipient carious lesions or assessment of the efficacy of remineralizing treatments. Such tools provide a numerical value and are therefore considered a quantitative test [7].
In addition to the abovementioned caries detection tools, artificial intelligence (AI) software programs with extensive availability, irrespective of time and location, are currently available for caries detection. In general, data science and AI have the potential to reinforce the next generation of oral and dental care services and serve as a meticulous, fast, and powerful assistant to the dental care providers. Several commercial products have been introduced for this purpose, such as Pearl, which is a novel tool for interpretation of dental radiographs with the help of AI, Overjet, as an innovative tool for caries detection, and Denti.AI for changing the diagnostic workup and management of dental caries [8].
If not properly diagnosed, dental caries can further invade the enamel, dentin, and pulp tissue and cause irreversible consequences. The diagnosis of dental caries is a debated topic. Although researchers have long been in search of tools with sufficient sensitivity and specificity for this purpose, evidence shows that the currently available, commonly used tools cannot detect caries in all tooth surfaces by themselves. Although the commonly used radiographic modalities have moderate sensitivity for caries detection, they are still the most frequently requested modality for this purpose [9,10]. The standard method of caries detection involves the use of a dental probe and inspection with the naked eye. However, naked eye inspection is only suitable for detection of extensive and visible carious lesions.
Evidence shows that AI and machine learning algorithms have been able to significantly decrease errors and maximize accuracy in medical diagnoses. The superiority of machine vision in deep learning with a 152-layer artificial neural network to human vision in ImageNet Large Scale Visual Recognition Challenge in 2015 and also in recent years points to the applications of machine learning and its growing popularity in the field of medicine [11-13].
The deep artificial neural network (ANN) technique is one of the new and widely used technologies for machine learning. Its dataset (images), algorithm, and kernel function produce valuable results in various applications. Considering all the above, this study aimed to develop an AI software to aid in dental caries detection with high accuracy, sensitivity, and specificity by assessing different models for implementation in Shahid Beheshti Dental School.

Materials and Methods

This study was approved by the Ethics Committee of Shahid Beheshti University of Medical Sciences (IR.SBMU.DRC.REC.1402.085). It also observed the principles of data security. A total of 1400 bitewing radiographs of patients retrieved from the archives of the Oral Radiology Department of Shahid Beheshti Dental School from April 2023 to November 2024 were used anonymously for this study. The patients were between 10 and 80 years of age; 48% were males and 52% were females. The 2D bitewing radiographs were collected from the Picture Archiving and Communication System of the dental school and had been taken with 68 kV tube potential, 14 mA tube current, and an 8-second time. The type of tooth (incisors, molars, premolars) was not taken into account. Training and classification were performed only based on the carious or sound status of the teeth according to the bitewing radiographs.
Although supervised and self-supervised learning are usually used for the same tasks and both require a ground truth to optimize performance through a loss function, self-supervised models are trained with non-labeled data.
In the present study, due to the shortage of the labeled data for supervised learning, and high costs of annotation (marking of carious areas on the radiographs) by specialists in the university setting, and also since supervised learning models may use a shortcut to learn the designed task, both supervised and self-supervised learning techniques were used [14-16]. The software was a diagnostic application that reported sensitivity, specificity, precision or positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1-score for its testing [17]:
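Assuming the standard definitions based on the confusion-matrix counts (TP, FP, TN, FN), these indices can be written as follows. This is a reference sketch of the usual formulas, not a reproduction of the authors' exact notation:

```latex
\begin{align}
\text{Sensitivity} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{Precision (PPV)} &= \frac{TP}{TP + FP} \\
\text{NPV} &= \frac{TN}{TN + FN} \\
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{F1-score} &= \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}
\end{align}
```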

[18,19]
Datasets:
In this study, 1400 dental records of patients, including 1400 bitewing radiographs (1200 for training and 200 for testing), were used. This number was the minimum possible and available data for training the caries detection algorithm. Preprocessing was performed before training to decrease the training time, improve efficiency, and prevent overfitting. Preprocessing of the dataset included resizing, normalization, and data standardization by image rotation [20].
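A random division of the radiographs into training and test sets can be sketched as below. This is a minimal illustration using only the Python standard library; the function name `split_dataset`, the file identifiers, and the fixed seed are hypothetical and are not taken from the study software.

```python
import random


def split_dataset(image_ids, n_test, seed=0):
    """Randomly split image identifiers into disjoint training and test lists."""
    if not 0 < n_test < len(image_ids):
        raise ValueError("n_test must be between 1 and len(image_ids) - 1")
    shuffled = list(image_ids)             # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical identifiers standing in for the 1400 radiographs.
image_ids = [f"bitewing_{i:04d}" for i in range(1400)]
train_ids, test_ids = split_dataset(image_ids, n_test=200)
print(len(train_ids), len(test_ids))  # 1200 200
```

Fixing the seed keeps the test set identical across the compared models, so that differences in their scores are not caused by different test images.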
Data labeling:
For training, 265 bitewing radiographs showing proximal caries involving enamel and dentin, and cervical caries were labeled according to the consensus opinion of three specialists from the operative dentistry, oral radiology, and community dentistry departments with a minimum of 10 years of clinical and academic experience. Cases of disagreement were excluded. As a result, 242 bitewing radiographs were used for supervised learning, of which 210 were labeled as carious.
Image preprocessing and enhancement:
Preprocessing of the images (data) included image reconstruction, noise reduction, motion artifact correction, standardization of the size unit, dimension, and orientation of images (by selecting images with the same orientation), resampling or removal of images, physical sizing (padding, cropping), resolution adjustment, standardization of pixel coordinates, and final resizing to 100 x 100 pixels with 3 channels, with the mean RGB and grayscale values subtracted [21].
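The final resizing and mean-subtraction steps can be illustrated with a small, dependency-free sketch. The interpolation method and the way the mean is computed are not stated in the text, so nearest-neighbour sampling and a per-image mean are assumptions here, and all function names are hypothetical.

```python
def resize_nearest(image, out_h=100, out_w=100):
    """Resize a 2D grayscale image (list of rows) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def subtract_mean(image):
    """Scale 8-bit pixel values to [0, 1], then subtract the image mean (assumed per image)."""
    scaled = [[p / 255.0 for p in row] for row in image]
    mean = sum(map(sum, scaled)) / (len(scaled) * len(scaled[0]))
    return [[p - mean for p in row] for row in scaled]


def to_three_channels(image):
    """Repeat each grayscale value over 3 channels (height x width x 3)."""
    return [[[p, p, p] for p in row] for row in image]


# Synthetic 300 x 400 gradient standing in for a radiograph.
raw = [[(r + c) % 256 for c in range(400)] for r in range(300)]
x = to_three_channels(subtract_mean(resize_nearest(raw)))
print(len(x), len(x[0]), len(x[0][0]))  # 100 100 3
```

Repeating the grayscale value over 3 channels is one common way to feed radiographs to networks pre-trained on RGB images.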
Architecture of the deep convolutional neural network (CNN):
In traditional machine learning, the first step was to select and extract proper features from a dataset by experts. Currently, the deep learning algorithms perform this task by themselves and then perform classification. These algorithms are designed to process data sets that are in the form of arrays, such as image sets. The CNN architecture is composed of multiple steps. CNNs are highly popular and efficient in image processing and computer vision. They are composed of three main layers: the convolution layer, the pooling layer, and the fully-connected layer. These layers are stacked in different ways to extract features, leading to variations in the CNN architectures, which are a property of deep learning [22-24]. A pre-trained deep learning CNN architecture was used in the present study to ensure higher performance. These architectural models included VGG19, ResNet50, GoogleNet, and EfficientNet, which recently showed high efficacy [25-27].
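The three layer types named above can be sketched in plain Python to show how data flow through them. This is a didactic toy with a hand-picked kernel and weights (all hypothetical), not the pre-trained networks used in the study, which learn their kernels from data.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, as computed by a convolution layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Element-wise ReLU activation on a 2D feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete windows at the edges are dropped."""
    return [
        [max(fmap[r + i][c + j] for i in range(size) for j in range(size))
         for c in range(0, len(fmap[0]) - size + 1, size)]
        for r in range(0, len(fmap) - size + 1, size)
    ]


def fully_connected(features, weights, bias):
    """Single-output fully-connected layer: weighted sum plus bias."""
    return sum(f * w for f, w in zip(features, weights)) + bias


# Toy 6 x 6 image with a vertical edge, and a vertical-edge kernel.
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(6)]
kernel = [[-1.0, 0.0, 1.0]] * 3
pooled = max_pool(relu(conv2d(image, kernel)))
flat = [v for row in pooled for v in row]
score = fully_connected(flat, weights=[0.25] * 4, bias=-1.0)
print(pooled, score)
```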
The VGG19 model is an extensive model, which can correctly extract features from the collected data for diagnosis and classification of images. It performs better than other models when provided with large datasets with complex diagnostic tasks. The VGG19 network is a deep network introduced in 2014. It is composed of 19 weight layers (16 convolutional layers and 3 fully connected layers), with 5 max-pooling layers [28, 29].
The ResNet model is an improved version of CNN with a high number of layers. This model was presented by Wen L et al. [30]. The ResNet model tries to solve the saturation and accuracy degradation that occur during training of very deep CNNs. ResNet50 is a residual network with 50 layers. Shortcut (skip) connections between layers allow the network to be deepened in ResNet models [31].
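The idea of a shortcut connection can be shown with a minimal residual block on a feature vector. Real ResNet50 blocks use convolutions and batch normalization; the fully connected form and all names below are simplifying assumptions for illustration.

```python
def relu(v):
    """Element-wise ReLU on a vector."""
    return [max(0.0, a) for a in v]


def linear(v, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """y = ReLU(F(x) + x), where F(x) = W2 * ReLU(W1 * x + b1) + b2."""
    fx = linear(relu(linear(x, w1, b1)), w2, b2)
    return relu([f + a for f, a in zip(fx, x)])  # shortcut adds the input back


x = [1.0, 2.0, 0.5]
zeros = [[0.0] * 3 for _ in range(3)]
y = residual_block(x, zeros, [0.0] * 3, zeros, [0.0] * 3)
print(y)  # [1.0, 2.0, 0.5]
```

With all weights at zero the block simply passes its (non-negative) input through, which is why adding more residual blocks does not have to degrade what earlier layers have already learned.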
GoogleNet, also known as Inception, has been released in several versions from V1 to V4. The GoogleNet model is a deep CNN structure developed by Google in 2014. This architecture is a type of CNN extensively used for image detection and classification tasks. Its Inception module allows the model to extract features at different scales and positions. This model uses decreasing coefficients to decrease the number of filters in each layer, enabling faster calculation and reducing the risk of over-fitting [32, 33].
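If the filter reduction described above is read as the 1 x 1 "bottleneck" convolutions placed before the larger kernels of an Inception module (an interpretation, not something the text states), its effect on model size can be checked with simple arithmetic. The channel counts below are illustrative values and the function name is hypothetical.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights (biases ignored) in a square convolution layer."""
    return kernel_size * kernel_size * in_channels * out_channels


in_ch, out_ch, reduced = 192, 32, 16  # illustrative channel counts

# 5 x 5 convolution applied directly to all input channels.
direct = conv_weights(5, in_ch, out_ch)

# 1 x 1 convolution first reduces the channels, then the 5 x 5 convolution runs.
with_reduction = conv_weights(1, in_ch, reduced) + conv_weights(5, reduced, out_ch)

print(direct, with_reduction)  # 153600 15872
```

In this example the reduced branch needs roughly one tenth of the weights, which is the sense in which fewer filters give faster calculation and less room for over-fitting.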
The EfficientNet model also uses the CNN algorithm. Its architecture emphasizes improvement of model efficiency and accuracy. EfficientNet comprises 8 different models, from B0 to B7. Model scaling uses three parameters: depth, width, and resolution. Depth is the number of layers in the network. Width indicates the number of channels in each layer. Resolution is the size of the input images on which the model is trained. Limited studies performed classification with EfficientNet [26,34,35]. Unlike the existing CNN models, it uses a novel activation function known as Swish instead of ReLU. EfficientNet also provides more efficient results compared with other advanced models by uniform scaling of depth, width, and resolution while down-sizing the model. The first step in the compound scaling approach is to find, under fixed resource constraints, the relationship between the different scaling dimensions of the base network. Thus, in this model, a suitable scale coefficient is calculated for depth, width, and resolution. Coefficients are then applied to scale the base network to the desired target network. In EfficientNet, feature maps are produced by consecutive convolutional layers, where each layer derives increasingly complex and abstract representations of the input data. In general, in EfficientNet, feature maps are produced through the primary convolutional layers and MBConv blocks, and record different aspects of image data. Classification is performed after feature extraction [36, 37].
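Compound scaling and the Swish activation can be sketched as follows. The base coefficients (1.2, 1.1, 1.15) are the values reported in the original EfficientNet publication and are an assumption here, since this article does not list them; the compound coefficient `phi` and the function names are illustrative.

```python
import math

# Base coefficients from the original EfficientNet publication (assumed, not from this study).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def swish(x):
    """Swish activation x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


# Computational cost grows roughly with depth * width^2 * resolution^2,
# so the base coefficients are chosen to make this product close to 2.
cost_factor = ALPHA * BETA ** 2 * GAMMA ** 2

for phi in range(4):
    depth, width, resolution = compound_scale(phi)
    print(phi, round(depth, 3), round(width, 3), round(resolution, 3))
print(round(cost_factor, 3), swish(0.0))
```

A single coefficient therefore enlarges all three dimensions together, instead of deepening or widening the network in isolation.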
Statistical analysis and testing:
Testing environment:
The Python programming language was used for this purpose. The Keras and TensorFlow libraries were used to implement the deep learning CNN models. Python can be run in different environments such as Jupyter, Spyder, PyCharm online, and Google Colab.
Since deep learning algorithms require large computation and storage capacity, and since the data collected from the Picture Archiving and Communication System of the dental school were saved in Google Drive after labeling, which Google Colab can access, Google Colab was used in this study. When using different models, several hyper-parameters may be adjusted to optimize performance. The specific values of hyper-parameters may significantly vary depending on the dataset, task, and computational resources. Figure 1 shows some hyper-parameters used for training and testing of the main study models including VGG19, GoogleNet, ResNet, and EfficientNet.
The size of input images for all models before the activation function layer was 100 x 100 pixels with 3 channels. The parameters that were set for each model in this study are presented in Figure 2.

Figure 1. Hyperparameters equally used by all models in this study. Learning rate, batch size, number of epochs, loss function and other parameters mentioned here were set equally for all models

Confusion matrix:
Confusion matrix, also known as the error matrix, is used for machine learning and statistical classification. This matrix has four variables for model assessment in the software: TP, which indicates correct detection of carious teeth, FP indicating incorrect detection of a sound tooth as carious, TN indicating correct detection of sound teeth, and FN indicating incorrect detection of a carious tooth as sound [38]. Assessment was performed using 1200 images independent of those used for training. The confusion matrix for VGG19, GoogleNet, ResNet, and EfficientNet is shown in Figure 3.
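As a minimal sketch of how the four counts are tallied and turned into the reported indices, the following uses the class coding of this study (0: sound, 1: carious). The label lists are made-up toy data, and the function names are hypothetical rather than part of the study software.

```python
def confusion_counts(y_true, y_pred):
    """Tally TP, FP, TN, FN for binary labels (1: carious, 0: sound)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute the standard diagnostic indices from the confusion-matrix counts."""
    sensitivity = safe_div(tp, tp + fn)
    precision = safe_div(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": safe_div(tn, tn + fp),
        "precision": precision,
        "npv": safe_div(tn, tn + fn),
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy labels for ten teeth (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
metrics = diagnostic_metrics(*counts)
print(counts)  # (4, 1, 4, 1)
```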

Figure 2. Section A is the list of parameters used by the VGG19 model. Section B is the list of parameters used by the GoogleNet model. Section C is the list of parameters used by the ResNet model. Section D is the list of parameters used by the EfficientNet model to detect dental caries

Results

Figure 4 shows the diagram of the EfficientNet model structure for caries detection. The first step is image input, followed by a series of blocks containing convolutions with 3 x 3 and 5 x 5 kernels. After processing by the last convolutional block, feature mapping is performed, followed by classification to determine the output (sound/carious). Table 1 compares the accuracy, sensitivity, specificity, precision (PPV), NPV, and F1-score of the MobileNetV2, VGG16, VGG19, ResNet50, EfficientNet, and GoogleNet models for caries detection after calculation of the confusion matrix. As presented in Table 1, EfficientNet had a superior performance for caries detection and yielded the best results with 94.7% accuracy for caries detection.

Figure 3. Confusion matrix for each model used in this study. Section A is the confusion matrix for the VGG19 network. Section B is the confusion matrix for the ResNet network. Section C is the confusion matrix for the GoogleNet network. Section D is the confusion matrix for the EfficientNet network
Figure 4. EfficientNet model diagram used in this study consists of seven blocks which are shown in different colors. The basic building block of EfficientNet-B0 is a mobile inverted bottleneck convolution (MBConv), while each MBConv block is shown with the corresponding kernel filter size. Finally, after block 7 and feature map, the model classifies dental image as sound or carious
Table 1. Comparison of the accuracy, precision, PPV, NPV, sensitivity, specificity, and F1-score of different models on the datasets

Discussion

Visual inspection is always the first step in assessment of dental caries. However, visual-tactile examination is not sufficient for detection of interproximal and occlusal caries. Thus, radiography is conventionally requested for assessment of caries, since it can provide additional information about the clinical progression of caries. Different radiographic modalities may be used for caries detection. However, posterior bitewing, periapical, and panoramic radiographic modalities are most commonly requested by operative dentists. It should be noted that radiography alone is not suitable for caries detection since it cannot differentiate between cavitated and non-cavitated, and active or arrested caries [39,40].
In the present study, deep CNNs were used for caries detection by machine learning. EfficientNet showed a superior performance; however, its performance should be assessed over long periods of time, and the model may need to be trained again with a higher number of new images [41,42]. In general, training of deep CNNs is difficult due to vanishing gradients. In this study, this problem was addressed by proper initialization, batch normalization, using pretrained models, and pretraining. Also, the rnd.next (50000,90000) function of the C# programming language was used for random image selection.
Medical and dental tools and instruments are verified for use through clinical trials and by competent authorities, such as under the EU standards and by the Food and Drug Administration. Nonetheless, AI programs for health have not been supported by robust evaluations. Randomized controlled trials that can reveal the benefits and effectiveness of AI for medical and dental diagnoses in the clinical setting are in their initial steps, and limited studies are available in this regard. For instance, the optimal efficacy of AI for caries detection has been previously reported. A randomized controlled clinical trial by Mertens et al. [43] was conducted in 2021 in Germany. They randomly selected 140 bitewing radiographs of patients and randomly classified them into 7 groups (n=20). Seven dentists were asked to detect caries with and without using an AI tool. The results revealed that AI can increase the diagnostic accuracy of dentists, mainly through enhancement of sensitivity in detection of enamel lesions [43]. Future controlled clinical trials are required to evaluate the model suggested in this study for caries detection. Also, to improve the model, novel models such as ConvNeXt should be evaluated and compared for caries detection [44]. Furthermore, future studies are recommended on other models such as the deep vision transformer neural networks presented by Google, in which adjacent pieces of an image are used consecutively as input (such as words in a sentence), with a self-attention mechanism instead of convolutional layers [45]. Additionally, enlarging the training and test datasets would improve the system performance. Also, the model should be verified in other universities and settings to increase the generalizability of the AI tool for caries detection.

Conclusion

The present results showed that the EfficientNet model was the best deep learning model for caries detection within a specific time period using a specific dataset obtained from Shahid Beheshti Dental School, and suggest that refining EfficientNet and its new versions can improve its performance in caries detection.

Type of Study: Original article | Subject: Restorative Dentistry

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.