1 Introduction

SARS-CoV-2, known as the novel coronavirus, is a communicable virus that belongs to the family of coronaviruses. SARS-CoV-2 is 80% similar to SARS-CoV-1 in terms of genetic sequence, and both originated from bats [1, 2]. The disease caused by SARS-CoV-2 is known as COVID-19. The first infected case was reported in the city of Wuhan, China, in December 2019. Thereafter, COVID-19 was declared a global pandemic on 11 March 2020 [3]. The virus has since spread rapidly across the whole world. As of 12 December 2020, more than 20 million persons have been infected and more than 1.6 million persons have died due to this deadly virus [4]. These numbers continue to rise, with the possibility of a second wave of coronavirus. Resources such as ICU beds, ventilators, and protective kits are becoming exhausted as the number of COVID-19 patients increases. The main symptoms of COVID-19 are fever, cough, loss of smell, fatigue, and shortness of breath [5, 6]. In some cases, infected persons may be asymptomatic. Due to the contagious nature of the virus, early detection of COVID-19 is required. Currently, coronavirus infection is tested through Reverse Transcription-Polymerase Chain Reaction (RT-PCR). However, the RT-PCR test is time-consuming and has a high false-negative rate [7]. To resolve these problems, radiographic imaging techniques such as computed tomography (CT) and X-ray are used [8]. These techniques are used to quantify the severity of the lung infection. CT provides better sensitivity than X-ray and RT-PCR [9]. The analysis of CT scans requires radiological experts, and the pandemic has placed a heavy burden on them. To alleviate this issue, deep learning techniques are utilized. Pre-trained deep learning models are used for the automatic analysis of chest CT scans [10, 11]. However, the sensitivity and specificity of pre-trained deep learning models are still far from optimal, while the manual analysis of CT scans is time-consuming and error-prone.

The main contributions of this paper are three-fold as follows:

  1. To address the low sensitivity of RT-PCR, chest CT images are utilized in this paper to classify suspected subjects as COVID-19 (+), tuberculosis, pneumonia, or healthy.

  2. An automated COVID-19 screening model is implemented by ensembling deep transfer learning models, namely densely connected convolutional networks (DCCNs), ResNet152V2, and VGG16.

  3. The proposed ensemble model outperforms the competitive models in terms of various performance measures such as accuracy, f-measure, area under curve, sensitivity, and specificity.

The rest of this paper is organized as follows. Section 2 presents related works. Section 3 presents the proposed deep learning model. The experimental results and discussion are presented in Section 4. Concluding remarks are drawn in Section 5.

2 Related works

Mukherjee et al. [12] presented a lightweight CNN model for the classification of COVID-19 infected persons. Chest CT scans and X-rays were used to validate the model, which achieved an accuracy of 96.28%. The model could be further improved by reducing its computational time. Singh et al. [13] developed an automatic approach for chest CT-based COVID-19 classification. They used a CNN model to determine whether a person is infected with coronavirus or other viruses. A multi-objective function was designed by considering specificity and sensitivity. Their model outperformed the other competitive algorithms in terms of accuracy, sensitivity, specificity, and F-measure. Wang et al. [14] proposed a novel learning framework for the identification of COVID-19 patients. They redesigned COVID-Net by modifying its architecture and learning strategy. The framework improved on COVID-Net by 12.6% in terms of accuracy. However, its performance could be enhanced by utilizing transfer learning. Javaheri et al. [15] implemented CovidCTNet for the detection of coronavirus infection in patients using CT scans. The performance of CovidCTNet was tested on 89,145 slices from the CT scans of 287 patients. The accuracy obtained from CovidCTNet was 90%. Mishra et al. [16] proposed a fusion approach that integrated the predictions of existing deep CNN models. The approach was trained on the COVID-CT dataset and attained an accuracy of 86%. However, it was only applied to axial slices. Ahuja et al. [17] implemented a three-phase deep learning model for the identification of COVID-19 patients using CT scans. Data augmentation was applied to the CT scans. Four pre-trained deep architectures, namely ResNet18, ResNet50, ResNet101, and SqueezeNet, were used for disease classification. They reported that the accuracy obtained from ResNet18 was 99.65%.

Liu et al. [18] designed a lesion-attention deep neural network to predict COVID-19 in chest CT scans. The weights of VGG16, ResNet18, ResNet50, DenseNet121, DenseNet169, and EfficientNet were utilized in their approach. The approach was trained on 746 chest CT scans and attained a classification accuracy of 88.6%. Wang et al. [19] developed an automatic deep learning framework for discriminating coronavirus infected persons from CT scans. UNet was trained for lung segmentation, and this pre-trained network was used to obtain the lung mask for CT scans. The resulting DeCoVNet was trained on 499 CT scans and achieved a classification accuracy of 90.1%. The main limitations of this model are its inadequate use of temporal information and its application to only a small dataset. Harmon et al. [20] implemented a 3D anisotropic hybrid network architecture for the classification of coronavirus infected persons. DenseNet121 was used to segment 3D lung regions from the chest CT scan. The sensitivity and specificity obtained from this model were 84% and 93%, respectively. However, a small dataset was used for the validation process. Xu et al. [21] presented an automatic screening model for COVID-19 in CT scans. This model was tested on 618 CT scans and attained an accuracy of 86.7%. However, the manifestations of other pneumonia may overlap with those of COVID-19, which greatly affected the performance of this model. Bai et al. [22] designed an artificial intelligence system to discriminate COVID-19 from other pneumonia. EfficientNet B4 was used in this system for classification. This system was tested on the CT scans of 1186 patients and achieved an accuracy of 96%. However, the approach was validated on a limited sample size. Han et al. [23] proposed an attention-based deep learning model for the screening of COVID-19. An attention mechanism and deep multiple instance learning were utilized in this model. The classification accuracy of this model was 94.3%.

Li et al. [24] developed the COVNet model for the classification of coronavirus infected patients using CT scans. The pre-trained ResNet50 deep learning model was used in COVNet. COVNet was trained on 4352 chest CT scans. The sensitivity and specificity obtained from COVNet were 90% and 96%, respectively. Ni et al. [25] proposed a model that consists of 3D UNet and MVPNet. 3D UNet was used to extract the lesions in a given CT scan. 19,291 CT scans were used for experimentation. The classification accuracy obtained from this model was 94%. Xiao et al. [26] implemented a pre-trained ResNet34 to predict the severity of COVID-19 using chest CT scans. 23,812 CT scans were used for experimentation. This model attained accuracies of 97.4% and 81.9% on the training and testing sets, respectively. The main limitations of this model are biases in the results due to the small sample size and a lack of transparency and interpretability. Pu et al. [27] developed an automatic approach to detect and quantify the pneumonia regions of CT scans associated with COVID-19. The UNet framework was used to segment the lung boundary, which was further used to assess the progression of the disease. They used both deep learning and computer vision techniques. The sensitivity and specificity obtained from this approach were 95% and 84%, respectively. Jaiswal et al. [28] proposed an automatic tool for the detection of COVID-19 using chest CT scans. The pre-trained DenseNet201 model was used to classify patients as coronavirus infected or not. The features were extracted from the DenseNet201 model using the weights learned from the ImageNet dataset. The classification accuracy obtained from this model was 97%.

The problems associated with the above-mentioned techniques can be addressed by developing a novel deep learning framework to classify COVID-19 and monitor the progression of its severity. Therefore, a novel deep learning framework is developed in this paper.

3 Proposed ensemble deep transfer learning model

This section discusses the proposed ensemble model for screening of suspected patients.

3.1 Motivation

Recently, many researchers have started working on the diagnosis of suspected COVID-19 infected persons using deep transfer learning models. These models have shown significantly better results than RT-PCR, especially for rapid early-stage classification. However, these models still suffer from over-fitting. Additionally, some models are effective for COVID-19 diagnosis but not for similar diseases such as tuberculosis and pneumonia. Therefore, in this paper, an ensemble DCCNs is designed to classify suspected subjects into four classes: COVID-19 (+), tuberculosis, pneumonia, and healthy.

3.2 Ensemble DCCNs

In this paper, the ResNet152V2 [29], DenseNet201 [30], and VGG16 [31] models are considered to build the ensemble model. As shown in prior work, ensembling of models [32, 33] can yield better results, since it can boost the feature extraction and accuracy of the constituent models [34]. Figure 1 shows the ensemble DCCNs. 64 neurons are used for the initial dense layer [35, 36]. A fine-tuned, pre-trained transfer learning model with many layers is used to extract the features. A softmax activation function is utilized. The models are trained for 10 epochs with a batch size of 10. During the tuning of the initial attributes, fully connected layers [37, 38] with 64 neurons, along with dropout rates of 0.3 and 0.2, respectively, are used. To overcome the over-fitting issue, regularization [39, 40] is also performed through early stopping. The learning rate is set to 0.00001.
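The early-stopping rule mentioned above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the patience value, the `min_delta` threshold, the function name `early_stopping_epoch`, and the validation losses are all hypothetical, since the paper does not report them.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for per-epoch validation losses.

    Epochs are numbered from 1. Training stops once the loss has failed to
    improve by more than min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # New best loss: remember it and reset the patience counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # The loss kept improving often enough; all epochs were used.
    return len(val_losses), best_epoch


# Hypothetical validation losses for a 10-epoch run (assumed values).
losses = [0.90, 0.60, 0.45, 0.40, 0.41, 0.42, 0.39, 0.38, 0.37, 0.36]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch, best_epoch)
```

With these assumed losses, training halts at the second consecutive epoch without improvement, and the weights from the best epoch would be the ones retained.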

Fig. 1 Proposed ensemble densely connected convolutional neural networks
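One common way to combine the outputs of several classifiers is soft voting, in which the class probabilities of the individual models are averaged. The paper does not state which combination rule is used, so the following NumPy sketch is only an assumption; the function `soft_vote` and the probability values are hypothetical.

```python
import numpy as np

CLASSES = ["COVID-19 (+)", "tuberculosis", "pneumonia", "healthy"]


def soft_vote(prob_list, weights=None):
    """Average per-model class probabilities and pick the most likely class.

    prob_list: list of arrays of shape (n_samples, n_classes), one per model.
    weights:   optional per-model weights (assumed uniform when omitted).
    """
    probs = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    avg = np.average(probs, axis=0, weights=weights)
    return avg, avg.argmax(axis=1)


# Hypothetical softmax outputs of the three base models for two CT scans.
densenet201 = np.array([[0.70, 0.10, 0.15, 0.05], [0.20, 0.30, 0.40, 0.10]])
resnet152v2 = np.array([[0.60, 0.20, 0.10, 0.10], [0.10, 0.50, 0.30, 0.10]])
vgg16 = np.array([[0.50, 0.10, 0.30, 0.10], [0.15, 0.45, 0.30, 0.10]])

avg, labels = soft_vote([densenet201, resnet152v2, vgg16])
for row, label in zip(avg, labels):
    print(CLASSES[label], np.round(row, 3))
```

In the second scan, one model favours pneumonia while the other two favour tuberculosis; averaging resolves the disagreement in favour of the class with the highest mean probability.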

3.3 Layers of DCCNs

A. Pooling layers

Pooling layers are utilized to achieve multi-scale analysis. These layers also reduce the size of the input data for feature reduction. Average and max pooling layers are extensively used in CNNs. An average pooling layer (La) is utilized in this paper and can be computed as [41]:

$$ L_{a} = \frac{\sum d_{i}}{|d_{i}|} $$
(1)

Here, i denotes the pooling region, di the set of activations in i, and |di| its cardinality, so that the numerator sums the activations in the region. The activation set can be written as:

$$ d_{i} = \{\alpha_{j} : j \in i\} $$
(2)

where αj is the activation at position j of the pooling region i.
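As an illustration of Eqs. (1) and (2), the following NumPy sketch averages the activations of each non-overlapping pooling region of a small feature map. The 2 x 2 region size, the function names, and the feature-map values are assumptions made for the example only.

```python
import numpy as np


def average_pool_region(region):
    """Eq. (1): sum of the activations in one region divided by their count."""
    d = np.asarray(region, dtype=float).ravel()  # activation set d_i
    return d.sum() / d.size


def average_pool_2d(feature_map, size=2):
    """Apply Eq. (1) to every non-overlapping size x size region."""
    h, w = feature_map.shape
    if h % size or w % size:
        raise ValueError("feature map must be divisible by the pool size")
    # Split rows and columns into blocks, then average inside each block.
    blocks = feature_map.reshape(h // size, size, w // size, size)
    return blocks.mean(axis=(1, 3))


# Hypothetical 4 x 4 feature map.
fmap = np.array([[1.0, 2.0, 5.0, 6.0],
                 [3.0, 4.0, 7.0, 8.0],
                 [0.0, 0.0, 1.0, 1.0],
                 [0.0, 4.0, 1.0, 1.0]])
pooled = average_pool_2d(fmap)
print(pooled)
```

The 4 x 4 map is reduced to 2 x 2, which shows how pooling shrinks the input while keeping one summary value per region.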

B. Softmax and fully connected layer

The fully connected layer has full connections to all the neurons. Its input is multiplied by its weight matrix to obtain the output. Additionally, softmax is used as an activation function. Fully connected layers rely on conditional probability (for more details, please see [41]). Softmax (Pty(C,b)) defines the probability that a subject b belongs to a given class C. It can be computed as:

$$ P_{ty}(C,b) = \frac{P_{ty}(b,C)\times P_{ty}(C)}{{\sum}_{N=1}^{C}P_{ty}(N)\times P_{ty}(b,N)} $$
(3)

Here, Pty(C) defines the class probability and Pty(b,C) defines the probability of b given class C. The sum in the denominator runs over all classes N, with its upper limit denoting the total number of classes.

Equation (3) can be rewritten as:

$$ softmax= P_{ty} (C,b)=\frac{exp(\beta^{C}[b])}{{\sum}_{N=1}^{C}exp(\beta^{N}[b])} $$
(4)

where

$$ \beta^{C}= \ln [P_{ty}(b,C)\times P_{ty} (C)] $$
(5)
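The equivalence between Eq. (3) and Eqs. (4) and (5) can be checked numerically. In the NumPy sketch below, the class priors and likelihoods are hypothetical values, and subtracting the maximum inside `softmax` is a standard numerical-stability step that is not part of the equations.

```python
import numpy as np


def softmax(beta):
    """Eq. (4): exponentiate the scores and normalise over all classes."""
    z = beta - np.max(beta)  # stability shift; leaves the result unchanged
    e = np.exp(z)
    return e / e.sum()


# Hypothetical values for four classes (assumed for illustration).
prior = np.array([0.20, 0.25, 0.30, 0.25])       # P_ty(C)
likelihood = np.array([0.60, 0.10, 0.20, 0.10])  # P_ty(b, C)

# Eq. (3): normalise prior times likelihood over all classes.
posterior_bayes = prior * likelihood / np.sum(prior * likelihood)

# Eq. (5) followed by Eq. (4): softmax of the log of prior times likelihood.
beta = np.log(prior * likelihood)
posterior_softmax = softmax(beta)

print(np.round(posterior_bayes, 4))
print(np.round(posterior_softmax, 4))
```

Both routes give the same class probabilities, which is why the softmax output can be read as a posterior over the classes.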

C. Dropout

Dropout is utilized to regularize CNNs by randomly setting the outputs of neurons in the hidden layers to 0 during training. The dropped neurons have no effect on back-propagation during that training step. This helps to prevent over-fitting [42].
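A minimal NumPy sketch of dropout is given below for illustration. It uses the common "inverted" formulation, in which the surviving activations are rescaled by 1/(1 - rate) so that their expected value is unchanged; this rescaling, the function name, and the input array are assumptions and are not taken from the paper. The rate of 0.3 matches one of the dropout values reported in Section 3.2.

```python
import numpy as np


def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` during training."""
    if not training or rate == 0.0:
        return activations  # dropout is disabled at inference time
    keep = rng.random(activations.shape) >= rate  # True for kept neurons
    # Inverted dropout: rescale kept activations to preserve the expectation.
    return activations * keep / (1.0 - rate)


rng = np.random.default_rng(0)
x = np.ones((1000, 64))  # hypothetical outputs of a 64-neuron dense layer
y = dropout(x, rate=0.3, rng=rng)

print("fraction dropped:", (y == 0).mean())
print("mean activation :", y.mean())
```

Roughly 30% of the outputs are zeroed, while the mean activation stays close to its original value of 1.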

D. Rectified linear units

Rectified linear units (ReLU) can be computed as:

$$ F_{relu}(\alpha)=max(0,\alpha) $$
(6)

ReLU takes less time to compute than sigmoidal functions. It also mitigates the poor gradient-based training performance that CNNs exhibit when sigmoidal units saturate. As a result, it can provide faster convergence of stochastic gradient descent [41].
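The saturation argument can be made concrete by comparing the gradients of ReLU (Eq. (6)) and the logistic sigmoid at a few inputs. The sketch below is illustrative only; the input values are arbitrary, and the ReLU gradient at exactly 0 is taken to be 0 by convention.

```python
import numpy as np


def relu(a):
    """Eq. (6): element-wise max(0, a)."""
    return np.maximum(0.0, a)


def relu_grad(a):
    """Gradient of ReLU: 1 for positive inputs, 0 otherwise."""
    return (a > 0).astype(float)


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def sigmoid_grad(a):
    s = sigmoid(a)
    return s * (1.0 - s)


a = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print("relu        :", relu(a))
print("relu grad   :", relu_grad(a))
print("sigmoid grad:", np.round(sigmoid_grad(a), 6))
```

For large positive inputs the sigmoid gradient is nearly zero, whereas the ReLU gradient remains 1, so the error signal is not attenuated in the active region.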

4 Performance analysis

To evaluate the performance of the proposed ensemble model, the experiments are carried out in MATLAB 2020b on a 64-bit, 8-core machine with 32 GB of RAM. The competitive models are also implemented on the same platform and dataset.

4.1 Dataset

Four-class chest CT images are collected from various sources for COVID-19 [43], pneumonia [44, 45], and tuberculosis [13, 46]. The collected dataset contains 2373 images of COVID-19 infected patients, 2890 images of pneumonia infected patients, 3193 tuberculosis images, and 3038 images of healthy subjects. Figure 2 shows a view of the chest CT image dataset. Of the total images, 65% are utilized for training and the remaining 35% are used for testing.

Fig. 2 A view of chest-CT scanned images dataset
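The size of the training and testing sets implied by the 65% split can be worked out from the reported class counts. The Python sketch below assumes a single random, non-stratified split with the training size rounded down; the paper does not specify how the split was drawn, and the function name and seed are hypothetical.

```python
import numpy as np

# Class counts as reported in Section 4.1.
counts = {
    "COVID-19 (+)": 2373,
    "pneumonia": 2890,
    "tuberculosis": 3193,
    "healthy": 3038,
}
total = sum(counts.values())


def split_indices(n, train_fraction=0.65, seed=0):
    """Shuffle the indices 0..n-1 and cut them into training and testing parts."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(train_fraction * n)  # rounded down (assumption)
    return perm[:n_train], perm[n_train:]


train_idx, test_idx = split_indices(total)
print(total, len(train_idx), len(test_idx))
```

Under these assumptions, the 11,494 images are divided into 7471 training images and 4023 testing images.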

4.2 Comparative analyses

Figure 3 shows the accuracy and loss analysis of the proposed ensemble model on the training dataset. Binary cross-entropy is used as the loss function. The figure clearly shows that the proposed ensemble model converges to the optimal results after 33 iterations and the 6th epoch. The proposed ensemble model thus converges at a good speed.

Fig. 3 Accuracy and loss analysis of the proposed ensemble model on training dataset

Figure 4 shows the confusion matrix of the classification results obtained from the proposed ensemble model on the testing dataset. The proposed ensemble model achieves an average accuracy of 99.2% on the training dataset. Thus, the proposed ensemble model provides significantly better screening results for suspected subjects, as it does not suffer much from over-fitting.

Fig. 4 Confusion matrix of the proposed ensemble model on the testing dataset

Table 1 shows the comparative analysis of the proposed and the competitive deep transfer learning models in terms of accuracy, f-measure, area under curve, sensitivity, and specificity. Bold values indicate the best results. The proposed ensemble model outperforms the competitive models in terms of accuracy, f-measure, area under curve, sensitivity, and specificity by 1.2738%, 1.3274%, 1.8372%, 1.283%, and 1.8382%, respectively.
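For reference, accuracy, sensitivity, specificity, and f-measure can all be derived from a multi-class confusion matrix by treating each class in turn as the positive class. The NumPy sketch below uses a hypothetical 4 x 4 matrix, not the results of this paper, and assumes that rows hold the true classes and columns the predicted classes. Area under curve is omitted because it requires the predicted scores rather than the confusion matrix.

```python
import numpy as np


def classification_metrics(cm):
    """Per-class metrics from a confusion matrix (rows: true, cols: predicted)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp  # samples of the class predicted as another class
    fp = cm.sum(axis=0) - tp  # samples of other classes predicted as the class
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Hypothetical confusion matrix for the four classes (50 samples per class).
cm = [[48, 1, 1, 0],
      [2, 47, 1, 0],
      [0, 2, 48, 0],
      [0, 0, 0, 50]]
metrics = classification_metrics(cm)
print("accuracy   :", metrics["accuracy"])
print("sensitivity:", np.round(metrics["sensitivity"], 4))
print("specificity:", np.round(metrics["specificity"], 4))
print("f-measure  :", np.round(metrics["f_measure"], 4))
```

Averaging the per-class values gives macro-averaged figures, which is one common way of reporting a single number per measure for a four-class problem.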

Table 1 Testing analysis (in %) of the proposed automated screening model

Table 2 shows the performance analysis of the proposed automated screening model over the existing models. The results reveal that the proposed ensemble model outperforms the existing models. Additionally, most of the existing work has focused on binary or three-class classification of chest CT scan datasets. Therefore, besides its accuracy, the proposed ensemble model can handle a four-class problem in an efficient manner.

Table 2 Performance analysis of the proposed automated screening model over the existing models

5 Conclusion

In this paper, an ensemble deep learning model was proposed for COVID-19 classification in chest CT scan images. The proposed ensemble model utilized three well-known models, namely DCCNs, ResNet152V2, and VGG16. The proposed ensemble model is able to handle the sensitivity issue associated with RT-PCR. It has been tested on a large chest CT dataset and compared with fifteen competitive models. Experimental results reveal that the proposed ensemble model outperforms the existing models in terms of accuracy, f-measure, area under curve, sensitivity, and specificity by 1.2738%, 1.3274%, 1.8372%, 1.283%, and 1.8382%, respectively.