Enhancing lung cancer diagnosis with data fusion and mobile edge computing using DenseNet and CNN

The recent advancements in automated lung cancer diagnosis through the application of Convolutional Neural Networks (CNN) on Computed Tomography (CT) scans have marked a significant leap in medical imaging and diagnostics. The precision of these CNN-based classifiers in detecting and analyzing lung cancer symptoms has opened new avenues in early detection and treatment planning. However, despite these technological strides, there are critical areas that require further exploration and development. In this landscape, computer-aided diagnostic systems and artificial intelligence, particularly deep learning methods like the region proposal network, the dual path network, and local binary patterns, have become pivotal. However, these methods face challenges such as limited interpretability, data variability handling issues, and insufficient generalization. Addressing these challenges is key to enhancing early detection and accurate diagnosis, fundamental for effective treatment planning and improving patient outcomes. This study introduces an advanced approach that combines a Convolutional Neural Network (CNN) with DenseNet, leveraging data fusion and mobile edge computing for lung cancer identification and classification. The integration of data fusion techniques enables the system to amalgamate information from multiple sources, enhancing the robustness and accuracy of the model. Mobile edge computing facilitates faster processing and analysis of CT scan images by bringing computational resources closer to the data source, crucial for real-time applications. The images undergo preprocessing, including resizing and rescaling, to optimize feature extraction. The DenseNet-CNN model, strengthened by data fusion and edge computing capabilities, excels in extracting and learning features from these CT scans, effectively distinguishing between healthy and cancerous lung tissues. The classification categories include Normal, Benign, and Malignant, with the latter further sub-categorized into adenocarcinoma, squamous cell carcinoma, and large cell carcinoma. In controlled experiments, this approach outperformed existing state-of-the-art methods, achieving an impressive accuracy of 99%. This indicates its potential as a powerful tool in the early detection and classification of lung cancer, a significant advancement in medical imaging and diagnostic technology.


Introduction
Lung cancer stands as one of the most significant health challenges in modern medicine. As a leading cause of cancer-related deaths globally, it affects millions of people each year, regardless of gender or background [1]. The significance of lung cancer lies not only in its prevalence but also in its impact on patients' lives and the healthcare system. One of the key challenges with lung cancer is its typically late diagnosis. Many patients present with advanced disease because early-stage lung cancer often does not cause noticeable symptoms [2]. This delay in detection significantly hampers effective treatment options, leading to lower survival rates. For instance, when lung cancer is diagnosed at an early stage, the 5-year survival rate can be as high as 56%, but this drops dramatically to about 5% for advanced stages [3]. Moreover, lung cancer presents a significant burden due to its associated healthcare costs and its impact on patients' quality of life [4,5]. The treatment for lung cancer, which may include surgery, chemotherapy, radiation therapy, or a combination of these, can be physically and emotionally taxing for patients and their families [6]. This is compounded by the fact that lung cancer treatment is often expensive, contributing to the economic burden on both individuals and healthcare systems [7].
The significance of lung cancer also extends into the realm of public health and prevention [8]. Many cases of lung cancer are linked to preventable causes, most notably smoking, which is the leading cause of lung cancer worldwide [9]. This emphasizes the importance of public health initiatives focused on smoking cessation and reducing exposure to lung cancer risk factors like secondhand smoke, air pollution, and occupational hazards. In recent years, advancements in research and technology, particularly in the field of early detection and targeted therapies, have offered new hope [10]. Improved screening techniques, such as low-dose CT scans for high-risk individuals, have the potential to detect lung cancer earlier, thus improving treatment outcomes [11]. Moreover, the development of targeted therapies and immunotherapies has revolutionized the treatment landscape for lung cancer, offering more personalized and effective treatment options [12]. The advancement of diagnostic techniques for lung cancer has become a pivotal area of research, especially considering the complexity and irregularity of radiographic images, which pose significant challenges for radiologists [13]. Lung cancer, characterized by the formation of malignant nodules, demands precise and early detection for effective treatment. Computed Tomography (CT) imaging has emerged as a superior diagnostic tool in this context, outperforming conventional radiography methods [14]. Its ability to accurately delineate the size and location of cancerous nodules has made it indispensable in the early detection of lung cancer. Notably, studies have shown that low-dose CT screening substantially enhances early-stage malignancy detection, leading to a remarkable 20.0% reduction in mortality rates, along with a higher incidence of positive screening outcomes [15].
However, the analysis of CT images is intricate and time-intensive, necessitating advanced computational assistance. In this vein, deep learning (DL) algorithms, particularly Convolutional Neural Networks (CNNs), have gained prominence. DL, a sophisticated branch of machine learning, leverages structured, hierarchical data processing to extract high-level summaries from complex datasets. In medical imaging, this approach has proven invaluable. By training on extensive datasets of CT scans, DL algorithms can identify and interpret patterns indicative of lung cancer, enabling them to predict the presence of the disease [16,17].
Deep learning, a subset of machine learning, is a method based on artificial neural networks, which are inspired by the structure and function of the human brain. Deep learning models are particularly adept at processing large volumes of data, learning complex patterns, and making informed decisions based on those patterns. These capabilities make deep learning highly effective for tasks involving image recognition, natural language processing, and, crucially, medical image analysis [18,19].
In the context of lung cancer classification, deep learning models, particularly Convolutional Neural Networks (CNNs), have become instrumental. CNNs are designed to automatically and adaptively learn spatial hierarchies of features from input images. This is achieved through a series of layers that mimic the human visual cortex, processing various aspects of the image, such as edges, textures, and complex patterns [20,21].

Image preprocessing
The first step in lung cancer classification using deep learning involves preprocessing the input images. In the case of lung cancer, these are typically CT scans. Preprocessing may include resizing images, enhancing contrast, and removing irrelevant noise. This standardization is crucial for the model to focus on relevant features.
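
As a minimal sketch of the resizing and rescaling steps, assuming a grayscale scan stored as nested lists, nearest-neighbour sampling, and 8-bit intensities (none of which is specified in the text), the hypothetical helpers `resize_nearest` and `rescale` could look like this:

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Resize with nearest-neighbour sampling (no interpolation)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def rescale(image: Image, max_value: float = 255.0) -> Image:
    """Map raw intensities from [0, max_value] to [0, 1]."""
    return [[pixel / max_value for pixel in row] for row in image]


# Toy 2x2 "scan" upsampled to 4x4, then rescaled to the unit interval.
scan = [[0, 255], [128, 64]]
prepared = rescale(resize_nearest(scan, 4, 4))
```

A production pipeline would more likely use an imaging library with bilinear or bicubic interpolation; the point here is only that every image reaches the network with the same shape and value range.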

Feature extraction
Once the images are preprocessed, the CNN begins the task of feature extraction. This involves the convolutional layers applying various filters to detect specific features like shapes, edges, and textures. Each layer extracts increasingly complex features: initial layers may identify simple edges, while deeper layers might recognize more complex structures pertinent to lung nodules.
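
To make the filtering step concrete, the sketch below applies one hand-written edge filter to a tiny image patch. It is illustrative only: a trained CNN learns its filter values, and the Sobel-style kernel, the toy patch, and the helper names `conv2d_valid` and `relu` are assumptions introduced here.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    As in most deep learning libraries, the kernel is not flipped, so this is
    technically cross-correlation.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_h)
                for j in range(k_w)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Keep positive responses and zero out negative ones."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# A dark-to-bright vertical boundary between columns 1 and 2.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Sobel-style kernel that responds to left-to-right intensity changes.
vertical_edge = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]
edges = relu(conv2d_valid(patch, vertical_edge))  # every window sees the edge
flat = relu(conv2d_valid([[1] * 4 for _ in range(4)], vertical_edge))  # no edge
```

The feature map is strongly positive wherever the window covers the boundary and zero on the uniform patch, which is the behaviour the early layers of a CNN exploit.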

Classification
After feature extraction, the model uses these features to classify the images. In lung cancer classification, this usually involves categorizing the scans into normal, benign, or malignant categories, and potentially further classifying types of lung cancer such as adenocarcinoma or squamous cell carcinoma. This is typically done using fully connected layers in the network that interpret the features extracted by the convolutional layers to make a classification decision.
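
The classification head can be sketched as a fully connected layer followed by a softmax. The two-element feature vector and the weights below are hand-picked for illustration and are not taken from the study; `dense` and `softmax` are hypothetical helper names.

```python
import math
from typing import List, Sequence

CLASSES = ["normal", "benign", "malignant"]


def dense(features: Sequence[float], weights: Sequence[Sequence[float]],
          bias: Sequence[float]) -> List[float]:
    """Fully connected layer: one weighted sum (logit) per class."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn logits into probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical feature vector and weights, for illustration only.
features = [0.2, 0.9]
weights = [[1.0, -1.0], [0.5, 0.0], [-1.0, 2.0]]
bias = [0.0, 0.0, 0.0]

probabilities = softmax(dense(features, weights, bias))
predicted = CLASSES[probabilities.index(max(probabilities))]
```

The predicted class is simply the one with the highest probability; the probabilities themselves can be reported alongside the label when a confidence estimate is useful.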

Model training and validation
Training a deep learning model for lung cancer classification involves feeding it a large dataset of labeled CT scans. The model learns by adjusting its internal parameters to minimize the difference between its predictions and the actual labels. Validation is done using a separate set of images not seen by the model during training to assess its accuracy and generalizability.
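
The train-then-validate loop can be illustrated at toy scale. The sketch below swaps the CNN for a one-feature logistic regression and uses synthetic, linearly separable data, so it only demonstrates the mechanics: parameters are updated by gradient descent on the training set, and accuracy is measured on held-out samples. The deterministic every-fourth-sample split is an assumption made for reproducibility; a real study would normally shuffle and stratify.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, int]  # (feature value, label 0 or 1)


def split(samples: List[Sample], every: int = 4):
    """Hold out every `every`-th sample for validation (deterministic)."""
    train = [s for i, s in enumerate(samples) if i % every != every - 1]
    val = [s for i, s in enumerate(samples) if i % every == every - 1]
    return train, val


def predict(w: float, b: float, x: float) -> float:
    """Sigmoid of the linear score, read as the probability of label 1."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train(samples: List[Sample], epochs: int = 200, lr: float = 0.5):
    """Gradient descent on the mean binary cross-entropy loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            error = predict(w, b, x) - y  # gradient of the loss w.r.t. the score
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / len(samples)
        b -= lr * grad_b / len(samples)
    return w, b


def accuracy(w: float, b: float, samples: List[Sample]) -> float:
    hits = sum((predict(w, b, x) >= 0.5) == bool(y) for x, y in samples)
    return hits / len(samples)


# Synthetic data: label 1 when the feature is positive, with a gap around 0.
data = [(x / 10.0, int(x > 0)) for x in range(-20, 21) if abs(x) >= 5]
train_set, val_set = split(data)
w, b = train(train_set)
val_accuracy = accuracy(w, b, val_set)
```

Because the validation samples never influence `w` and `b`, `val_accuracy` estimates how the fitted model behaves on unseen data.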

Interpretation and integration in clinical practice
The final step is interpreting the model's output in a clinical setting. This involves integrating the model's classifications with radiologists' expertise to ensure accurate diagnosis and treatment planning. The model's interpretability, or its ability to provide insight into why it made a certain classification, is crucial in gaining the trust of medical practitioners.
Deep learning in lung cancer classification offers the potential for highly accurate, efficient, and early detection of lung cancer, which is pivotal in improving patient outcomes [22]. However, challenges such as ensuring the model's generalizability across different populations and CT scan machines, and the need for large annotated datasets for training, remain key areas for ongoing research and development [23]. The efficacy of DL in medical imaging extends beyond lung cancer. Researchers have successfully applied these algorithms in the detection of various cancers, including prostate, neck and head, breast, skin, and even in the identification of brain tumors [24]. These achievements underscore the transformative potential of DL in medical diagnostics, providing a pathway for rapid, accurate, and efficient analysis of medical images.
The integration of DL in lung cancer diagnostics not only augments the accuracy of diagnoses but also significantly reduces the time required for radiologists to interpret complex CT images [25]. This synergy between advanced computational methods and traditional medical expertise is paving the way for a new era in cancer diagnostics, where precision and speed are paramount. As DL continues to evolve and improve, its applications in medical imaging are expected to expand, offering promising prospects for the early detection and treatment of various forms of cancer [26,27].
The main contributions of this study in the context of lung cancer detection and classification using the integrated Convolutional Neural Network (CNN) and DenseNet network approach can be summarized as follows:
• Innovative Integration of CNN
These contributions highlight the study's impact on advancing lung cancer diagnostics, potentially transforming current practices and offering a more effective, accurate, and reliable tool for radiologists in the early detection and classification of lung cancer.

Related work
In the rapidly evolving field of lung cancer detection and classification, a diverse array of methodologies harnessing the power of deep learning and artificial intelligence has been developed, each contributing uniquely to the advancement of diagnostic accuracy and efficiency.

Innovative deep learning strategies
Advanced deep learning models utilizing vessel filters, as proposed by Hendrix et al., have shown promise in enhancing detection precision for nodules as small as 3 mm, marking a significant step forward in identifying early-stage lung cancer [28].
Neil Jousha and team's development of a 3D deep CNN model represents a milestone in the field, achieving an exceptional 98% accuracy in nodule identification, setting a new standard in lung cancer diagnostics [29].

Combining knowledge-based systems and machine learning
The integration of knowledge-based systems with CNNs, as demonstrated by H. Guo and colleagues, offers a novel approach in predicting lung cancer-related mortality [30]. The use of data augmentation and SVM in conjunction with CNN showcases an innovative method of leveraging multiple AI techniques for enhanced predictive accuracy [31].

Exploring DenseNet and multiple view-based models
The employment of DenseNet CNN models by Quasar and team [32], and the introduction of multiple view-based deep CNNs by Munoz-Aseguinolaza and colleagues, underscore the ongoing efforts to refine lung cancer classification methods, achieving accuracy rates that inspire further research and application [33].

Region-based CNNs and deep reinforcement learning
The work of Mridha, Hussain Ali, and Iqbal, which uses a region-based CNN combined with active and self-paced learning methodologies, emphasizes the potential of specialized neural network training for improved lung cancer detection [34,35,36].
Chen's exploration of deep reinforcement learning models, such as DQN and H-DQN, opens new avenues in lung cancer classification, utilizing multilayer neural networks for sophisticated data analysis [37].

3D CNNs and ANN-based models for enhanced sensitivity
The development of three-dimensional CNNs and ANN-based models, as undertaken in studies [38] and [39], respectively, highlights the continuous evolution of deep learning architectures in achieving high sensitivity and accuracy in lung cancer detection.

MLP and automated approaches for comprehensive analysis
The use of MLP by Qing Gao, Li Y, and their colleagues for classifying mutations in lung nodules, together with other studies employing automated neural network systems for lung cancer diagnosis, illustrates the diverse applications of AI in understanding and interpreting complex medical data [40,41].

Transfer learning and specialized CNN models
The implementation of transfer learning-based models, such as Inception v3, and specialized CNN models for early-stage lung cancer detection, although facing challenges in accuracy, represent important steps towards developing adaptable and efficient diagnostic tools.

Diverse techniques for NSCLC and histopathology analysis
Research into NSCLC utilizing techniques such as median filtering, segmentation, and deep residual neural networks based on transfer learning, demonstrates the breadth of approaches being explored to tackle various subtypes of lung cancer [42,43].

Hybrid models and data integration for early detection
The innovative use of hybrid models like MSER-SURF by Moitra et al., and the integration of laboratory and clinical data for early detection, signify the growing trend of combining various data types and analytical methods for a more holistic approach to lung cancer diagnostics [44].

Emerging trends and future directions
The consistent theme across these studies is the pursuit of higher accuracy, sensitivity, and specificity in lung cancer detection. Future research is likely to focus on enhancing the generalizability of these models across diverse populations, integrating multi-modal data sources, and improving the interpretability of AI systems to better support clinical decision-making. The potential of AI in personalizing lung cancer treatment, predicting patient outcomes, and aiding in real-time decision-making during surgical procedures also represents a significant area for future exploration.

Proposed methodology
The proposed methodology is shown in Fig. 1. The CT scan image dataset is acquired, and preprocessing is applied using rescaling and resizing to meet the CNN's input requirements. Data augmentation, defining CNN layers, feature extraction and optimization, model training, and classification are the major steps, as explained in the next sections.
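
The text does not state which augmentations were applied. As a hedged illustration of the data augmentation step, the sketch below generates two simple label-preserving variants of an image; the choice of a horizontal flip and a 90-degree rotation, and the helper names, are assumptions.

```python
from typing import List

Image = List[List[float]]


def flip_horizontal(image: Image) -> Image:
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate by 90 degrees: the reversed rows become the new columns."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image: Image) -> List[Image]:
    """Return the original plus two transformed copies with the same label."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]


tile = [[1, 2], [3, 4]]
variants = augment(tile)
```

Each variant keeps the original label, so the effective training set grows without new annotation effort.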
Designing an algorithm for lung cancer classification using a combination of DenseNet121 and a Convolutional Neural Network (CNN), along with preprocessing steps, involves several key stages. Here's a structured outline algorithm:

Result
Various evaluation metrics, including training accuracy and validation accuracy, training loss and validation loss, precision, recall, F1-score, and the confusion matrix, are used to evaluate the performance of the proposed model for lung cancer classification using CT scan images. A confusion matrix reports the number of positive cases predicted correctly as $T_p$ (true positives) and the number of positive cases predicted incorrectly by the model as $F_n$ (false negatives). It also includes $F_p$ (false positives) and $T_n$ (true negatives). It proves helpful in assessing the F1-score, accuracy, and recall of a trained model.

Precision is the ratio of correctly predicted true labels to the total number of labels predicted as true by the model:

$$\text{Precision} = \frac{T_p}{T_p + F_p} \tag{1}$$

Recall, also known as sensitivity or true positive rate, is the proportion of correctly predicted true labels out of all the true labels:

$$\text{Recall} = \frac{T_p}{T_p + F_n} \tag{2}$$

The F1-score is calculated as the harmonic mean of precision and recall:

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$

Accuracy represents the percentage of correct predictions, that is, the share of images correctly classified during the testing phase:

$$\text{Accuracy} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n} \tag{4}$$
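
The four metrics defined above follow directly from the confusion-matrix counts. The sketch below computes them for a single binary (or one-vs-rest) problem; the counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that the model finds."""
    return tp / (tp + fn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, not taken from the study.
tp, fp, fn, tn = 90, 10, 30, 70

metrics = {
    "precision": precision(tp, fp),        # 90 / 100
    "recall": recall(tp, fn),              # 90 / 120
    "f1": f1_score(tp, fp, fn),            # equals 2*tp / (2*tp + fp + fn)
    "accuracy": accuracy(tp, tn, fp, fn),  # 160 / 200
}
```

Note that the F1-score can also be written as 2·Tp / (2·Tp + Fp + Fn), which avoids computing precision and recall separately.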

Confusion matrix
In the classification of lung cancer types, specifically Lung Squamous Cell Carcinoma (SCC), Normal lung tissue (N), and Lung Adenocarcinoma (ACA), the significance of a confusion matrix is multifaceted. It serves as a crucial tool for assessing the accuracy and reliability of a diagnostic model. By presenting a detailed breakdown of true and false positives and negatives for each lung tissue type, it allows for a nuanced analysis of the model's performance. This is especially important in medical diagnostics, where the cost of misclassification can be high. For instance, the matrix can highlight whether the model tends to confuse Lung SCC with Lung ACA, or if it accurately distinguishes between cancerous tissues and normal tissues. Such insights are invaluable in refining the model, ensuring precise diagnoses, and informing clinical decision-making. Moreover, in a field where differentiating between various cancer types is vital for treatment planning, the confusion matrix provides an essential quantitative measure of a model's diagnostic capability.
Table 1 is a confusion matrix for a classification model used to distinguish between Lung Adenocarcinoma (ACA), Normal lung tissue (N), and Lung Squamous Cell Carcinoma (SCC). An interpretation of the data follows.
Lung Adenocarcinoma (ACA):
• True Positives (TP): 1493 cases were correctly identified as Lung ACA.
• False Negatives (FN): 19 cases of Lung ACA were incorrectly identified as Lung SCC.
• False Positives (FP): 3 cases of Lung N were incorrectly identified as Lung ACA.
The model shows strong performance in identifying Lung ACA but has some confusion with Lung SCC.
Normal Lung (N):
• TP: 1517 cases were correctly identified as Normal lung tissue.
• FN: 3 cases of Normal lung tissue were incorrectly identified as Lung ACA.
• FP: Only 1 case of Lung ACA was incorrectly identified as Normal lung tissue, and no cases of Lung SCC were.
This indicates an excellent performance in correctly identifying Normal lung tissue with very minimal misclassification.
Lung Squamous Cell Carcinoma (SCC):
• TP: 1458 cases were correctly identified as Lung SCC.
• FN: 9 cases of Lung SCC were incorrectly identified as Lung ACA.
• FP: No cases of Normal lung tissue were incorrectly identified as Lung SCC.
The model is highly accurate in identifying Lung SCC, though it shows some confusion with Lung ACA, as shown in Table 1.
Overall, the model demonstrates high accuracy in classifying these lung conditions, with the most significant confusion occurring between Lung ACA and Lung SCC. The low number of false positives and negatives for each category signifies the model's robustness in distinguishing these lung conditions, which is crucial for accurate diagnosis and treatment planning in medical practice.
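
Per-class precision and recall can be read off a confusion matrix programmatically. In the sketch below, the diagonal counts (1493, 1517, 1458) and the off-diagonal values (19, 3, 1, 9, 0) are the ones quoted in this section, but their placement in the grid is an assumption: it is the arrangement that reproduces the precision and recall figures reported for the model (for example 0.992, 0.999, and 0.987 precision).

```python
LABELS = ["ACA", "N", "SCC"]

# Rows are actual classes, columns are predicted classes.
# Cell placement of the off-diagonal counts is an assumption (see lead-in).
CONFUSION = [
    [1493, 1, 19],   # actual ACA
    [3, 1517, 0],    # actual N
    [9, 0, 1458],    # actual SCC
]


def per_class(matrix, k):
    """One-vs-rest precision and recall for class index `k`."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # rest of the row: missed cases
    fp = sum(row[k] for row in matrix) - tp  # rest of the column: false alarms
    return {"precision": tp / (tp + fp), "recall": tp / (tp + fn)}


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


report = {label: per_class(CONFUSION, k) for k, label in enumerate(LABELS)}
acc = overall_accuracy(CONFUSION)
```

Under this layout, the row sums give the number of test images per actual class and the column sums give the number of predictions per class.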

Evaluation matrix
The performance matrix in Table 2 shows various metrics evaluating a model used for classifying lung cancer types: Lung Adenocarcinoma (ACA), Normal lung tissue (N), and Lung Squamous Cell Carcinoma (SCC).

Precision
High precision values (0.992 for Lung ACA, 0.999 for Lung N, and 0.987 for Lung SCC) indicate that the model has a low rate of false positives. It rarely misclassifies other types of lung tissue as these specific conditions.

Recall
The high recall scores suggest that the model is effective in identifying true cases of each condition, with particularly strong performance in identifying normal lung tissue (0.998) and Lung SCC (0.994).

F1 score
The F1 score, a balance between precision and recall, is high across all categories, indicating that the model is both accurate and reliable.

Specificity
Near-perfect specificity, especially for Lung N (1.000), shows the model's strength in correctly identifying negatives, i.e., correctly recognizing non-cancerous or different cancer types.

False omission rate (FOR) and false discovery rate (FDR)
Both are low, indicating a low rate of false negatives and false positives, respectively.

Negative predictive value (NPV)
High NPV values suggest that when the model predicts a tissue type is not present, it is highly likely to be correct.

False negative rate (FNR)
These are low, meaning the model rarely misses actual cases of each tissue type.

Matthews correlation coefficient (MCC), Jaccard index and Dice coefficient
These are all high (close to 1), signifying excellent overall performance and a strong correlation between the predicted and actual values.

Error rate
At 0.004, this indicates a very low overall rate of misclassifications.

Accuracy
A high accuracy of 0.993 shows that the model correctly classifies the tissue types in a vast majority of cases. In summary, this model demonstrates exceptional performance in distinguishing between Lung ACA, Lung N, and Lung SCC, with high accuracy and reliability, making it a potent tool in medical diagnostics and treatment planning.
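
The remaining metrics in this section are all functions of the same four one-vs-rest counts. The sketch below uses their standard definitions; the example counts for the Normal class are an assumption assembled from the figures quoted in the confusion-matrix discussion (1517 true positives, 1 false positive, 3 false negatives, and the remaining 2979 of 4500 test images as true negatives).

```python
import math


def derived_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard one-vs-rest metrics computed from confusion-matrix counts."""
    return {
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
        "false_omission_rate": fn / (fn + tn),   # equals 1 - NPV
        "false_discovery_rate": fp / (fp + tp),  # equals 1 - precision
        "false_negative_rate": fn / (fn + tp),   # equals 1 - recall
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "jaccard": tp / (tp + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
    }


# Assumed one-vs-rest counts for the Normal class (see lead-in).
normal = derived_metrics(tp=1517, fp=1, fn=3, tn=2979)
```

With these counts the specificity rounds to 1.000, in line with the value reported for Lung N. The Dice coefficient and the Jaccard index always move together, since Dice = 2J / (1 + J).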

Conclusions
This study highlights the advancements and potential of convolutional neural networks (CNNs) in medical imaging, especially for lung cancer identification and classification using CT scans, and the role of DenseNet121 in this context. DenseNet121, as a variant of the Dense Convolutional Network (DenseNet), brings a unique architecture that facilitates more efficient training of neural networks with fewer parameters compared to traditional CNNs. Its design, characterized by connecting each layer to every other layer in a feed-forward fashion, ensures maximum information flow between layers in the network. This aspect is particularly crucial in medical imaging, where capturing intricate details is vital for accurate diagnosis.
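
The dense connectivity described above can be made concrete by tracking feature-map counts. The sketch below assumes the standard published DenseNet121 configuration (dense blocks of 6, 12, 24, and 16 layers, a growth rate of 32, 64 stem channels, and transition layers that halve the channel count); the study does not say whether it modified these values, and the function names are illustrative.

```python
def dense_block_channels(in_channels: int, num_layers: int, growth_rate: int):
    """Input width seen by each layer, plus the block's output width.

    Every layer receives the concatenation of the block input and all earlier
    layer outputs, and contributes `growth_rate` new feature maps.
    """
    widths = [in_channels + i * growth_rate for i in range(num_layers)]
    return widths, in_channels + num_layers * growth_rate


def densenet_feature_width(block_sizes=(6, 12, 24, 16), growth_rate=32,
                           stem_channels=64, compression=0.5) -> int:
    """Channel count of the final feature map before the classifier."""
    channels = stem_channels
    for index, num_layers in enumerate(block_sizes):
        _, channels = dense_block_channels(channels, num_layers, growth_rate)
        if index < len(block_sizes) - 1:  # transition layers sit between blocks
            channels = int(channels * compression)
    return channels


first_block_inputs, first_block_output = dense_block_channels(64, 6, 32)
final_width = densenet_feature_width()
```

Because each layer only adds 32 feature maps and reuses everything computed before it, the network stays comparatively narrow, which is where the parameter efficiency comes from.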
In the realm of lung cancer identification and classification:
• Feature Extraction: DenseNet121 excels in extracting detailed features from CT scan images, which is critical for identifying subtle and early signs of lung cancer. This results in improved detection rates, especially for small nodules that are often challenging to detect.
• Reduced Overfitting: The architecture of DenseNet121 inherently reduces the risk of overfitting, a common challenge in medical imaging tasks due to the limited availability of annotated medical images. This is achieved through its efficient use of parameters and feature reuse, leading to more robust models.
• Enhanced Gradient Flow: The direct connections between all layers ensure that the gradient flow is maintained throughout the network, facilitating deeper model training without the vanishing gradient problem. This is particularly beneficial for learning from complex, high-dimensional medical images like CT scans.
• Improved Accuracy and Efficiency: DenseNet121 has demonstrated superior performance in terms of accuracy in various studies for lung cancer detection and classification. Moreover, its efficient use of parameters makes it computationally less intensive, enabling faster training and inference, which is crucial in clinical settings where time is often a critical factor.
Therefore, the integration of DenseNet121 into CNN-based approaches for lung cancer detection and classification using CT scans not only enhances the performance of these models but also signifies a notable advancement in leveraging deep learning for medical diagnostics, potentially leading to earlier and more accurate lung cancer detection.

Table 1 Confusion matrix

Table 2 Evaluation matrix for classification