Detection of cotton leaf curl disease’s susceptibility scale level based on deep learning

Cotton, a crucial cash crop in Pakistan, faces persistent threats from diseases, notably the Cotton Leaf Curl Virus (CLCuV). Detecting these diseases accurately and early is vital for effective management. This paper offers a comprehensive account of the process involved in collecting, preprocessing


Introduction
Plant diseases strongly affect food production. They reduce crop yields, cause economic losses, and at times even obstruct agricultural activities. As emphasized by [1], effective disease management and control are essential to limit output losses and keep agriculture sustainable. This underscores the need for ongoing crop monitoring paired with swift and precise disease detection. As the global population grows, the demand for food production rises with it, as articulated by the Food and Agriculture Organization (FAO) [2]. This demand must be reconciled with the need to preserve natural ecosystems, which calls for environmentally friendly farming practices. The challenge is multifaceted: sustaining the nutritional value of food while ensuring its widespread safety [3][4][5][6][7]. Meeting it requires new scientific approaches for diagnosing leaf diseases and managing crops, and these technologies must also extend to large-scale ecosystem surveillance. Central to this effort is the accurate identification of crop diseases [8]. Manual techniques, as described by Miller et al., fall short when it comes to covering extensive crop areas and providing early insights for decision-making [9]. Consequently, researchers have pursued automated and practical solutions for disease detection. Deep Learning (DL)-based models have overcome many limitations of traditional classification methods [4][5][6][7] and now represent the state of the art in plant disease detection. DL, a technique with a track record of success across diverse domains [10], abstracts data at a high level through a series of transformations [11]. This approach opens new possibilities for safeguarding crops and feeding a growing global population.
Agriculture is a vital pillar of Pakistan's economy, contributing around 19% of its Gross Domestic Product (GDP) and engaging approximately 38% of its labor force [12]. This economic dependency underscores the pivotal role of crop health in sustaining food security and production. Cotton is a principal cash crop in Pakistan: it adds 0.6% to the GDP and contributes 2.4% to the value addition within the agricultural sector [13]. Termed "white gold" in developing economies, cotton depends for its successful growth on both environmental factors and prudent human management practices.
However, diseases like the Cotton Leaf Curl Virus (CLCuV) pose serious threats to cotton yields. Timely and accurate disease detection is crucial for implementing appropriate management strategies. A noticeable gap exists in the prevailing methods for detecting Cotton Leaf Curl Disease (CLCuD), which require significant manual intervention. This calls for a comprehensive, automated framework that can detect the disease across multiple scales, remain robust in challenging contexts, and give timely guidance for interventions that mitigate economic losses. Deep learning and machine learning have significantly advanced the detection of plant leaf diseases, notably CLCuD [14]. These techniques use algorithms that discern intricate patterns within extensive datasets, which makes accurate and effective detection models possible. The current work aims to implement an optimized deep learning model for identifying and examining cotton plant diseases. It uses a particular class of deep learning (DL) model, the Convolutional Neural Network (CNN), which extends traditional Artificial Neural Networks (ANN) by adding more "depth" to the network together with convolutions that allow it to be applied successfully to a variety of image-related problems [15].
In the domain of CLCuD detection, CNNs and analogous deep learning architectures can analyze leaf images, reveal markers indicative of CLCuD, and discriminate between healthy and afflicted leaves. CLCuD detection also draws on machine learning methods such as Random Forest, Support Vector Machines (SVM), and K-Nearest Neighbors (KNN). These algorithms examine foliar data to find patterns characteristic of CLCuD, enabling leaves to be classified as healthy or diseased. Notably, [16] curated a dataset of cotton leaf images, applied noise reduction and contrast enhancement, and then extracted texture features using the Grey Level Co-occurrence Matrix (GLCM) and shape features using the Histogram of Oriented Gradients (HOG). An SVM classifier trained on these features proved effective. Similarly, [17] preprocessed a cotton leaf image dataset through resizing and grayscale conversion and used the VGG-16 architecture with transfer learning and data augmentation. The resulting model was assessed against evaluation metrics and compared with other deep learning models. Their model performed better, a result further supported by a sensitivity analysis across disease categories. This research provided a methodological blueprint for robustly identifying and classifying cotton leaf diseases with deep learning models. At present, CLCuD detection concentrates on macroscopic leaf attributes, while the micro and nano scales of disease progression are often sidelined. In the final phase of our work, the image dataset is classified by the CNN model using its calculated weights. The CNN is widely used in image recognition and classification, as is well established in the literature [4][5][6][7][18]. This research covers the design and implementation of a vision model with seven layers: conv2d_2, max_pooling2d_2, conv2d_3, max_pooling2d_3, flatten_1, dense_2, and dense_3. The model is designed to separate susceptibility from resistance to CLCuD on the basis of visible symptoms. This study also examines how to build a robust dataset of cotton leaf images, investigates preprocessing and feature extraction techniques, and considers potential uses of such datasets in disease diagnosis and agricultural research.
The evaluation covers training accuracy, testing accuracy, training loss, and testing loss. On our in-house dataset, the proposed model reaches a training accuracy of 94.57% and a testing accuracy of 99%.
The contributions of this research include:
• Recognizing the many kinds of cotton leaf diseases and their frequency.
• Recognizing the connection between environmental factors and cotton leaf diseases.
• Proposing an automated CLCuD detection method that accurately classifies susceptibility scale levels from cotton leaf images.
• Capturing multiple images to build a self-collected dataset that classifies CLCuD based on visual symptoms.
• Applying pre-processing steps to the captured images: resizing, normalization, augmentation, and a cleaning step in which blurred images are removed and the image background is replaced with a neutral color for better visibility.
• Applying preprocessing approaches to extract features from the images.
• Implementing an optimized deep learning model that predicts CLCuD susceptibility levels on the self-collected dataset as well as the downloaded dataset.
Researchers may use this knowledge to create new strategies for preventing and controlling cotton leaf diseases, and farmers can use it to recognize and manage diseases in their fields. Fig. 2 shows the step-by-step process of our model: acquiring the input image, pre-processing, segmentation, feature extraction, and classification.

Literature review
Numerous studies have sought methods for detecting crop diseases in agricultural settings.
Abade et al. reviewed CNN algorithms for the detection of plant diseases. Their survey of 121 papers published from 2010 to 2019 found PlantVillage to be the most widely used dataset and TensorFlow the most frequently used framework [19]. Dhaka et al. outlined the CNN models used to identify plant diseases from leaf images. They compared CNN models, pre-processing techniques, and frameworks, and also examined the datasets and performance metrics used to assess the models [20]. Nagaraju et al. reviewed datasets, pre-processing techniques, and Deep Learning methods for plant disease diagnosis [21]. Analyzing 84 papers, they found that many DL methods need suitable pre-processing techniques to reach their full potential. Kamilaris et al. surveyed DL approaches to a range of agricultural problems and found that DL methods outperformed conventional image processing techniques [22].
Fernandez-Quintanilla et al. focused on weed control in agricultural fields and surveyed both remote and ground-based weed-monitoring technologies. They concluded that weed monitoring is essential for weed control and that, in the future, sensor data stored in the public cloud could be used judiciously for crop protection [23]. Lu et al. evaluated the strengths and weaknesses of CNNs for plant disease classification and identified the need for more complex datasets [24]. Golhani et al. reviewed hyperspectral data for leaf disease identification, highlighting its challenges and prospects. Their review also covered NN approaches and briefly addressed SDI development [25]. Bangari et al. reviewed CNNs in the context of potato leaf diseases and concluded that CNNs perform well in disease detection, achieving high accuracy [26].

Fig. 2 Flow chart of this study
Disease detection methods have changed with the emergence of the dynamic weighted layering model, a promising Deep Learning (DL)-based technique for detecting Cotton Leaf Curl Disease (CLCuD). The model uses multiple layers of nodes with adaptable weights, is tailored to optimize performance, and has attracted considerable attention across studies.
Pechuho, Khan, and Kalwar [27] present a machine learning-based approach to identifying cotton crop diseases. Their work covers dataset compilation, preprocessing, and a CNN strengthened by transfer learning, and it achieves better classification than alternative Machine Learning (ML) techniques across various evaluation metrics.
Tripathy [16], by contrast, curates a dataset of cotton leaf images divided into three categories. Texture features extracted with the Grey Level Co-occurrence Matrix (GLCM) and shape features derived from the Histogram of Oriented Gradients (HOG) serve as the main features, and a Support Vector Machine (SVM) acts as the classifier. Magsi, Shaikh, Shar, Arain, and Soomro [28] use a dataset of 1600 images of CLCuD-infected cotton plants. Their model relies on Information Processing (IP) techniques to extract color and texture attributes, and a deep CNN handles decision-making and classification. The resulting framework identifies the severity of CLCuD quickly, which supports timely intervention and helps reduce potential losses.
Zhu et al. [29] propose a method for recognizing crop diseases that combines a Transformer encoder with center loss optimization. The combination improves feature analysis and sharpens the distinction between diseases, overcoming obstacles such as the recognition of focal disease spots and the confusion caused by similar-looking diseases. Li, Wang, and Hu [30] introduce a CNN architecture with three modules: feature extraction, classification, and augmentation. Using residual connections and spatial pyramid pooling, the model consistently outperforms conventional ML techniques across a comprehensive set of evaluation metrics.
Similarly, Amin, Darwish, Hassanien, and Soliman [31] build their method on feature abstraction and classification. They apply data augmentation to rebalance the input data distribution and use a pre-trained VGG-16 model for feature extraction. The resulting model undergoes careful hyperparameter tuning and classifies accurately and effectively.
Naeem et al. [17] address dataset compilation, preprocessing, and classification for cotton leaf disease identification. Using the VGG-16 architecture with transfer learning and data augmentation, their model outperforms competing deep learning models. Their study concludes with a sensitivity analysis that assesses the model's effectiveness across disease categories. Taken together, this literature presents diverse methods that share the goal of robust and accurate disease detection in agriculture. Table 1 compares the accuracy results reported in these papers.

Methodology
Our proposed technique offers a systematic method for rapidly identifying the susceptibility scale levels of Cotton Leaf Curl Disease (CLCuD) from image files. In the process, the Cotton Leaf Curl Virus (CLCuV) exhibits a variety of modifications that are categorized by looking at DNA patterning. Figure 3 provides a schematic representation of the steps. In the first stage, a wide range of images of cotton plants is captured to create a self-curated dataset. The second stage involves obtaining a publicly available cotton leaf disease dataset from the Kaggle platform. The third stage extracts salient features from the image dataset. By using a variety of

Data collection
The images constituting the cotton leaf disease dataset were gathered from diverse cotton fields located in Multan, Pakistan. Specifically, the following locations were chosen:
• Muhammad Nawaz Sharif University of Agriculture Multan (35% of the images).
These locations were carefully selected so that the dataset represents a range of cotton cultivation conditions and levels of disease severity. The images were captured with a DSLR camera configured with the following settings:
- Image formats: JPEG and PNG
- ISO: 100
- Shutter speed: 1/100 second
- Aperture: f/8
These settings were chosen to yield high-quality images with a consistent, standardized appearance across the dataset.

Dataset description
The datasets used in this study are described in Tables 2, 3 and 4. Table 2 details the self-collected dataset, with the number of training and testing images in each class:
• Fully Resistant (FR): images of cotton leaves entirely impervious to CLCuV. FR cotton leaves exhibit no indications of CLCuV infection.
• Partially Resistant (PR): images of cotton leaves with partial resistance to CLCuV. PR cotton leaves may display mild CLCuV symptoms, but they can recover and yield normally.
• Healthy (H): images of robust and uninfected cotton leaves, which maintain their natural green color and regular shape.
• Partially Susceptible (PS): images of cotton leaves with partial susceptibility to CLCuV. PS cotton leaves show moderate CLCuV symptoms, which may reduce their yield potential.
• Fully Susceptible (FS): images of cotton leaves highly susceptible to CLCuV. FS cotton leaves exhibit severe CLCuV symptoms, often resulting in crop failure.
Fig. 3 Proposed framework
Table 3 shows the symptoms in the self-collected dataset, and Table 4 shows the number of training and testing images in each class of the downloaded dataset.
Table 3 shows that the PS class has 233 training and 52 testing images; in this class the leaf color becomes darker. Leaves that are completely curled into a cup shape, upward or downward, belong to the FS class, which contains 277 training and 53 testing images.
Table 4 describes the downloaded dataset in detail. It contains two classes, Curl-Virus and healthy.

Preprocessing and feature extraction
After the pictures were shot, they were cleaned to get rid of blurry pictures and change the background color to something neutral.Various image processing methods were chosen to reduce noise and make the cotton leaves more visible in the photos.

Image resizing
To reduce image dimensions, a resizing function from a computer vision library in Python is used. All cotton leaf images are resized to a standard size of 256 by 256 pixels (width × height), as shown in Fig. 4. In the resizing notation, $R$ represents the resized image, $Re$ is the resizing operation applied to the original image, and $I$ stands for the original image. $W_O$ and $H_O$ are the width and height of the original image, respectively, and $W_r$ and $H_r$ are the desired width and height of the resized image.
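The resizing step can be illustrated with a minimal sketch. The text does not name the computer vision library or the interpolation method, so the example assumes nearest-neighbour sampling on a plain nested-list image; `resize_nearest` and the toy 4 × 4 image are hypothetical and serve only as an illustration. In terms of the symbols above, the function plays the role of Re, mapping an image I of size W_O × H_O to an image R of size W_r × H_r.

```python
def resize_nearest(image, new_width, new_height):
    """Resize a row-major image (list of rows) by nearest-neighbour sampling.

    Assumption: the interpolation method is not stated in the text;
    nearest-neighbour is used here only because it is the simplest to show.
    """
    old_height = len(image)
    old_width = len(image[0])
    resized = []
    for r in range(new_height):
        # Map the target row back to a source row (floor of the scaled index).
        src_r = min(old_height - 1, r * old_height // new_height)
        row = []
        for c in range(new_width):
            # Same mapping for the column index.
            src_c = min(old_width - 1, c * old_width // new_width)
            row.append(image[src_r][src_c])
        resized.append(row)
    return resized


# Toy 4 x 4 grayscale image (hypothetical data), reduced to 2 x 2.
original = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
small = resize_nearest(original, new_width=2, new_height=2)

# The same function also produces the 256 x 256 target size used in the text.
standard = resize_nearest(original, new_width=256, new_height=256)
```

In practice a library routine with bilinear or area interpolation would usually be preferred for photographs; the sketch only makes the input and output sizes explicit.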

Image normalization
The intensity values of the resized cotton leaf images were aligned with a predetermined range through normalization. This operation was carried out in Python on the Google Colab platform. Normalization rescales the intensity values to the desired range, which makes the data more uniform and better suited to subsequent analysis. The normalization equation is:
$$X_N = \frac{X - X_{mn}}{X_{mx} - X_{mn}}$$
where $X$ represents the original data value, $X_N$ is the normalized value of $X$, $X_{mn}$ is the minimum value of the data, and $X_{mx}$ is the maximum value of the data. This equation transforms the data value $X$ into a normalized value within the range of 0 to 1, based on its relationship with the minimum and maximum values of the dataset, as displayed in Fig. 5.
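As a minimal sketch of this min-max normalization, the function below rescales a flat list of pixel intensities into the range 0 to 1. The function name, the sample intensities, and the handling of a constant image are assumptions made for illustration; the text does not specify them.

```python
def min_max_normalize(values):
    """Scale a flat list of pixel intensities into the range [0, 1]."""
    x_min = min(values)
    x_max = max(values)
    if x_max == x_min:
        # A constant image has no range; mapping every pixel to 0.0 is a
        # choice made for this sketch, not something the text specifies.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


pixels = [0, 51, 102, 255]  # hypothetical 8-bit intensities
normalized = min_max_normalize(pixels)
```

For 8-bit images whose minimum is 0 and maximum is 255, this reduces to dividing every pixel by 255.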

Image augmentation
Effective model training requires a larger amount of data, and data augmentation addresses this need. Augmentation creates new data instances from the existing ones, deliberately expanding the size of the dataset. The operations applied to the images include rotation, mirror reflection, spatial cropping, and the addition of perturbations or distortions. Adding these variations gives the model a more diverse set of training examples, which improves its robustness and its performance during training, as shown in Fig. 6. The augmentation can be written as:
$$D_{i,j} = A_j(D_i)$$
where $D_{i,j}$ denotes the instance obtained by applying the j-th augmentation operation to the i-th original instance $D_i$, and $A_j$ is the j-th augmentation operation. In this manner, the augmented dataset $D'$ is constructed as a diverse collection of transformed instances that enriches the dataset and improves model training.
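The construction of the augmented dataset can be sketched as follows. Only flips and a 90-degree rotation are shown; the specific operations, their parameters, and the decision to keep the original instances alongside the transformed ones are assumptions of this sketch, and all function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in reversed(image)]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*reversed(image))]


def augment(dataset, operations):
    """Return the originals followed by A_j(D_i) for every image and operation."""
    augmented = list(dataset)  # assumption: originals are kept in D'
    for image in dataset:
        for operation in operations:
            augmented.append(operation(image))
    return augmented


dataset = [[[1, 2], [3, 4]]]  # one hypothetical 2 x 2 image
operations = [flip_horizontal, flip_vertical, rotate_90_clockwise]
augmented = augment(dataset, operations)
```

With one original image and three operations, the augmented set holds four instances; in general it grows to the number of originals times one plus the number of operations.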

Proposed model
Specifically, we employed a CNN model. Its purpose was to assess the outcomes generated from both datasets.

CNN model
A common class of ANN used in image processing and recognition is the CNN. Numerous advances in CNN architectures have been presented since the 1998 release of LeNet-5 [36]. Before the development of DL for computer vision, learning relied on extracting informative variables, called features, and these techniques require a good deal of prior knowledge of image processing. With the advent of CNNs [37], image processing was revolutionized and manual feature extraction was no longer needed. CNNs are increasingly used for image classification, segmentation, face recognition, and object recognition. Many organizations have used them effectively in a variety of fields, including health, the web, and postal services. Images, video, sound, speech, and natural language can all be fed into a CNN [38,39]. A CNN is formed by stacking convolution, pooling, ReLU correction, and fully connected layers (see Fig. 7): it starts with a convolution layer and progresses through pooling, ReLU correction, and finally a fully connected layer [40][41][42][43].
Fig. 6 Sample of resized image and normalized image
The proposed CNN model is composed of seven layers, namely conv2d_2, max_pooling2d_2, conv2d_3, max_pooling2d_3, flatten_1, dense_2, and dense_3. Within Deep Learning, CNNs have become prominent for their effectiveness in image classification, object detection, and image segmentation. A CNN extracts features from input images with convolutional layers, subsamples them with pooling layers, and classifies them with fully connected layers. This architecture is often complemented by normalization, activation, and regularization layers, which both improve performance and mitigate overfitting. These architectural attributes are illustrated in Fig. 7. CNNs have driven remarkable progress in computer vision and are widely used in areas such as autonomous vehicle navigation, medical image analysis, and facial recognition.
Input layer
The input to the proposed model consists of two sets of images, allocated to the training and testing datasets. These input images undergo a preliminary preprocessing phase that ensures their compatibility with the CNN architecture. This preprocessing entails several transformations, including the establishment of a consistent image size, potential conversion to grayscale or RGB color formats, and the normalization of pixel values to a standardized range of 0 to 1, as shown in Fig. 8. This preparatory stage harmonizes the images for subsequent processing within the CNN.
Convolutional layer
Our proposed model architecture integrates two convolutional layers, conv2d_2 and conv2d_3, as shown in Fig. 9. These layers extract meaningful features from the input images during training. The first layer, conv2d_2, receives the input images and passes its output to the next stage. It transforms the input into an output matrix of dimensions (64, 64, 32) and has 896 parameters. The next convolutional layer, conv2d_3, takes its input from the preceding max_pooling2d_2 layer. Its input shape is (32, 32, 32), its output shape is (32, 32, 32), and it has 9248 parameters. In the mathematical formulation of the convolutional layers, Y is the output feature map, b is the bias, F[l,m,n,k] denotes the image features, and Fm is the input feature map.
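The shapes and parameter counts reported for the two convolutional layers can be checked with a short calculation. The kernel size, filter count, input channels, and padding mode are not stated in the text; the sketch assumes 3 × 3 kernels, 32 filters per layer, a three-channel 64 × 64 input, and "same" padding, because these values reproduce the reported figures. The function names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters of a 2D convolution: weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def conv2d_same_output_shape(height, width, filters, stride=1):
    """Output shape with 'same' padding: spatial size is ceil(input / stride)."""
    out_h = -(-height // stride)  # ceiling division
    out_w = -(-width // stride)
    return (out_h, out_w, filters)


# Assumed hyperparameters: 3x3 kernels, 32 filters, RGB input of 64 x 64.
conv2d_2_params = conv2d_params(3, 3, in_channels=3, filters=32)
conv2d_3_params = conv2d_params(3, 3, in_channels=32, filters=32)
conv2d_2_shape = conv2d_same_output_shape(64, 64, filters=32)
conv2d_3_shape = conv2d_same_output_shape(32, 32, filters=32)
```

Under these assumptions the counts come out to 896 and 9248 parameters, with output shapes (64, 64, 32) and (32, 32, 32), matching the values given for conv2d_2 and conv2d_3.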
Max pooling layer
Two max pooling layers, max_pooling2d_2 and max_pooling2d_3, reduce the spatial dimensions of the image representation. The second layer of our CNN model, max_pooling2d_2, receives the conv2d_2 output matrix and reduces its dimensionality: the input matrix of dimensions (64, 64, 32) becomes an output matrix of dimensions (32, 32, 32), as shown in Fig. 10. The fourth layer, max_pooling2d_3, sits between the conv2d_3 and flatten_1 layers. It operates on the input matrix of shape (32, 32, 32) produced by the preceding layer and yields an output matrix of dimensions (16, 16, 32), which is passed to the following layer for further processing. In the mathematical model of the max pooling layer (5), Ft is the output feature map, Mp is the max-over-pool operation, Fp is the input feature map, and isd and jsd are the strides along the two spatial axes (istride and jstride).
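Max pooling can be sketched on a single channel as follows. The pool size and strides are not stated in the text; a 2 × 2 window with stride 2 is assumed here because it halves each spatial dimension, as in the reduction from (64, 64, 32) to (32, 32, 32). The function name and the sample values are hypothetical.

```python
def max_pool_2d(feature_map, pool=2, stride=2):
    """Max pooling over one channel given as a list of rows."""
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for top in range(0, height - pool + 1, stride):
        row = []
        for left in range(0, width - pool + 1, stride):
            # Collect the pool x pool window and keep its largest value.
            window = [
                feature_map[top + di][left + dj]
                for di in range(pool)
                for dj in range(pool)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


channel = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
pooled = max_pool_2d(channel)

# Shape check on a zero-filled 64 x 64 channel.
big = max_pool_2d([[0] * 64 for _ in range(64)])
```

Pooling acts on each of the 32 channels independently, so the channel count is unchanged while height and width are halved.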
Flatten layer
flatten_1, the fifth layer of the model, reshapes the input matrix received from max_pooling2d_3 into a one-dimensional array that is passed to the following dense layer. For an input matrix of dimensions (16, 16, 32), it produces an array of dimensions (8192). Its role in the network architecture is to convert the input tensor into a one-dimensional vector. The equation for the flatten_1 layer can be written as:
$$y = rep(I, (bs, -1))$$
where y is the output, I is the input value, rep is the reshape operation, and bs is the batch size.
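A minimal sketch of the flattening step for a single sample is given below; the batch dimension is omitted, and the zero-filled tensor merely stands in for the (16, 16, 32) output of the last pooling layer. The function name and the row-major ordering are assumptions of the sketch.

```python
def flatten(tensor):
    """Flatten a (height, width, channels) nested list into a 1-D list."""
    return [value for row in tensor for pixel in row for value in pixel]


# Zero-filled stand-in for the (16, 16, 32) pooling output.
height, width, channels = 16, 16, 32
tensor = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
vector = flatten(tensor)
flattened_length = len(vector)

# Small example showing the element order (row, then column, then channel).
tiny = flatten([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
```

The length of the result is 16 × 16 × 32 = 8192, which is the array size reported for flatten_1.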

Dense layer
The CNN architecture incorporates two dense layers, dense_2 and dense_3. Serving as the sixth layer, dense_2 has 1048704 parameters. It receives the input array from flatten_1, which has dimensions (8192), and reduces it to an output array of dimensions (156). This array, in turn, is propagated to the subsequent layer. In a parallel fashion, the ultimate and terminal layer, dense_3, receives this array. The dense layer computation is:
$$Y = a(W(i, wm) + bv) \quad (8)$$
where Y is the output, a is the activation, W denotes the dot values, i is the input, wm is the weight matrix, and bv is the bias vector.
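The dense-layer relation that the paper numbers as Eq. (8), Y = a(W(i, wm) + bv), can be sketched with W read as a dot product. The activation function is not named in the text, so ReLU is assumed here, and the tiny layer at the end uses hypothetical weights. The sketch also computes parameter counts for an 8192-element input: the reported total of 1048704 corresponds to 128 output units, whereas the stated output dimension of 156 would give a different total, so one of the two reported figures may contain a typographical error.

```python
def dense_params(inputs, units):
    """Trainable parameters of a dense layer: weights plus one bias per unit."""
    return inputs * units + units


def relu(values):
    """ReLU activation (an assumption; the text does not name the activation)."""
    return [max(0.0, v) for v in values]


def dense_forward(inputs, weight_matrix, bias_vector, activation):
    """Compute Y = a(W(i, wm) + bv), with W taken to be the dot product."""
    pre_activation = [
        sum(x * w for x, w in zip(inputs, unit_weights)) + b
        for unit_weights, b in zip(weight_matrix, bias_vector)
    ]
    return activation(pre_activation)


params_128_units = dense_params(8192, 128)
params_156_units = dense_params(8192, 156)

# Tiny hypothetical layer: 3 inputs, 2 units; weights are stored per unit.
y = dense_forward(
    inputs=[1.0, 2.0, 3.0],
    weight_matrix=[[0.5, -1.0, 0.5], [-1.0, 0.0, 0.0]],
    bias_vector=[0.5, 0.0],
    activation=relu,
)
```

Storing the weights per unit keeps the dot product explicit; a matrix library would express the same computation as a single matrix-vector product followed by the activation.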
Output
The output is mapped across the susceptibility-to-resistance levels of CLCuD, defining discrete data classes that correspond to distinct scale levels.

Experiments and result
The experiments were run on a machine with a 238 GB Solid State Disk, 12 GB of RAM, an Intel(R) Core(TM) processor, and Windows 10 Pro as the operating system. The experimentation environment also used the Google Colab platform with a Google Colab GPU and the Python programming language.

Evaluation matrix
Fig. 10 Pooling function of CNN model
Common performance metrics for assessing machine learning models include accuracy and loss. Accuracy is widely used as a standard evaluation measure:
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \quad (9)$$
The count of accurate predictions (TP + TN) represents the instances where the model's output coincided with the true expected result. The total count of predictions (TP + FP + TN + FN) covers all instances to which the model was applied. The form of the loss metric depends on the nature of the problem under consideration. For classification problems, a commonly used loss function is cross-entropy loss, which is defined as:
$$L = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(yp_i) + (1 - y_i) \log(1 - yp_i) \right] \quad (10)$$
where y_pred = yp. Here N is the number of instances analyzed, y denotes the true label, taking binary values of either 0 or 1, and y_pred is the predicted probability of the positive class.
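Both metrics can be illustrated with a short sketch. It assumes the standard definitions: accuracy as correct predictions over all predictions, (TP + TN) / (TP + FP + TN + FN), and the binary form of cross-entropy over N instances. The confusion counts, labels, and probabilities are hypothetical, and the small `eps` clamp is an implementation detail added here to avoid taking the logarithm of zero.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that match the true label."""
    return (tp + tn) / (tp + tn + fp + fn)


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy; eps guards against log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Hypothetical confusion counts and predictions, for illustration only.
acc = accuracy(tp=45, tn=40, fp=5, fn=10)
loss = binary_cross_entropy([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3])
```

With these counts, 85 of 100 predictions are correct, so the accuracy is 0.85. For the five-class self-collected dataset a categorical form of cross-entropy would presumably be used instead; the binary form is shown because it matches the description of y as a 0 or 1 label.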

Comparison of accuracy
This section compares the accuracy obtained on the self-collected dataset with that obtained on the downloaded dataset.
Table 5 reports the training and testing accuracy for both datasets. The self-collected dataset yields a training accuracy of 94.57% and a testing accuracy of 99%. The downloaded dataset yields a higher training accuracy of 97.49% but a lower testing accuracy of 89.71%. Figure 11 plots the training and testing accuracy for both datasets.
In Fig. 11, the first column shows the accuracy curves for the self-collected dataset and the second column shows those for the downloaded dataset. The x-axis spans epochs 0 to 20 of the custom CNN model, and the y-axis spans accuracy from 0% to 100%.
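The reported figures can be compared through the gap between training and testing accuracy. The sketch below uses only the four percentages quoted in the text; the dictionary layout and variable names are assumptions made for illustration.

```python
# Accuracy values (percent) as quoted in the text for Table 5.
reported = {
    "self-collected": {"train": 94.57, "test": 99.0},
    "downloaded": {"train": 97.49, "test": 89.71},
}

# Train-minus-test gap in percentage points; a large positive gap
# suggests the model fits the training data better than unseen data.
gaps = {
    name: round(values["train"] - values["test"], 2)
    for name, values in reported.items()
}

for name, gap in gaps.items():
    print(f"{name}: train - test = {gap:+.2f} points")
```

On these numbers the downloaded dataset shows a positive gap of several points, while the self-collected dataset tests higher than it trains.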

Comparison of loss
The CNN model records training and testing loss for both the self-collected and downloaded datasets. Table 6 reports these loss values.
The self-collected dataset shows a training loss of 16.38% and a lower testing loss of 6.56%. In contrast, the downloaded dataset shows a lower training loss of 8.36% but a much higher testing loss of 52.67%. Figure 12 plots training and testing loss for both datasets. The first column shows the loss for the self-collected dataset and the second column shows the loss for the downloaded dataset. The x-axis shows epochs from 0 to 20 for the custom CNN model, and the y-axis shows loss from 0 to 100 percent. Blue lines represent training loss and orange lines represent testing loss. For the self-collected dataset, loss declines as epochs increase while accuracy rises, as shown in Fig. 13.

The experimental outcomes are as follows:
• Preprocessing, comprising background removal, data augmentation, and image resizing, improves the quality of the dataset.
• Expert annotation of the self-collected dataset, guided by careful assessment of disease symptoms, makes it possible to establish a susceptibility scale level mapping.
• The self-collected dataset yields better model performance than the downloaded dataset.
Recent advancements across various scientific and engineering disciplines are well captured by a series of innovative studies. Zhou et al. [44] and Qi et al. [45] have made significant strides in remote sensing and image processing. Zhou et al. [44] developed a novel method for LiDAR hidden echo signal decomposition, while Qi et al. [45] enhanced image quality through a brightness correction algorithm. Lin et al. [46] demonstrated the application of AI in practical fields like construction and infrastructure, focusing on pavement anomaly detection. The study by Y. et al. [47] illustrates the potential of data analytics in knowledge graph completion, enhancing information systems.
In the realm of electronics and communication, Jiang and Li [48] improved interference cancellation systems, and Wang et al. [49] developed a biosensor system for explosive detection, showcasing interdisciplinary research. Liu et al. [50] and Dang et al. [51] leveraged AI in creative and analytical applications, with the former creating photo-realistic images from sketches and the latter developing a feature matching method based on convolutional neural networks. Healthcare technology saw advancements with the surgical instrument localization algorithm by Siyu Lu et al. [52], indicating progress in surgical precision. In environmental applications, Cheng et al. [53] used machine learning for vegetation mapping, and Zheng et al. [54] used it for agricultural monitoring. Tao et al. [55] applied AI in defect recognition, while Zhou et al. [56] and another study by Zhou et al. [57] focused on enhancing signal detection in remote sensing. Object and vehicle detection were advanced by Zhang et al. [58] and Li et al. [59], respectively.
Predictive maintenance was addressed by Zhao et al. [60] in aero-engine life prediction. Control systems for multi-agent and autonomous underwater vehicles were explored by Hu et al. [61] and Chen et al. [62]. Liao et al. [63] and Ding et al. [64] applied AI to detect fake news and taxi fraud. Zhang et al. [65] explored cloud management systems, and network resource allocation and wireless communication were the focus of Xuemin et al. [66] and Lyu et al. [67]. Traffic anomaly detection by Xu et al. [68] and transportation detection by Chen et al. [69] show the breadth of technology's impact, while Ma et al. [70] enhanced pavement assessment techniques. Lastly, Jin et al. [71] created a dataset for image quality assessment, reflecting the diversity and depth of current technological advancements across sectors.
This study uses deep learning models, specifically Convolutional Neural Networks (CNN), to detect Cotton Leaf Curl Disease (CLCuD), and contributes to agricultural technology for crop disease management. By applying deep learning to a comprehensive dataset of cotton leaf images, the research addresses a critical challenge in agriculture, especially in regions with limited resources. The 99% accuracy achieved by the CNN model indicates the potential of AI to improve disease detection and to enable more efficient and timely interventions. This approach can help improve cotton yields by mitigating the impact of diseases like CLCuV, and it paves the way for integrating advanced technologies in agriculture. The methodology and findings could be instrumental in developing automated, accurate, and accessible disease detection systems, thereby enhancing crop management and supporting sustainable agricultural practices globally.

Fig. 13 Comparative graph of accuracy of both datasets after augmentation

Conclusion
Plant growth and development depend on the interplay of environmental factors such as humidity, temperature, water availability, sunlight, and nutrition. Crop diseases disrupt this balance and pose a significant threat to agricultural output and global food security, and in many parts of the world disease detection is further hampered by insufficient infrastructure. The centerpiece of this project is a carefully prepared dataset on cotton leaf disease, which opens up several possibilities.

Using AI for early detection: machine learning models trained on this dataset can serve as field sentinels that quickly and precisely identify the symptoms of cotton leaf diseases. Integrated into mobile apps and web services, such models can provide farmers with real-time assistance against crop diseases.

Research: beyond practical uses, the dataset is a resource for scientific work. Researchers can use it to study cotton leaf diseases and to clarify the connections between disease frequency and environmental conditions, opening the door to new ways of safeguarding crops and food security.

Education and empowerment: the dataset can underpin educational resources, such as pamphlets and posters, that teach farmers about cotton leaf diseases and their detection and management. With this knowledge, farmers can better protect their crops and harvests.

Cotton Leaf Curl (CLC) disease affects cotton plants and is caused by the Cotton Leaf Curl (CLC) geminivirus. Deep learning is emerging as a strong tool across many areas of agricultural research, and here CNN models are used to classify images of cotton leaves. Two datasets are used: a downloaded dataset of Cotton Leaf Curl Disease-related photos and a self-collected dataset, with 842 and 1349 images, respectively. Augmentation techniques strengthen the training of the CNN. The results are reported as training and testing accuracy and loss. On the self-collected dataset, the CNN achieves 99% accuracy. Much work remains. Future steps include covering a wider range of ecological conditions, expanding the dataset to various regions, and improving the model. This dataset is a valuable resource for researchers and developers working on cotton leaf disease detection, prevention, and management, thanks to its balanced distribution across diseased and healthy leaves and its open accessibility.
The self-collected dataset contains 252, 132, 350, 285, and 330 instances for the classes Fully Resistant (FR), Partially Resistant (PR), Healthy (H), Partially Susceptible (PS), and Fully Susceptible (FS), respectively, as depicted in Fig. 1. In total it comprises 1349 images across these five classes (FS, PS, H, PR, FR), of which 1052 are used for training and 297 for testing. The publicly available dataset is binary: one class contains 418 instances of the Curl-Virus category and the other contains 426 instances of the Healthy category. In the next phase, we apply a range of preprocessing techniques to extract features from the image dataset. This feature extraction precedes the use of two distinct Deep Learning (DL) models, specifically CNNs, for classifying Cotton Leaf Curl Disease (CLCuD).
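The dataset figures quoted above can be cross-checked with simple arithmetic. This is a small sketch using only the counts stated in the text; the variable names are hypothetical.

```python
# Per-class image counts of the self-collected dataset, as stated in the text.
class_counts = {"FR": 252, "PR": 132, "H": 350, "PS": 285, "FS": 330}
train_images, test_images = 1052, 297

total = sum(class_counts.values())

# Share of each class in percent, to see how balanced the classes are.
shares = {name: round(100 * n / total, 1) for name, n in class_counts.items()}

# Fraction of all images held out for testing.
test_fraction = test_images / total

print("total images:", total)
print("class shares (%):", shares)
print(f"test fraction: {test_fraction:.3f}")
```

The class counts sum to the stated total, and the train and test counts add up to the same figure, with roughly one image in five held out for testing.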

Fig. 1
Fig. 1 CLCuV susceptibility scale level

Preprocessing approaches such as image resizing, image normalization, and image augmentation make this extraction process easier. To classify CLCuD, the processed datasets are then fed into a Deep Learning (DL) model, specifically a CNN. In the last stage of the proposed methodology, the model is extended to categorize susceptibility scale levels into five groups: Fully Resistant (FR), Partially Resistant (PR), Healthy (H), Partially Susceptible (PS), and Fully Susceptible (FS). These categories are assigned using the weights computed for the model. The CNN model's performance is assessed on both training accuracy and testing accuracy. This methodology supports reliable detection and categorization of CLCuD susceptibility levels and may help improve disease control tactics in cotton farming. The proposed framework consists of the following steps: self-collected dataset description, downloaded dataset description, feature extraction and preprocessing, construction of the CNN model, and results.
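Two of the preprocessing steps named above, resizing and normalization, can be illustrated as follows. This is a minimal sketch, assuming nearest-neighbour resizing and scaling of 8-bit pixel values to the range 0 to 1; the paper does not state which resizing method or target size it uses, and the tiny grayscale image here is made up.

```python
def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of a 2-D grayscale image stored as nested lists."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

def normalize(image, max_value=255.0):
    """Scale pixel intensities from [0, max_value] to [0, 1]."""
    return [[px / max_value for px in row] for row in image]

# Hypothetical 4x4 grayscale image with 8-bit intensities.
image = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
    [16, 80, 144, 208],
    [8, 72, 136, 200],
]

small = normalize(resize_nearest(image, 2, 2))
for row in small:
    print([round(px, 3) for px in row])
```

In practice an image library would handle resizing with interpolation; the plain-Python version only shows what the two operations do to the pixel grid.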

Fig. 4 Fig. 5
Fig. 4 Sample images of self-collected dataset

Let $D$ represent the original dataset containing $N$ instances. Data augmentation introduces a set of $M$ augmentation operations, denoted $A = \{A_1, A_2, A_3, \ldots, A_M\}$, each representing a specific data transformation. This results in an augmented dataset $D'$ containing $N \times M$ instances, where $N \times M = W$:

$$D' = \{D_1, D_2, \ldots, D_W\} \tag{3}$$

where $D'$ is the augmented dataset, $D_i$ is the $i$-th instance after augmentation, and $M$ represents the number of augmentation operations. Each augmentation operation $A_j$ modifies an instance $D_i$ according to the operation's characteristics, producing a transformed instance $D_{i,j}$:

$$D_{i,j} = A_j(D_i) \tag{4}$$
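The augmentation scheme can be sketched in a few lines: every one of M operations is applied to every one of N instances, giving W = N × M augmented instances. The operations below (identity, horizontal flip, vertical flip) and the 2×2 toy images are hypothetical; the text does not list which transformations were used.

```python
def identity(img):
    """Return an unchanged copy of the image."""
    return [row[:] for row in img]

def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def augment(dataset, operations):
    """Apply every operation A_j to every instance D_i (N x M results)."""
    return [op(instance) for instance in dataset for op in operations]

# Hypothetical dataset D with N = 2 tiny grayscale images.
D = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
A = [identity, hflip, vflip]  # M = 3 operations

D_aug = augment(D, A)  # W = N * M = 6 instances
print(len(D_aug), "augmented instances")
```

Including the identity operation keeps the original images in the augmented set; whether the originals are retained is a design choice that the text leaves open.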

$$Y = i.\mathrm{rep}(s, -1) \tag{7}$$

The final layer, Dense_3, concludes the model's architecture and holds 645 parameters. It takes its input from Dense_2, an input array of dimension (156), and maps it onto 5 classes. This mapping predicts the susceptibility scale level of CLCuD. The Dense layer is expressed mathematically in Eq. (8).
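The flatten and dense steps can be illustrated with a small forward pass. This is a sketch under stated assumptions: the dense layer is taken to have the usual form Y = a(Wx + b), softmax is assumed as the output activation, and the input size, weights, and class order are hypothetical toy values rather than the paper's trained model.

```python
import math

def flatten(feature_maps):
    """Collapse nested lists (e.g. height x width x channels) into one flat vector."""
    flat = []
    for item in feature_maps:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def dense(x, weights, biases, activation):
    """Y = a(W x + b); weights holds one row per output unit."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return activation(z)

CLASSES = ["FR", "PR", "H", "PS", "FS"]  # class order is an assumption

features = [[[0.5, 0.1], [0.2, 0.9]]]  # hypothetical 1x2x2 feature map
x = flatten(features)                  # flat vector of length 4

# Hypothetical weights: 5 output units, each with len(x) inputs.
weights = [[0.1 * (i + 1)] * len(x) for i in range(len(CLASSES))]
biases = [0.0] * len(CLASSES)

probs = dense(x, weights, biases, softmax)
n_params = len(weights) * len(x) + len(biases)  # inputs * units + units
predicted = CLASSES[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs], n_params)
```

The parameter count of a dense layer is the number of inputs times the number of units, plus one bias per unit, which is how the toy layer here arrives at its count.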

Fig. 11
Fig. 11 Accuracy graph of both datasets

Table 1
Comparative results of models

Table 2
Detail of self-collected dataset

The self-collected dataset contains 5 classes. The FR class, whose symptom is enation of small leaves, contains 186 training images and 66 test images. The PR class, whose symptom is vein thickness in leaves, has 77 training images and 55 test images. Class H has 279 training images and 71 test images.

Table 3
Self-collected dataset class symptoms

Table 4
Detail of downloaded dataset

Table 5
Training and testing accuracy

Table 6
Training and testing loss

Fig. 12 Loss graph of both datasets