An integrated SDN framework for early detection of DDoS attacks in cloud computing

Cloud computing is a rapidly advancing technology with numerous benefits, such as increased availability, scalability, and flexibility. Relocating computing infrastructure to a network simplifies hardware and software resource monitoring in the cloud. Software-Defined Networking (SDN)-based cloud networking improves cloud infrastructure efficiency by dynamically allocating and utilizing network resources. While SDN cloud networks offer numerous advantages, they are vulnerable to Distributed Denial-of-Service (DDoS) attacks. DDoS attacks try to stop genuine users from using services and drain network resources to reduce performance or shut down services. Early-stage detection of DDoS attack patterns in cloud environments remains challenging. Current methods detect DDoS at the SDN controller level, which is often time-consuming. We recommend focusing on SDN switches for early detection. Because of the large volume of data from diverse sources, we recommend clustering the traffic and predicting DDoS-related traffic anomalies at each switch. Furthermore, to consolidate the data from multiple clusters, event correlation is performed to understand network behavior and detect coordinated attack activities. Many existing techniques fall short in early detection and in integrating multiple techniques to detect DDoS attack patterns. In this paper, we introduce an efficient, integrated SDN framework that addresses this gap in previous DDoS solutions. Our framework enables early and accurate detection of DDoS traffic patterns within SDN-based cloud environments. The framework combines Recursive Feature Elimination (RFE), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), time series techniques such as Auto Regressive Integrated Moving Average (ARIMA), the Lyapunov exponent, an exponential smoothing filter, a dynamic threshold, and, lastly, a rule-based classifier.
We evaluated the proposed RDAER model on the CICDDoS 2019 dataset, where it achieved an accuracy of 99.92% and a fast detection time of 20 s, outperforming existing methods.


Introduction
In the last decade, researchers and developers have made a great effort to develop new computing technologies, creating a very complex digital environment where users can perform a range of jobs quickly and at a low cost. These technologies give consumers on-demand access to various services and resources. Cloud computing provides a digital platform for cloud users to access resources on demand based on a pay-per-use model [1]. Even the government and IT industries have shifted their focus to the cloud because it reduces the cost of infrastructure development and management. Virtualization is a critical technology in cloud computing as it provides access over the internet to a set of dynamically usable resources, such as storage, software, and processing power [2]. Monitoring network traffic in a stable network structure presents a significant challenge for cloud providers. As a result, companies have turned to Software Defined Networks (SDN) as a preferred method for building networks over the past decade [3]. SDN simplifies the complexity of today's networks by converting physical network connections into logical network connections and providing centralized management of network services [4]. Cloud service providers can benefit from cost savings, intelligent global links, granular security, and reduced downtime with SDN [5]. SDN provides a software application plane for applications that offer practical solutions to essential network operations such as auto-scaling, intrusion detection, and network monitoring [6]. The development of SDN cloud networking, as depicted in Fig. 1, allows cloud service providers to host millions of virtual networks without relying on standard isolation methods such as VLAN. SDN represents a paradigm shift in network architecture. It decouples the control plane from the data plane, allowing network administrators to dynamically manage and control network traffic. In the context of cloud computing, SDN enables the dynamic allocation of network resources to match the requirements of cloud applications and services. This means that as workloads in the cloud increase or decrease, the network can adapt accordingly, ensuring optimal performance. However, although SDN separates the control and data planes, it does not prevent the network from being overloaded with traffic, which leaves it exposed to DDoS attacks. Additionally, hackers can compromise the network's security by attacking several SDN components, including the controller, southbound and northbound interfaces, and the switch [7]. SDN-based cloud users face a significant issue with service disruptions caused by DDoS attacks.
Songa and Karri, Journal of Cloud Computing (2024) 13:64
A Distributed Denial-of-Service (DDoS) attack is probably the most well-known and dangerous threat to cloud computing. It can hurt both cloud providers and their customers. A DDoS attack makes the service inaccessible to legitimate clients. The malicious user compromises multiple nodes and uses them to flood the target system with traffic [8]. A sample attack scenario is represented in Fig. 2. Recent estimates show DDoS attacks cause enormous financial losses for even the largest cloud providers, such as Amazon AWS EC2 and Rackspace [9]. DDoS attacks target the servers and the infrastructure of the cloud [10]. Cybercriminals conducted around 5.4 million DDoS attacks in the first half of 2021, an increase of up to 11% over the first half of 2020. An organization's ability to recognize and defend itself against DDoS is critical to its success.
Fig. 1 SDN cloud networking
Therefore, it is crucial to have a framework that can analyze network traffic and detect anomalies before any damage occurs [11]. An automated system that can classify network traffic and alert the controller is necessary [12,13]. Although several DDoS defense systems are available, attackers continuously develop new attack patterns, making it challenging to detect anomalies early. While some existing strategies provide early detection, they have high false-positive rates [14]. Other approaches achieve high accuracy but have long detection times, which can lead to resource outages and financial losses [15]. We introduce an innovative RDAER framework that incorporates highly effective techniques within each category, including feature selection, traffic clustering, attack prediction, and traffic classification for SDN-based cloud networks. This integration aims to enhance the precision and timeliness of DDoS attack detection. To achieve this, we have incorporated the following techniques into our proposed framework:
• Lastly, all the cluster scores are correlated using a rule-based event correlation classifier to determine whether traffic data is normal or a DDoS attack has occurred.
• We evaluate the effectiveness of the proposed RDAER framework by comparing it with the existing models in terms of accuracy and detection time. The results indicate that the RDAER framework outperforms existing methods in accuracy and detection speed.
The remainder of the paper is organized as follows: Sect. 2 discusses related work. Section 3 presents the proposed methodology. Section 4 reports the evaluation and experimental results, and Sect. 5 presents the conclusion and future work.

Related work
In the past few years, researchers have presented several strategies for intrusion detection systems but few procedures for anomaly detection. These strategies face the challenge of creating a varied, flexible, and straightforward approach for abnormal behavior detection, given the complexity and speed of today's malicious behavior and the size of today's networks. Anomalies in a network can be detected using different intrusion detection techniques, namely mining, statistical, machine learning, and knowledge-based methods. Since early 2010, these techniques have been implemented individually to detect attacks, which has resulted in high false-positive rates [16,17]. Since 2015, research has advanced to a next phase in which DDoS attacks are detected by combining two intrusion detection approaches [18]. These studies have improved detection accuracy but at the cost of greater computational complexity and resource usage.
Later, a study examined how to detect and choose suitable features that help decrease the detection time [19]. Another model [20] employed several time series techniques for predicting DDoS attacks by forecasting the behavior of the traffic features at the time of attack using anomaly scores. The model identified traffic as attack or normal based on the scores. Another paper [21] presents a method for detecting DDoS attacks using Lattice Structural access rates; it uses a technique named S2RF2S for feature filtering and a Soft-Max Behavioral Based Ideal Neural Network (SxB2IN2) for classification. The work achieved an accuracy of 90% in detecting DDoS attacks. Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Random Forest (RF), and Long-Short Term Memory (LSTM) algorithms were used in another study [22] to emphasize feature selection for accurate and efficient identification. Compared to previous research, the RF classifier could achieve 99% DDoS detection accuracy with 11 features. In [23], a Spark tool was used to build a model for detecting DDoS attacks in SDN. In comparison to other algorithms, the Decision Tree (DT) showed the best accuracy at 93.6%. Hence, DT was selected for real-time deployment. In another study [24], the widely used LSTM model was used to filter out suspect flows in distributed SDN-based edge computing. Extensive experiments on five different datasets with three common attack types demonstrated that the CoWatch framework achieved an accuracy of 93.30% in predicting and detecting DDoS attacks and their corresponding attack flows through a collaborative prediction algorithm.
Najafimehr et al. [25] used a hybrid model that combines both supervised and unsupervised learning methods. They utilized a clustering approach and many flow-based characteristics to distinguish attack traffic from regular data. The clusters are labeled using a classification method based on specific statistical measures. Phuc Trinch et al. [26] suggested an enhanced approach for detecting hacked SDN switches using a multivariate time series technique and Recurrent Neural Networks (RNNs) for classification. This work achieved an accuracy and detection rate of 96.99% and 98.51%, respectively. Peng et al. [27] proposed an anomaly flow detection method for SDN using the double P-value of the transductive confidence machines for the KNN algorithm. Using a sampling-based strategy, another study [28] proposed a scalable flow monitoring and classification solution for OpenFlow. The classification method combines deep packet inspection and machine learning techniques. Another paper [29] proposes the use of machine learning to mitigate cloud-based DDoS attacks. The work covers gathering cloud module input data, reducing dimensionality, noise filtration, feature extraction, and a ResNet-101-based Kernel Extreme Learning Machine (KELM) for classification.
In another study [30], the authors used agglomerative and K-means clustering with Principal Component Analysis (PCA) for feature extraction. A voting method classifies whether the data is normal or attacked. This method achieved a classification accuracy of 96.66%. Mosayeb et al. [31] proposed a 3-phase statistical model, RAD, to detect DDoS attacks by scoring users to classify them as attack or benign. Three parameters, drop, jitter, and delay, identify the potential attack behavior. RAD was tested using the CICDDoS2019 dataset and compared with four other detection algorithms, achieving a precision of 80% and a recall of 99%. Rajasree et al. [32] used a fuzzy bat clustering algorithm to group similar patterns and predicted anomalous behavior from a deviated anomaly score. The event correlation between the virtual machine instance supplied by the cloud service provider and the suspicious source list is established to identify the malicious source. This model produced fewer false alarms while accurately determining the anomalies. Girish et al. [33] constructed a neural network using stacked and bidirectional LSTM models. Information collected from OpenStack is used for testing the model. The information gathered includes ten characteristics and a classification label. Using the binary cross-entropy function as a loss function, the suggested model had a training set accuracy of 94.61% and a test set accuracy of 93.98%.
The analysis of existing studies highlights the absence of a comprehensive strategy that integrates clustering, time-series analysis, feature selection techniques, and event correlation for early DDoS attack detection in SDN cloud environments. Event correlation plays a pivotal role in identifying network patterns and anomalies across distributed networks. Additionally, there is a need to improve the DDoS detection accuracy. The novelty of our work lies in the combined application of clustering, time-series techniques, and event correlation, with a particular focus on the unique CICDDoS dataset. This dataset is essential for discovering new attack patterns that may not be present in older databases, emphasizing the importance of its utilization in uncovering unexplored threat scenarios. Table 1 compares existing works as well as their limitations.

Proposed system
The RDAER framework is designed for an SDN-based cloud environment. It comprises five modules: data preprocessing, feature selection, clustering, anomaly score prediction, and event correlation-based classification. The architecture of RDAER is presented in Fig. 3. The SDN controller employs this approach and monitors each switch individually for DDoS attack traffic irregularities. The SDN agents at the switches perform data preprocessing to convert the raw traffic from network flows into normalized data. Using Recursive Feature Elimination (RFE), relevant features (Source IP address, Destination IP address, and timestamp) are chosen and then formed into clusters based on timestamp using the DBSCAN approach. Then, using time series techniques, each cluster is analyzed for malicious traffic by producing anomaly scores. Finally, the event correlation module correlates the final anomaly scores to classify the traffic sample as normal or DDoS. When a sample is abnormal, the framework raises an alarm and activates the countermeasure section. Each module is briefly explained below.

Data preprocessing
In machine learning, data preprocessing is crucial in generating accurate and valuable results [34]. Data preprocessing improves data quality by handling missing or incomplete data, smoothing out noise, and addressing discrepancies. The following steps are involved in the preprocessing stage:
1. Correlated features are removed by keeping only one feature among those with a correlation above 80%. The Pearson correlation coefficient is employed, which gives a value between -1 and +1 and can determine whether two features have a linear relationship. The coefficient of two features (p, q) is calculated using Eq. (1), where cov(p, q) represents the covariance between the two features, and σ_p and σ_q represent the standard deviations of p and q, respectively.
$$\rho_{p,q} = \frac{\mathrm{cov}(p, q)}{\sigma_p \, \sigma_q} \tag{1}$$
2. Rows that have missing values are eliminated to remove incomplete data.
3. Infinite values are replaced with a maximum feasible value.
4. The data is normalized using the min-max scaling method specified in Eq. (2). Here, z represents the value of a feature fe, while z' denotes the corresponding normalized feature value. The minimum and maximum values of the feature are denoted as min_fe and max_fe, respectively.
$$z' = \frac{z - \min_{fe}}{\max_{fe} - \min_{fe}} \tag{2}$$
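The four preprocessing steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `preprocess`, the choice to keep the first feature of each correlated group, and the handling of negative infinity (replaced by the column's finite minimum) are all assumptions made for the example.

```python
import numpy as np

def preprocess(X, corr_threshold=0.8):
    """Apply the four preprocessing steps to a 2-D array (rows = flow records)."""
    X = np.asarray(X, dtype=float)

    # Step 2: drop rows that contain missing values (NaN).
    X = X[~np.isnan(X).any(axis=1)]

    # Step 3: replace +inf with the largest finite value of the column.
    # Assumption: -inf is replaced with the smallest finite value.
    for j in range(X.shape[1]):
        col = X[:, j]  # view into X, so assignments modify X in place
        finite = np.isfinite(col)
        if finite.any():
            col[np.isposinf(col)] = col[finite].max()
            col[np.isneginf(col)] = col[finite].min()

    # Step 1: Pearson filter. A feature is kept only if its |r| with every
    # already-kept feature does not exceed the threshold (assumed tie rule:
    # the first feature of a correlated group survives).
    corr = np.corrcoef(X, rowvar=False)
    keep = []
    for j in range(X.shape[1]):
        if not any(abs(corr[j, k]) > corr_threshold for k in keep):
            keep.append(j)
    X = X[:, keep]

    # Step 4: min-max scaling, z' = (z - min) / (max - min).
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (X - lo) / span, keep
```

The correlation filter runs after the missing and infinite values are handled because the Pearson coefficient is undefined on such entries; the order of the other steps follows the list above.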

Feature selection
Following the preprocessing phase, we perform feature selection on the normalized data using the hybrid RFE approach described in this paper. Because DDoS attack patterns occur in large, high-dimensional data, overfitting, poor model interpretability, and longer computation times are all possible consequences.
The effectiveness of DDoS detection algorithms can be enhanced by using RFE to pick a subset of the most useful characteristics, thereby lowering the dimensionality. RFE constructs a model and selects the optimal or worst features based on their ranks, using a basic DT method as an estimator [35]. This method employs information entropy as a crucial measure for feature selection. It computes the information gain for each sample to divide it layer by layer until at least one sample type is separated [36]. The RFE with DT as an estimator provides rankings and importance scores for all features, as shown in Fig. 8. Based on the threshold, the source IP address and destination IP address have the highest ranking of all features and are selected for our work on DDoS detection. During a DDoS attack, the number of flow entries with Unique Source IP Address (USIA) may grow due to fake and randomly produced IP addresses. In contrast, the number of Normalized Unique Destination IP Address (NUDIA) may not vary much compared to usual. Still, the normalized value of this statistic with respect to the total number of packets in the flow table decreases. These two features are, therefore, independently used as time series in the attack detection procedure to identify potential cases of a DDoS attack. In addition to these two features, timestamps and class labels are also considered in this work, as they are associated with IPs. The feature list considered for our work is tabulated in Table 2.
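As an illustration of RFE with a decision-tree estimator, the sketch below uses Scikit-learn's `RFE` and `DecisionTreeClassifier` with the entropy criterion. The helper name `rank_features`, the number of features kept, and the random seed are assumptions for the example; the paper does not list its exact estimator settings.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier

def rank_features(X, y, n_keep=2, seed=0):
    """Rank features with RFE, using an entropy-based decision tree as estimator.

    Returns a boolean mask of the selected features and the RFE ranking
    (rank 1 = selected, larger ranks were eliminated earlier).
    """
    estimator = DecisionTreeClassifier(criterion="entropy", random_state=seed)
    selector = RFE(estimator, n_features_to_select=n_keep, step=1)
    selector.fit(X, y)
    return selector.support_, selector.ranking_
```

With `step=1`, one feature is eliminated per iteration, so the ranking reflects the order in which the tree's importance scores dropped the remaining features.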

Clustering
After feature selection, we cluster the selected features using the DBSCAN algorithm [25]. DBSCAN is a robust technique that can effectively handle dynamic DDoS attack patterns, mitigate noise, and optimize parameter settings. Though it may lead to increased energy consumption, it is still a viable option. Its versatility and adaptability make it an invaluable tool for studying and comprehending energy consumption patterns in dynamic time-series data, which enables the discovery of important clusters. This is made possible as DBSCAN possesses both of these qualities. The flow chart of DBSCAN is depicted in Fig. 4. Before clustering the data with DBSCAN, we first calculate the optimal value for the 'eps' parameter using Eq. (3). The algorithm then retrieves all points that are density-reachable from point y, considering eps and minpts.
If point y is a core point and Neps > minpts, we form a cluster and join the cluster's core point. We identify all border points with Neps, minpts, and all core points as neighbors and mark the other points in D as noise points. We divide the features of USIA and NUDIA into different clusters based on timestamps. Then, each feature is processed separately using time series techniques in the next phase to calculate anomaly scores and detect DDoS attacks. The clusters are processed in parallel to accelerate the detection of DDoS attacks.
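A minimal sketch of timestamp-based clustering with Scikit-learn's `DBSCAN` follows. The eps and minpts defaults (0.08 and 7) are the values reported in the experiments; scaling timestamps to [0, 1] before clustering and the helper name `cluster_by_timestamp` are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_by_timestamp(timestamps, eps=0.08, min_pts=7):
    """Group flow records into clusters by timestamp; label -1 marks noise."""
    t = np.asarray(timestamps, dtype=float).reshape(-1, 1)
    span = t.max() - t.min()
    # Assumption: timestamps are min-max scaled so that eps is unit-free.
    t_scaled = (t - t.min()) / span if span > 0 else np.zeros_like(t)
    return DBSCAN(eps=eps, min_samples=min_pts).fit_predict(t_scaled)
```

Each resulting cluster label can then be used to slice the USIA and NUDIA series so that the clusters are scored independently, for example in separate worker processes.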

Anomaly score prediction module
In this phase, we separately analyze each cluster's USIA and NUDIA features to determine their anomaly scores (score1 and score2) at time t. The anomaly prediction module is illustrated in Fig. 5. We apply the USIA feature to both the ARIMA and chaos theory methods to obtain score1. The NUDIA feature is passed through exponential filters and a dynamic threshold to obtain score2. The notation used in the algorithms is tabulated in Table 3.

Processing of USIA feature
As stated earlier, USIA is passed as a time series to ARIMA. ARIMA(p, d, q) is a three-tuple time-series forecasting statistical model [37], where p is the lag order, d denotes the number of times the raw observations are differenced, and q is the order of the Moving Average (MA), that is, the number of lagged forecast errors, as seen in Eq. (4).
$$Z'(t) = \varphi_1 Z'(t-1) + \dots + \varphi_p Z'(t-p) + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t \tag{4}$$
In Eq. (4), Z'(t) is the time series, φ_1 and θ_1 are the first Auto Regression (AR) and MA terms, p and q are the orders of the AR and MA terms, respectively, and ε_t is the error. ARIMA captures the trends and seasonality of the network traffic and allows checking for any spikes and fluctuations in the traffic. Spikes relate to abnormal traffic. To detect irregular behavior in the prediction error, we use the Lyapunov exponent, as given in Eq. (5):
$$\lambda_t = \frac{1}{t} \ln\left(\left|\frac{p_t}{p_0}\right|\right) \tag{5}$$
where p_0, p_t, and λ_t represent the first prediction error, the t-th prediction error, and the Lyapunov exponent at the t-th instance, respectively. Positive exponents imply DDoS traffic, while negative exponents indicate regular traffic [38].
According to Algorithm 1, the ARIMA model estimates the attack trend of the sample set Z, where z_t is an exponential function of time t. To construct the ARIMA model, Model 1 is set to TRUE. For training the model, n samples of source IPs are stored in Z. If Z is non-stationary, differencing d > 1 is applied to achieve a stationary time series. Differencing the time series makes Z suitable for stationary time series analysis and modeling. The Box-Cox transformation stabilizes the variance of a time series variable y. It is a mathematical technique that adjusts the data distribution to make it more suitable for analysis and modeling. The model order can be selected using Akaike's information criterion, the corrected version of this criterion (AICc), or the Bayesian Information Criterion (BIC) [39]. The best model was selected by minimizing these criteria. Figure 6 is the differencing graph that shows at which order of differencing the data is stationary, and Fig. 7 shows whether there are any outliers for the order selected. The peaks in the density plot indicate the anomalies. The ARIMA model generates the standard feature pattern, so no attack instances should occur during model generation. After model generation, the Model 1 flag is set to FALSE, and the training phase is completed. In the testing phase, the normal model estimates the value ẑ_t for each subsequent incoming traffic sample z_t. If attack traffic arrives, the model exposes the abnormal behavior through spikes in the prediction error. The chaotic behavior of the prediction error is used to calculate an anomaly score [40]. The prediction error p_t is determined using Eq. (6):
$$p_t = z_t - \hat{z}_t \tag{6}$$
To assign an anomaly score to the different outcomes of the prediction error, the Lyapunov exponent (λ) is used. According to Eq. (5), a positive value of λ indicates attack traffic (score1 = 1), while a negative value suggests normal traffic (score1 = 0).
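To make the scoring step concrete, the sketch below computes the prediction error of Eq. (6) and maps the sign of the Lyapunov exponent of Eq. (5) to score 1. The forecasts are assumed to come from an ARIMA model fitted elsewhere on attack-free traffic; the function names and the rule of treating a zero error as normal (the logarithm is undefined there) are assumptions for the example.

```python
import math

def lyapunov_exponent(p0, pt, t):
    """Eq. (5): lambda_t = (1/t) * ln(|p_t / p_0|)."""
    return math.log(abs(pt / p0)) / t

def score_usia(observed, predicted):
    """Return score 1 for each sample after the first (1 = attack, 0 = normal).

    `observed` are the USIA values z_t and `predicted` the forecasts z_hat_t.
    """
    errors = [z - z_hat for z, z_hat in zip(observed, predicted)]  # Eq. (6)
    p0 = errors[0]
    scores = []
    for t, pt in enumerate(errors[1:], start=1):
        if p0 == 0 or pt == 0:
            scores.append(0)  # assumption: undefined exponent counts as normal
            continue
        scores.append(1 if lyapunov_exponent(p0, pt, t) > 0 else 0)
    return scores
```

For example, with errors 1, -0.5, 0.5, and 20 the first two exponents are negative (errors shrink relative to p_0) and the last is positive, so only the final sample is flagged.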

Processing of NUDIA feature
Here, an exponential smoothing forecasting model is used, which assigns weights to earlier and new observations for forecasting. More recent observations receive higher weights than earlier ones, with the decay governed by the smoothing constant α, which smooths the series. The value of α is between 0 and 1, as per Eq. (7), where E_i signifies the smoothed data and x_i denotes the original data.
$$E_i = \alpha x_i + (1 - \alpha) E_{i-1} \tag{7}$$
In Algorithm 2, the n samples of the NUDIA feature are stored in Y as a time series to generate the model. Y is estimated by two exponential filters, f1 and f2, with exponential constants α_1 = 0.1 and α_2 = 0.8, and their absolute difference is stored in Ad_f. The rolling median generates a median time series, M. The least distance between each case and the remaining samples is determined and stored in a set, ld. The mean µ_ld and standard deviation σ_ld are computed. Model 2 is set to FALSE once the above-stated values are determined. For each feature value y_t of upcoming traffic, the process mentioned above is repeated, and the least distance ld_t is calculated. If it is less than the threshold value η = µ_ld + q·σ_ld, the traffic instance is considered normal (score2 = 0); otherwise, it is abnormal (score2 = 1).
Fig. 7 Line plot and density plot of residuals
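The procedure for the NUDIA feature can be sketched with the standard library alone. The filter constants 0.1 and 0.8 and the threshold rule η = µ + q·σ come from the text; the window size of 5 is the value reported in the experiments. The default q = 3, the use of the population standard deviation, and the reading of "least distance" as the smallest absolute difference to the training medians are assumptions for the example.

```python
import statistics

def exp_filter(series, alpha):
    """Simple exponential smoothing: E_i = alpha * x_i + (1 - alpha) * E_{i-1}."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

def rolling_median(values, window=5):
    """Median over the trailing window (shorter windows at the start)."""
    return [statistics.median(values[max(0, i - window + 1): i + 1])
            for i in range(len(values))]

def median_feature(series, a1=0.1, a2=0.8, window=5):
    """Rolling median of the absolute difference between the two filters."""
    f1, f2 = exp_filter(series, a1), exp_filter(series, a2)
    return rolling_median([abs(a - b) for a, b in zip(f1, f2)], window)

def score_nudia(train, test, q=3.0):
    """Return score 2 for each test sample (0 = normal, 1 = abnormal)."""
    m = median_feature(train)
    # Least distance from each training median to the remaining ones.
    ld = [min(abs(m[i] - m[j]) for j in range(len(m)) if j != i)
          for i in range(len(m))]
    eta = statistics.mean(ld) + q * statistics.pstdev(ld)  # dynamic threshold
    # The filters are causal, so the test medians continue the training run.
    test_m = median_feature(list(train) + list(test))[len(train):]
    return [0 if min(abs(v - x) for x in m) < eta else 1 for v in test_m]
```

Because the slow filter (α = 0.1) lags behind the fast one (α = 0.8) when the series shifts abruptly, their difference grows during a sustained change, which is what pushes the median away from the training values.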

Algorithm 2 Prediction of anomaly score 2 using smoothing filters
The anomaly score prediction module collects score1 and score2 from the above methods and performs a logical AND operation to obtain the final anomaly score f. The final scores collected from each cluster are fed to the next module for DDoS detection.

Event correlation-based DDoS detection
A rule-based method for network event correlation is very important in identifying DDoS attacks in network settings. The methodology encompasses gathering data from multiple network nodes, preprocessing it to ensure uniformity, and then applying predetermined correlation rules specifically designed to detect patterns indicative of DDoS attacks. These rules look at the spatial, temporal, and rate-based aspects of network traffic, watching for sudden traffic spikes, strange protocols, or high resource usage. When a rule fires, it triggers an alert describing the nature and severity of the potential DDoS activity. This alert initiates further investigation to protect network resources. The rules are updated based on real-world incidents and emerging threats to provide proactive and adaptive DDoS detection and prevention. Owing to event correlation, it is possible to detect attack traffic early with reasonable accuracy, which may reduce economic loss and damage to resources in the cloud network.
According to Algorithm 3, the threshold η for classifying abnormal and normal traffic is calculated first. The rule-based classifier function calculates the sum of the final anomaly scores for all clusters and checks whether it exceeds the threshold to determine the traffic type. The main loop simulates the continuous processing of incoming traffic samples. Inside the loop, anomaly scores are calculated for each cluster and collected in the cluster scores list. If abnormal traffic is detected, an alarm is raised by the controller. The corresponding IP address and its switch are added to the discarded list. A defense mechanism can stop a DDoS attack but also stops any packets sent to a victim's IP address. As a result, immediately after the attack, the controller must change the activity of the flow entries.
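The rule described here can be sketched as follows. The AND combination of the two scores and the sum-versus-threshold test follow the text; the function names, the string labels, the way η is passed in, and the use of a set for the discarded list are assumptions for the example, since the body of the algorithm is not reproduced in this section.

```python
def final_score(score1, score2):
    """Final anomaly score f for one cluster: logical AND of the two scores."""
    return int(bool(score1) and bool(score2))

def classify_traffic(cluster_scores, eta):
    """Sum the final scores over all clusters and compare with threshold eta."""
    total = sum(final_score(s1, s2) for s1, s2 in cluster_scores)
    return "ddos" if total > eta else "normal"

def process_sample(cluster_scores, eta, source_ip, switch_id, discarded):
    """Classify one traffic sample; on an attack, record the source and switch."""
    label = classify_traffic(cluster_scores, eta)
    if label == "ddos":
        # Stand-in for raising the alarm at the controller.
        discarded.add((source_ip, switch_id))
    return label
```

A hypothetical call with three clusters and η = 1 would be `process_sample([(1, 1), (1, 1), (0, 1)], 1, "10.0.0.5", "s1", set())`, which returns `"ddos"` because two clusters agree on an anomaly.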

RDAER working model in cloud environment
The RDAER framework adapts well to the scalability challenges of expansive SDN-based cloud networks. Acknowledging the increasing presence of multiple controllers, switches, and routers as the network expands, the framework is tailored to accommodate the network's growth. Its design leverages the hierarchical structure of SDN-based cloud networks, enabling robust event correlation at different network levels. Event correlation ensures early and accurate detection of malicious traffic, effectively reducing false positives. By proactively addressing scalability concerns and adapting to complex, multi-tiered network infrastructures, the framework demonstrates its capacity to maintain efficiency and accuracy even in the face of substantial network expansion.
The RDAER framework suggests a decrease in computational resources even when the network size grows. This reduction is due to the use of only two features for analysis, in contrast to more complex models that may employ a higher number of features. As the framework is scalable, the computational load is reduced, leading to less resource utilization.
In one scenario, when a server is inundated with massive traffic, legitimate connection attempts are erroneously flagged as malicious, leading to false positives. Another situation involves a failure of proper filtering at the switch level, in which malicious traffic is incorrectly categorized as benign, resulting in false negatives. Our framework tackles these false alarms through a multi-layered strategy. By combining feature selection, traffic clustering, anomaly prediction, and event correlation, the framework enhances the precision of attack identification while reducing false alarms. This multifaceted approach, applied across various network levels, aims to diminish false negatives by capturing diverse patterns of DDoS attacks. Additionally, consolidating data through event correlation helps minimize false positives by establishing a contextual understanding of network behavior.

Experimental evaluation
The proposed RDAER system is evaluated using Python and the Scikit-learn library. To assess the proficiency and viability of our strategy in identifying malicious traffic, we executed the proposed approach on a best-in-class dataset.

Performance metrics
This paper addresses the issue of separating malicious traffic from legitimate traffic.When evaluating the performance of a detection model, one should consider taking several metrics into account as given below: where TP (True Positive) denotes the number of malicious samples the algorithm has found; TN (True The area under the ROC curve (AUC) is a measure of efficiency that considers all possible classification levels.DDoS attacks can be detected using a model with a high AUC value.
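For reference, the usual confusion-matrix definitions of the metrics reported later in the comparison (accuracy, precision, recall, and F1-score) can be written as a short helper. These are the standard textbook formulas, assumed here to match the ones used in the evaluation; the function name and the zero-division convention are choices made for the example.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For instance, with 90 true positives, 95 true negatives, 5 false positives, and 10 false negatives, the accuracy is 185/200 = 0.925 and the recall is 90/100 = 0.9.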

Dataset
The dataset we used to evaluate the proposed RDAER approach is CICDDoS2019, which was provided by the Canadian Institute for Cybersecurity [41]. This is a recent dataset with more modern attack methods. Reflection and exploitation attacks are the most common types of attacks in the dataset. These attacks mask the intruder's identity by sending packets to servers with the target's IP address as the source, causing the target victim's bandwidth to become overburdened with response packets. The dataset is composed of 88 features. It provides 12 types of DDoS attacks, namely NTP, DNS, LDAP, NetBIOS, UDP, UDP-Lag, SSDP, SYN, TFTP, SNMP, MSSQL, and Web DDoS [42]. Considering the experimental configuration's network infrastructure, the interval was t = 1 min. The following results are based solely on examining the CICDDoS 2019 dataset. Overall, 500 traffic samples were employed for our experiments. The first 200 samples train the network's normal behavior, while the remaining 300 test it.

Detection performance
During model training, 86 features are given to the RFE method, which ranks all the features. The feature importance graph is shown in Fig. 8. The graph shows that the source address and destination address receive the highest importance scores, above 0.08, among all features. The standout feature of RDAER is its focus on feature reduction. By utilizing only two key features for DDoS detection, it effectively minimizes resource utilization and detection time. Following the extraction of features, the number of clusters for the proposed RDAER was three, with the optimal values of eps and minpts being 0.08 and 7, respectively. We then evaluate the DBSCAN clustering in terms of homogeneity, completeness, the adjusted Rand index, mutual information, and the silhouette coefficient. Figure 9 shows the DBSCAN result.
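The clustering quality measures named above are all available in Scikit-learn's `metrics` module. The sketch below is one way to collect them; the helper name `clustering_report` and the choice of the adjusted variant of mutual information are assumptions, since the text does not say which variant was used.

```python
import numpy as np
from sklearn import metrics

def clustering_report(X, labels_true, labels_pred):
    """Collect external and internal quality scores for a clustering result."""
    report = {
        "homogeneity": metrics.homogeneity_score(labels_true, labels_pred),
        "completeness": metrics.completeness_score(labels_true, labels_pred),
        "adjusted_rand_index": metrics.adjusted_rand_score(labels_true, labels_pred),
        # Assumption: the adjusted form of mutual information is reported.
        "adjusted_mutual_info": metrics.adjusted_mutual_info_score(labels_true, labels_pred),
    }
    # The silhouette coefficient needs between 2 and n_samples - 1 labels.
    if 1 < len(set(labels_pred)) < len(labels_pred):
        report["silhouette"] = metrics.silhouette_score(X, labels_pred)
    return report
```

The first four scores compare the predicted clusters with ground-truth labels, while the silhouette coefficient uses only the data and the predicted labels.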
The USIA feature of a traffic sample is processed for score1 using ARIMA and chaos theory. The trained model is used to forecast the following value ẑ_t. Figure 10 depicts the original and predicted values for ẑ_t. The projected value differs from the actual value during the attack since the ARIMA model was built using regular traffic data. The error's chaotic behavior distinguishes attack samples from the usual traffic flow. Hence, the prediction error is assessed using the Lyapunov exponent. In Fig. 11, a negative Lyapunov exponent value indicates regular traffic, whereas a positive value indicates attack traffic. We map negative and positive Lyapunov values to scores 0 and 1, respectively.
Fig. 8 Feature importance graph
The exponential filter and threshold method uses the same traffic sample to calculate the anomaly score2 for the NUDIA feature. The two exponential filters with different α values are used to calculate score2. The output of the filters and their difference in estimating the new feature are presented in Figs. 12 and 13. When a rolling median is applied to the difference with a window size of w = 5, a median is produced, which is used as the new feature to differentiate between regular traffic and attack traffic. The mean and standard deviation of the data determine the threshold values. The median is larger during an attack than during normal traffic, as represented in Fig. 14. The median is high between 400 and 600 min, which indicates an attack is occurring throughout this period. To get the final anomaly score for one cluster, score1 and score2 are combined with a logical AND. The rule-based method correlates all the final anomaly scores obtained from each cluster and then determines whether or not the sample is abnormal. As the clusters are correlated, attack traffic is detected quickly. The result of the detection method is depicted in Fig. 15.

Comparison with state of the art detection methods
This section compares the proposed model with recent models across the different techniques used in the RDAER model. We evaluate various aspects of the models and provide insights into their performance and effectiveness.
Table 4 shows the comparison between DBSCAN and other clustering techniques. It indicates that DBSCAN analyzed the traffic data more efficiently than the alternative clustering techniques and with a very high degree of accuracy. The DBSCAN algorithm also removes noise during evaluation.
Table 5 shows the comparison of the proposed RDAER with other time series techniques. These techniques establish baseline patterns of normal network behavior. When incoming data deviates significantly from these established patterns, showing unusual spikes or irregularities, it may signal a potential DDoS attack. Applied to the traffic features (USIA and NUDIA), the time series techniques revealed unusual spikes in the 400 to 600 min range. This pattern reflects malicious behavior in the network traffic and could be a DDoS pattern. Utilizing two distinct techniques to analyze the two features at a specific timestamp accelerates detection.
The RDAER model is also compared with the latest DDoS detection techniques in terms of accuracy, precision, recall, and F1-score, as shown in Table 6. The study in [20] proposed a time series model that achieved an accuracy of 98.82% in detecting DDoS attacks. The authors of [49] proposed an extreme learning algorithm that detects DDoS attacks with an accuracy of 99.18% on the NSL-KDD dataset and 95.11% on the ISCX dataset. Another learning-based model, which combines K-means with an optimal fuzzy system [50], achieved an accuracy of 96.54% in detecting intrusions. The RNN-based model [51] achieved an accuracy of 94.12% and a precision of 98.18% in detecting the attack. The paper [52] presented a model based on cognitive mechanisms, termed artificial immune systems, for detecting DDoS attacks in the cloud environment, with accuracy and precision of 96.56% and 95%, respectively. In [53], a deep neural network with an auto-encoder approach is proposed for detecting DDoS attacks and achieved an accuracy of 98.43%. Table 6 shows that while some frameworks utilize a single method and others employ combined techniques, none have integrated an event correlation technique in a distributed network. A distinguishing feature of RDAER is its event correlation capability at the controller level. Using a rule-based classifier, the RDAER framework conducts event correlation across various network anomalies, resulting in high accuracy and reduced detection time when identifying attack traffic. Figure 16 compares RDAER with earlier models graphically.
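For reference, the four metrics used in this comparison can be computed from confusion-matrix counts as in the short sketch below. The accuracy and F1-score expressions follow the paper's definitions, and precision and recall use their standard forms. The counts are hypothetical and are not results from Table 6.

```python
def detection_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts for illustration only.
metrics = detection_metrics(tp=90, tn=95, fp=5, fn=10)
print(metrics)
```

A model can score well on accuracy while missing attacks if benign samples dominate the data, which is why precision, recall, and F1-score are reported alongside it.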
Table 7 compares the performance metrics of RDAER with those of recent models evaluated on the CICDDoS2019 dataset. The table lists, by year, the models implemented on this dataset. The proposed work outperformed existing models in terms of accuracy while using only two features, thereby reducing resource utilization and computational cost. This more efficient approach allowed the model to detect attacks early with high accuracy. The outcomes indicate that this defense technique is successful in preventing DDoS attacks. In the event of an attack, once a switch detects it, the controller promptly modifies the flow table entry rules, isolating harmful traffic and ensuring network security.
Table 8 and Fig. 17 present the comparison of detection time between RDAER and other techniques. The authors of [59] used the IFS method to detect DDoS attacks with a detection time of less than 300 s. The Logistic Regression technique [60] achieved a success rate of 99.8% and a detection time of 788 s. In [61], the attack was detected in 40.78 s using a hybrid machine-learning technique. The authors of [41] detected the DDoS attack within minutes. Other previously proposed models, [62] and [15], show detection times of 320 s and 24 s, respectively, for predicting DDoS attacks. Compared with earlier models, the proposed model showed promising results, with 99.92% accuracy and a shorter detection time of 20 s, using feature selection, traffic clustering, time series, and event correlation techniques. Several factors contribute to the reduced detection time. First, the use of only two specific features, their clustering, and parallel processing enable the initial prediction of anomalies at the switch level. Additionally, consolidating these findings through event correlation at the controller level significantly reduces the time required to identify DDoS patterns. Figure 18 shows the ROC curve; the AUC achieved is 1.0.

Case study
In a cloud-based organization, the network infrastructure is tiered to accommodate various user levels and offers Software as a Service (SaaS) to its clients. This setup includes multiple routers and switches managed centrally by an SDN controller. As data moves through these switches, the flow tables store traffic patterns, coordinated by the controller. At this point, the RDAER system extracts the raw data and preprocesses it, removing missing or null values. The preprocessed data then undergoes feature selection using the RFE algorithm, which retains the most pertinent attributes. The selected features are grouped by timestamp using the DBSCAN algorithm, and each cluster is examined with time-series techniques for irregularities or signs of a potential attack. The results of this analysis are relayed to the controller, which employs event correlation techniques to distinguish DDoS traffic from regular network activity.
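The final step of this case study, in which the controller correlates the per-cluster results, might look like the sketch below. The AND of score1 and score2 follows the paper. The correlation rule itself is not spelled out in this section, so the rule used here (classify as DDoS when at least `min_clusters` clusters are flagged) is a hypothetical stand-in, as are the function names and the cluster reports.

```python
def final_score(score1, score2):
    """Per-cluster decision: anomalous only if both detectors agree (logical AND)."""
    return score1 & score2

def correlate(cluster_scores, min_clusters=2):
    """Classify traffic from the (score1, score2) pair reported for each cluster."""
    flagged = sorted(name for name, (s1, s2) in cluster_scores.items()
                     if final_score(s1, s2) == 1)
    # Hypothetical rule: enough clusters must agree before traffic is called DDoS.
    label = "DDoS" if len(flagged) >= min_clusters else "normal"
    return label, flagged

# Hypothetical reports sent to the controller for three clusters.
reports = {"cluster_0": (1, 1), "cluster_1": (1, 0), "cluster_2": (1, 1)}
label, flagged = correlate(reports)
print(label, flagged)
```

Requiring agreement between both scores within a cluster, and then across clusters, trades some sensitivity for fewer false alarms. A stricter or looser `min_clusters` shifts that balance.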

Conclusion
This work introduces the RDAER model, which integrates multiple techniques within SDN to proactively identify and address DDoS attacks in different SDN-based cloud environments. The approach involves feature selection, clustering, time series analysis, and event correlation-based classification to enhance early detection of DDoS anomalies in network traffic. By examining each OpenFlow switch individually, RDAER's five-module structure enables effective data preprocessing and selects the key features USIA and NUDIA using RFE. These selected features are then grouped into clusters, considering their timestamps, to facilitate comprehensive traffic analysis. Within each cluster, a range of techniques, including ARIMA, the Lyapunov exponent, exponential smoothing, and dynamic threshold calculations, is applied to compute two scores: score1 and score2. At the controller level, the final scores are calculated and correlated using a rule-based classifier to classify traffic as either DDoS or normal. A DDoS attack warning is also issued whenever a switch detects an instance of a DDoS attack, and the countermeasure module alters the flow table to block the attack. The proposed RDAER model achieved a high accuracy of 99.92% and a fast detection time of 20 s in detecting DDoS attacks. In the future, the RDAER training and testing data can be further tuned to improve accuracy.

• Perform data preprocessing using multiple machine learning methods that convert raw traffic data into normalized data, improving prediction accuracy and resource utilization.
• For dynamic DDoS attack patterns, we use the Recursive Feature Elimination (RFE) method to select relevant features that accurately distinguish between benign and malicious data.
• To handle the huge volume of traffic data and dynamic DDoS patterns, we use Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to form clusters based on time, which helps early detection of DDoS attacks.
• Furthermore, each cluster is analyzed using the autoregressive integrated moving average (ARIMA), the Lyapunov exponent, exponential filters, and a dynamic threshold to predict chaotic behavior by calculating anomaly scores.

Fig. 4
Flow chart of DBSCAN clustering

Algorithm 1

Fig. 6
Differencing graphs for selecting the ARIMA model

$$
\begin{aligned}
\text{Accuracy (Acc)} &= \frac{TP + TN}{TP + TN + FP + FN}, \\
\text{F1-score} &= \frac{2\,TP}{2\,TP + FP + FN}
\end{aligned}
\tag{8}
$$

Here TP (True Positive) is the number of attack samples correctly classified as attacks; TN (True Negative) is the number of benign samples correctly classified as normal; FP (False Positive) is the number of normal samples mistaken for malicious ones; FN (False Negative) is the number of attack samples mistaken for normal ones.

Fig. 16
Comparison of RDAER with earlier models

Table 1
Comparison of related works

Table 2
Features selected

Table 3
Notation used in the work

Table 4
Comparative analysis of RDAER vs. other clustering techniques

Table 5
Comparative analysis of RDAER and other time series techniques

Table 6
Comparative analysis of RDAER vs. existing works

Table 7
Comparative results of RDAER vs latest schemes based on the CICDDoS2019 dataset

Table 8
Comparison of detection time between proposed and existing methods