Security strategy for autonomous vehicle cyber-physical systems using transfer learning

Cyber‑physical systems (CPSs) are emergent systems that enable effective real‑time communication and collabora‑ tion (C&C) of physical components such as control systems, sensors, actuators, and the surrounding environment through a cyber communication infrastructure. As such, autonomous vehicles (AVs) are one of the fields that have significantly adopted the CPS approach to improving people’s lives in smart cities by reducing energy consump‑ tion and air pollution. Therefore, autonomous vehicle‑cyber physical systems (AV‑CPSs) have attracted enormous investments from major corporations and are projected to be widely used. However, AV‑CPS is vulnerable to cyber and physical threat vectors due to the deep integration of information technology (IT), including cloud computing, with the communication process. Cloud computing is critical in providing the scalable infrastructure required for real‑ time data processing, storage, and analysis in AV‑CPS, allowing these systems to work seamlessly in smart cities. CPS components such as sensors and control systems through network infrastructure are particularly vulnerable to cyber‑ attacks targeted by attackers using the communication system. This paper proposes an intelligent intrusion detection system (IIDS) for AV‑CPS using transfer learning to identify cyberattacks launched against connected physical com‑ ponents of AVs through a network infrastructure. First, AV‑CPS was developed by implementing the controller area network (CAN) and integrating it into the AV simulation model. Second, the dataset was generated from the AV‑CPS. The collected dataset was then preprocessed to be trained and tested via pre‑trained CNNs. Third, eight pre‑trained networks were implemented, namely, InceptionV3, ResNet‑50, ShuffleNet, MobileNetV2, GoogLeNet, ResNet‑18, SqueezeNet, and AlexNet. The performance of the implemented models was evaluated. According to the experimen‑ tal evaluation results, GoogLeNet outperformed all other pre‑rained networks, scoring an F1‑ score of 99.47%.


Introduction
Autonomous Vehicles (AVs) have recently been developing rapidly, and some smart or self-driving cars can be found on public roads [1].Furthermore, AVs have been a trending topic in academia and business as many people have started understanding the unlimited benefits of this technology [2].AVs can execute sophisticated tasks like lane departure warnings, traffic sign identification, and avoiding collisions, and they can also reduce the workload of human drivers [3].Furthermore, the operation of AVs positively impacts the environment by reducing energy consumption and air pollution [4].AVs typically have sophisticated computing, sensing, and actuating systems based on heterogeneous architectural designs.However, numerous inherent difficulties exist with networking and communication technologies, including security, privacy, data transmission, real-time data analytics, and bandwidth restrictions [5].In response to technological advancements in smart transportation systems, AVs may transmit data via various communication protocols.A cyber-physical system (CPS) is a system that uses contemporary sensor, computation, and network technology to merge cyber and physical components seamlessly [6].AV provides the perfect fit model of a CPS when integrated with electronic and physical equipment [7].Several vehicle network systems, like sensors, actuators, and electronic control units (ECUs), can connect AV components.The most popular network systems are controller area networks (CAN), Time-Triggered CAN (TTCAN), Local Interconnect Networks (LIN), and FlexRay [8].CAN protocol was invented by Bosch researchers in 1985 to replace the wires in the automotive system because the number of wires started to increase, resulting in a degradation in reliability [9].CAN is a standard communications protocol that can be used for vehicle control sensor data [10].However, the CAN can process a small number of real-time sensor data [10].
Additionally, the data flow from numerous automotive core control systems, including the engine, transmission system, body system, and other electrical equipment, is collected via the CAN bus, and every bit of information is broadcast to the CAN bus.Every node has perpetual access to the network, which states that harmful internal or external information might attack any CAN network node in a vehicle [11].Understandably, the AVs must be equipped with more sensing and communication equipment to function independently. Figure 1 shows essential AV components.However, when the degree of autonomy increases, so do the security dangers [12].Attack sources are often negative internal components or external events intended to undermine the AV's expected autonomy [11].The attack surfaces of AVs can be Airbag ECU, USB, Bluetooth, Vehicle access system ECU, etc., as identified in [13].
In the context of smart cities, AVs are a key application of CPS principles, helping to improve urban living by decreasing energy consumption and minimizing air pollution.Integrating information technology (IT) and communication processes in AV-CPS, including cloud computing, is crucial in providing the scalable infrastructure required for real-time data processing, storage, and analysis in AV-CPS, allowing these systems to operate seamlessly in smart cities.Furthermore, AVs are susceptible to cyberattacks such as key fob cloning attacks, radar attacks, telematics service attacks, sensors spoofing attacks, ultrasonic sensor attacks, lidar sensor attacks, camera sensor attacks, and others, also new risks like ransomware and vehicle theft [14].Therefore, this paper proposes a method based on a pre-trained convolutional neural network (CNN), which helps to detect cyberattacks conducted through the CAN communication protocol to target the connected physical components of AVs.With the power of transfer learning, it is possible to use pre-trained models with different types of systems.Therefore, transfer is an important concept in deep learning, and it arises from insufficient data and starting training from scratch.Transfer learning transforms learning from a pre-trained model to a new related model.So, it is a learner trained on data from different domains because sometimes it is difficult and expensive to train data with the traditional machine learning technique, which assumes training data in the same domain [15].

Our contribution
The following is a summary of this paper's contributions: • We implemented the CAN communication protocol using the CAN communication toolbox in Simulink [16] and integrated it into an AV simulation model developed by MathWorks, Inc. [17].This allows connected physical devices such as sensors, the adaptive cruise control (ACC) system, and actuators to interact and collaborate, called autonomous vehicle cyber-physical systems (AV-CPS).• We generate the dataset from the AV-CPS and preprocess it by converting signals into images to be fed to the pre-trained CNNs.• We implemented an intelligent intrusion detection system (IIDS) using eight pre-trained networks and performed each network's performance analysis.Our experiment found that GoogLeNet performed best because it recorded 99.47% based on the F1-score parameter.

Paper structure
The rest of this paper is structured as follows: Sect. 2 contains the literature review, which discusses the most recent works and the research gaps in the security of autonomous vehicles.Then, Sect. 3 presents the research methodology, including the implementation of the autonomous vehicle cyber-physical system (AV-CPS), and discusses the process of collecting preprocess of the dataset, followed by the findings and discussion in Sect. 4. Finally, the conclusions and remarks are drawn in Sect. 5.

Related research
The continued advancement in artificial intelligence, communication, and remote sensing have significantly improved the development cycle of smart cities and their applications and services.Several smart services have surfaced and industrialized recently to enhance the standard of living in smart cities, covering diverse sectors such as communication, cybersecurity, smart grids, healthcare, and transportation systems [18][19][20].Much of the efforts were directed toward developing smart mobility and intelligent transportation systems such as autonomous vehicles [21].Over 250 million automobiles will be connected to roadside units in a few years [22].Despite this enormous growth of the autonomous vehicles industry, they are susceptible to a broad scale of cyberattacks with impacts ranging from minor control commands to strict control that threaten individuals' lives and wellbeing.Hence, various studies and systems have been suggested to examine, recognize, and alleviate the cyberattacks and threats against autonomous vehicle systems.The majority of the conducted studies were developed by coupling the different machine learning (ML) methods with cybersecurity practices to build security and defense systems.
In [23], an intelligent intrusion detection mechanism to secure external communications for autonomous vehicles was proposed and developed.The detection mechanism is based on a hybrid intelligent intrusion system that combines multi-layer perceptron (MLPs) with the overlapping proportional scores (POS) technique [24] and fuzzy sets to recognize the actions of connected, communicating autonomous vehicles.Specifically, their hybrid IDS utilizes the backpropagation neural networks to identify Denial of Service (DoS) attacks.Their experimental evaluation showed that their proposed detection showed high detection rates for DoS attacks in autonomous vehicles.However, their model has high inferencing overhead due to the computational processing through diverse subsystems such as the preprocessing subsystem, feature extortion subsystem with POS module, fuzzification subsystem [25] to reduce the features of data, MLP training subsystem, and finally, the detection subsystem (identify the traffic to either normal or anomaly).
In [26], the authors presented and implemented a detection method for false data injection in autonomous vehicles.Their system is composed of three subsystems.First, the false data injection (FDI) [27] subsystem, in which they inject simulated attacks into an autonomous vehicle.Second, the cyberattack dataset collection (CDC) subsystem generates and collects the dataset from a simulation model under two modes of operation (normal mode and attack mode).Third, the intrusion detection mechanism (IDM) subsystem employ the long short-term memory (LSTM) deep networks [28] to detect cyberattack, which is FDI attacks targeting the control system of autonomous vehicles.Their IDS system categorizes the data sample as normal and abnormal.Their investigational evaluation proved the superiority of their model, scoring high detection rates outperforming other state-of-the-art models.However, their proposed system used only a simulated dataset without implementing the control communication system over autonomous vehicles.
In [29], the authors described a machine learningbased intrusion detection system (ML-IDS) developed mainly to minimize the risk of traffic disturbance and collisions triggered by the presence of cyber-attacks.Their proposed IDS can identify two types of cyberattacks commonly deliberated against connected autonomous vehicles: spoofing attacks and jamming attacks.The IDS mechanism was developed using four different learning techniques, including Random Forest (RF), k-nearest Neighbors (k-NN), One-Class Support Vector Machine (OCSVM), and Data Fusion (DF) in a crosslayer approach.Their experimental assessment showed that the based DF technique obtained the best performance factors, scoring more than 90% accuracy using training datasets with known and unknown attacks.However, the detection accuracy obtained by the author needs to be more competent, as it has been overcome by several other intelligent models, such as [30].In the same context, the authors of [31,32] proposed another ML-IDS to detect malicious traffic in the CAN bus network.Their proposed ML-IDS system employs a naïve Bayes algorithm and random decision trees to detect three types of cyber-attacks: impersonation attacks, DoS attacks, and fuzzy attacks.According to their experimental assessment, the authors conclude that their proposed ML-IDS is scalable and adaptable to various new attacks on autonomous vehicles.However, their model utilized imbalanced classes of the used dataset, which may affect the results attained for the performance evaluation indicators.
In [33], another machine learning-based intrusion detection system (ML-IDS) for autonomous vehicle communication is established on the realistic network dataset called the ToN-IoT dataset [34].To reduce the number of features, the Chi-square (Chi2) method is used for feature selection [35].Also, to avoid outweighing features with higher values over features with lower values, data normalization normalizes all numerical data records in the dataset.Moreover, since the ToN-IoT dataset is imbalanced, the Synthetic minority oversampling technique (SMOTE) [36] was used for class balancing.Besides, the model has been configured to provide binary classification (normal vs. anomaly) and multiclassification (normal, password attack, scanning, distributed denial of service (DDoS), data injection, backdoor, Cross-site Scripting (XSS), denial of service (DoS), and man-in-the-middle (MITM), and ransomware).To this end, the contributors have characterized the performance of eight supervised machine learning techniques: naive Bayes (NB), decision tree (DT), logistic regression (LR), support vector machine (SVM), k-nearest neighbor (kNN), random forest (RF), Ada-Boost, and XGBoost techniques.Based on the simulation results, the XGBoost method outperformed other ML methods.However, their best model could perform better on sparse and unstructured data (such as outliers).
In [37], an automated cloud-based intrusion detection framework with continuous service availability for autonomous vehicles is developed and suggested to provide defense services at users' quality of service (QoS) and quality of experience (QoE) requirements.The proposed technique is developed by clustering autonomous vehicles into service-specific clusters, each with a cluster head to communicate with a trusted third party (TTP).This mediates the service between requesters and providers.Also, the proposed IDS comprises three modules, including traffic analysis, reduction, and classification (identifying good and bad requests).Specifically, their detection model employed the deep belief networks (DPN) for traffic reduction and decision tree techniques for classification purposes.Their validation process demonstrated the usefulness of their IDS, scoring high positive (around 99%) and low false detection rates (less than 1.6%).
In [38], a cost-effective, lightweight intrusion detection scheme was built to identify abnormalities in the autonomous vehicular system.The model design aimed to mitigate the risks of cyberattacks against the Internet of Vehicles (IoV) [39], which might cause fatal errors in vehicle control systems such as braking control, steering control, air conditioning control, and locking control elements.To do so, they suggested a model based on a three-layer neural network with a Softmax classifier to provide a binary detection of the input data records into either normal or abnormal or multi-classification into normal, reconnaissance attack, denial of service (DoS) attack, and fuzzing attack.They evaluate their IDS model on a simulated In-Vehicle Network (IVN) dataset originally generated from the control area network (CAN).They reported several evaluation outcomes, including detection accuracy, recall, precision, and F-1 score.Based on their comparative analysis, they concluded that their proposed scheme surpasses other classification schemes.However, their shallow model must provide accurate detection and classification for large-scale attack vectors developed with minor mutation of previously developed attacks [40].
In [41], the authors focused on identifying malicious cyber-attacks targeting the electronic control units (ECUs) composing the autonomous vehicular system and connected via in-vehicle networks (controller area network-CAN).To that end, they proposed a hybrid anomaly-based intrusion detection system (HAIDS) using rule-based and supervised learning-based methods.Their model aims to attain a high identification ratio at minimal computational complexity.Their experimental setup comprised the collected CAN traffic from four different simulated vehicles.The rule-based part of the model was developed using three main rules, including Valid vehicle ID, Time Interval for message communication in CAN, and the valid data length code, which specifies the length of the messages in bytes.The machine learning part of the model was implemented as a three-layer neural network.As a result, they concluded that their hybrid IDS is effective and efficient in detecting anomalous over CAN communications.
Similarly, another hybrid anomaly-based intrusion detection system (HAIDS) employing rule-based IDS and machine learning IDS is suggested in [42].In addition to the rule-based model, this system has characterized the performance of the supervised learning methods, viz., decision tree, random forest, and XGboost.As a result, they reported that their system could detect malicious CAN traffic with an accuracy of more than 90%.However, these proposed systems can not detect attacks that drop aperiodic messages that leverage the periodicity of CAN messages, which is still an open research problem.
Another noticeable research direction that has recently emerged is the use of blockchain technology to provide detection for several cyber-and physical attacks on the Internet of Things (IoT) in general [43] and in autonomous vehicles in specific [44,45].For instance, researchers in [44] focused on analyzing privacy and trust issues in autonomous transportation systems (ATS).They mainly investigated the cyberattacks on several communication techniques employed in autonomous vehicles, such as WLANs (Wireless Local Area Networks), WPANs (Wireless Personal Area Networks), WSNs (Wireless Sensor Networks), and RFID (Radio Frequency Identification).Therefore, they proposed a new IDS system (PChain) based on Blockchain Technology.PChain IDS protects IoT and ATS/ Vehicular Edge Computing (mainly autonomous vehicles).As a result, the model evaluation revealed that the detection accuracy rate changes with data size, which peaked at 95.5% for 10,000 data sizes and 40 training epochs.A similar tendency for precision and recall factors, which scored 94% and 95% for precision and recall, peaked at 10,000 data sizes and 40 training epochs.However, the researchers in this research have evaluated their model's performance on the KDDCup99 dataset [46], which comprises old common cyber-attacks that might be outdated in the current era.In the same context, the authors in [45] proposed a cooperative IDS system for autonomous vehicular networks tailored to the increased attack surface from common cyber-attacks.The proposed IDS system uses a decentralized federated-based methodology to decrease the resource utilization overhead of the main server by distributing the training load into distributed edge devices.To guarantee the security of the accumulation mechanism, blockchain is employed for storing and sharing the training loads.However, blockchain technology is vulnerable to cyber-attacks, such as ransomware attacks [47].
Furthermore, authors in [61] investigated anticipating and adjusting to users' interests and preferences in location-based social networks using copious amounts of user check-in data from Internet of Things (IoT) devices like cell phones.One important component of the Augmented Intelligence of Things (IoT) is the recommendation of successive points of interest (POIs), which is examined.Limitations of existing methodologies are emphasized and examined, including recurrent neural network-based approaches and graph neural networkbased methods.The abstract suggests an Interactionenhanced and Time-aware Graph Convolution Network (ITGCN) for successive POI suggestions to overcome these drawbacks.ITGCN integrates a self-attention aggregator to embed high-order connectivity into node representation and an enhanced graph convolution network for dynamic representation learning.The system aims to help corporate management anticipate user preferences so that better development and planning may occur.According to their experimental results, ITGCN performs better regarding suggestion accuracy than current approaches.To sum up, the following table, Table 1, provides a very short summary of the key points in the related works.
In [62], the authors addressed how the growing number of Connected and Autonomous Vehicles (CAVs) exposes Internet of Vehicles (IoV) environments to cyberattacks.The study suggests an Intelligent Intrusion Detection System (IIDS) that uses a modified Convolutional Neural Network (CNN) with hyperparameter tuning to address this issue.In a 5G Vehicle-to-Everything (V2X) setting, the IIDS efficiently identifies and classifies malevolent Autonomous Vehicles (AVs), assisting in traffic safety monitoring and collision avoidance.According to experimental results, attacks can be detected with 98% accuracy.In the same context, the authors of [63] investigated the increasing risk of cyberattacks on in-car In another noticeable research, in [64], the researcher emphasized the necessity of an efficient Intrusion Detection System (IDS) by highlighting the vulnerability of the Controller Area Network (CAN) bus in automobiles due to its lack of security measures.Vehicle networks with limited known attack signatures might not fit traditional internet-based intrusion detection systems well.The suggested remedy is a new intrusion detection system (IDS) model dubbed GIDS (GAN-based Intrusion Detection System) that uses Generative Adversarial Nets, a deep learning model.With just regular data, GIDS can learn and identify unknown assaults.Experimental findings address the problem of guaranteeing safety in in-vehicle networks by demonstrating GIDS's excellent detection accuracy for four unknown assaults.In [65], the authors have discussed the limitations of recurrent neural networks (RNN) in developing efficient intrusion detection systems for in-vehicle networks are discussed in this abstract.These limitations originate from RNNs' intricate structure and high processing costs.Additionally, it mentions that temporal linkages are difficult for convolutional neural networks (CNN) to capture and that important regions have insufficient feature representation.To solve these problems, the research suggests a novel model named TCAN-IDS (Temporal Convolutional Network with Global Attention).Real-time monitoring is made possible by TCAN-IDS, which encodes 19-bit characteristics of data fields and arbitration bits into a message matrix.By concentrating on important areas, the global attention mechanism enhances feature extraction.TCAN-IDS has demonstrated strong detection performance on attack datasets that are known to exist, according to experimental results.This means that it can be used for real-time monitoring and to maintain the delicate balance between information security and preventing unauthorized intrusions.Furthermore, in [66], the authors highlighted the difficulties caused by various attack techniques while underlining the significance of intrusion detection for vehicle communications security.The limits of previous approaches, which consider local characteristics or map multi-features poorly, are critiqued.Spatial and temporal interactions are simultaneously encoded by the encoding-detection architecture of the proposed model, STC-IDS (Spatial-Temporal Correlation Intrusion Detection System).Attention-based convolutional networks and attention-long short-term memory produce strong spatial-temporal attention features for anomaly classification.Both single-and multi-frame architectures have specific benefits.Empirical investigations on real-world vehicle assault datasets show that STC-IDS outperforms baseline approaches with automatic hyperparameter selection based on Bayesian optimization, delivering lower false-alarm rates while maintaining efficiency.While the above-stated systems are established and focused on developing Attack-Aware systems [48] utilizing different computational intelligence schemes such as fuzzy inference systems (FISs), neural network systems (NNS), and machine learning systems (MLS) over predefined accumulated datasets that are composed using the most conventional attributes for the common cyberattacks of communication networks on the IoT, CPS, and others, nevertheless, these systems do not support the identification of new real-time cyberattacks targeting the autonomous vehicle-cyber physical system's controller area network (CAN) communication system.In this research, we improve the security of autonomous vehicle control by developing a high-performance intelligent intrusion detection system (IIDS) using non-traditional machine learning techniques.
Particularly, we first develop our autonomous vehicle cyber-physical systems (AV-CPS) by implementing the controller area network (CAN) communication protocol and then integrating it into an autonomous vehicle simulation model.This, in turn, enables the other connected physical devices (e.g., sensors and actuators) to communicate and collaborate (C&C) successfully.Second, we produce and collect our novel dataset from launching AV-CPS in normal and attack modes.The generated dataset comprising of normal data samples accumulated from the normal operational mode of AV-CPS and false data samples injected by the attacker then undergoes consecutive preprocessing operations to end up with all samples transformed into unitized images that the deep neural networks can process.Finally, we implement an intelligent intrusion detection system (IIDS) using the transfer learning process from pre-trained deep convolutional networks (DCNNs).Specifically, we characterized the performance of eight DCNNs, including InceptionV3 CNN, ResNet-50 CNN, ShuffleNet CNN, MobileNetV2 CNN, GoogLeNet CNN, ResNet-18 CNN, SqueezeNet CNN, and AlexNet CNN.Our simulation results demonstrated high detection performance for our IIDS-based GoogLeNet model after we validated it with other implemented DCNNs and existing models in the same study area.

Methodology
This section explains the AV simulation scheme used in this research.Also, it represents the process of using the CPS concept with the selected AV simulation.This is accomplished by implementing and integrating communication network nodes into the AV simulation, and we named the new model the autonomous vehicle cyberphysical system (AV-CPS).Additionally, this section explains the procedure for generating the dataset from the AV-CPS simulation model.Finally, it shows the process of using transfer learning.

AV Simulation scheme
A simulation is a software-based model that can be used to study and evaluate a model's performance before it reaches the production stage [49].There are several benefits of using a simulation model compared with an actual model, such as less cost-effectiveness and simplicity in implementation, testing, and maintenance [50].The simulation model used in this research is a self-driving car system consisting of a lead vehicle and an ego vehicle (self-driving car).In ideal conditions, the ego vehicle should maintain its distance from the lead vehicle using the ACC.Therefore, the ego vehicle should keep track of the position of the lead vehicle.This study focuses on the ego vehicle, which consists of three essential components: ACC, position sensor, and velocity sensor, as shown in Fig. 2. The position sensor senses the position of the lead vehicle and ego vehicles' positions.Furthermore, the velocity sensor records the velocity of the lead and ego vehicles.The sensor measurements are sent to the ACC to adjust the speed of the ego vehicle when following the lead vehicle.

CAN communication network
MathWorks AV simulation does not include a communication system component; therefore, to apply the CPS concept, we implemented and integrated a communication system based on CAN protocol using the vehicle network toolbox built on Simulink [16].The vehicle network toolbox contains techniques and tools to design, analyze, architect, simulate, and validate communications systems.The toolbox has several components: CAN communication, CAN Flexible Data (FD) communication, J1939 communication, and Universal Measurement and Calibration Protocol (XCP) communication.This research concentrates on using the CAN communication component to build the communication system of the AV-CPS.Table 2 lists the CAN block used, and Fig. 3 shows the sending and receiving messages of the CAN communication subsystem.Initially, the signal is packed to the CAN message stander using the CAN pack; then, the CAN message is transmitted to the assigned CAN device using the CAN transmit component.The CAN configuration configures particular CAN devices' parameters for sending and receiving CAN messages.Finally, the CAN message is received from the particular CAN device, and the CAN unpack is used to unpack the CAN message to signals.

Autonomous vehicle cyber-physical system
The AV-CPS architecture implemented in this research comprises the following subsystems: sensors, two CAN communication nodes, a controller, and actuators.Nodes A and B receive and send signals between the AV subsystem, as shown in Fig. 4. Nodes A and B are integrated to simulate the activity of the CAN communication within the AV-CPS.Therefore, node A's functionality is to receive and send the following signals: • The actual location of the ego vehicle (from the sensor).
• The actual position of the lead vehicle (from the sensor).
• The actual speed of the ego vehicle (from the sensor).
• The actual speed of the lead vehicle (from the sensor).
• The time gap (constant value).
• The desired speed (constant value).
The actual position of the ego vehicle is measured in meters, and the velocity is measured in m/s.The time gap between the lead vehicle and the ego vehicle is equal to 1.4 s.The desired speed is equal to 30 m/s.After the signals are transmitted to the ACC, it is responsible for producing a control signal to adjust the ego vehicle's speed concerning the desired speed and the position of the lead vehicle.Finally, the ACC output is received and sent to the actuators by node B. The actuators convert the signal to mechanical movement to accelerate or reduce the speed.We showed a single closed-loop operation of the simulation, and those steps are repeated until the end, which runs for 81 s.

Generating dataset
As shown in Fig. 5, it is assumed that a threat actor compromised node A. Thus, the threat actor inserted inaccurate data into the position sensor of the ego vehicle through node A. Therefore, the ACC received fault data about the position of the ego vehicle, and as a result, the ACC produced imperfect control signals.Figure 4 and Fig. 5 represent two different scenarios.Figure 4 shows the normal status when there is no cyberattack, whereas Fig. 5 illustrates the anomaly status when there is a cyberattack.More information about the implementation of the cyberattack was presented in our previous research [26].The raw dataset is 1-dimensional, and its size is 80,000.The dataset is balanced; therefore, 40,000 is normal, and the rest is attack data.The dataset consists of four features: (1) the actual position, (2) the actual velocity of the ego vehicle, (3) the actual position, and (4) the actual velocity of the lead vehicle.Table 3 lists the raw dataset along with its unit.
To prepare the pre-trained neural network input, we must transform the numerical 1-dimensional data into 2-dimensional data (images).Figure 6 illustrates the steps undertaken to convert signals into images.In step 1, the AV-CPS simulation runs.Step 2 records the response of the features as mentioned earlier as numerical 1-dimensional data in a matrix.Step 3 then reshapes this stored data from a 1D matrix to a 2D matrix.Step 4 involves saving the 2D matrix as an image; each image has a size of 4 × 81.We have the size of 4 × 81 because we have four features, and the simulation runs for 81 s each time.Finally, step 5 stores the resulting normal and anomaly images in separate folders.
Algorithm 1 illustrates how the 1-dimensional data was converted to 2-dimensional data (image).Initially, 'im_counter' is set to 0 to name the image.The 'start_row' and 'last_row' variables are used as counters because we aim to combine all four dataset rows.Subsequently, the normal and attack data are loaded for use.The 'normal' directory contains images with normal data, and the 'attack' directory is created to hold attack images.The while loop starts with one and continues up to the length of the data source, either normal or attack.Consequently, 20 thousand images will be created: 10 thousand labeled as normal and 10 thousand labeled as attack, and each image will be created in jpg format.Figure 8 illustrates the process of the proposed IIDS.After the data is received from the CAN protocol, it will be sent to the IIDS for scanning the traffic.The 1-dimensional data is converted to an image and transmitted to the pre-trained CNN model.The CNN is responsible for detecting whether there is an attack or not.

Transfer learning
Transfer learning is an important concept in deep learning, and it arises from insufficient data and starting training from scratch [15].The proposed work utilizes transfer learning by leveraging pre-trained models to enhance the IIDS performance for AV-CPSs.Transfer learning involves using knowledge gained from solving one problem (in this case, pre-trained models trained on large-scale datasets) to improve the performance on a different but related problem, detecting cyberattacks on AV-CPSs as depicted in Figs. 9 and 10 which illustrate the architecture of the pre-trained models used in this research named inceptionV3, resNet-50, shuffleNet, mobileNetV2, GoogLeNet, ResNet-18, squeezeNet, and alexNet.The Relu layer activates to omit any negative values from the images and replace them with zeros.A Pooling layer serves as a filter to reduce the size of the input image.Therefore, Relu and Pooling layers reduce  the computation costs.The Fully connected layer is used by all pre-trained models except squeezeNet, which uses a Convolutional 2D layer, and they help to classify images correctly [51].Table 4 summarizes each model ordered by the number of layers [52].The final layer of each model was altered to serve our purpose.For example, in inceptionV3, we replaced the fully connected layer and the classification layer to produce only two outputs (normal, anomaly).
This research used eight pre-trained CNNs; therefore, the generated dataset was trained, tested, and validated with each network.

Results and discussion
This section discusses the experimental setup, such as the software and hardware used to perform this research's experiments.Also, the results and discussion of the research findings were investigated.

Experiment
The experiments were performed using MATLAB and Simulink.MATLAB is a high-level programming language platform, and Simulink is a MATLAB modelbased design platform [53].The AV simulation model was developed by MathWorks using MATLAB and Simulink.However, this research implemented the CAN component protocol using the Simulink network toolbox, which was then integrated into the AV simulation.We used MATLAB to implement and test the pretrained CNNs.We conducted the training and testing   procedure using a computer system with a Graphics processing unit (GPU) to reduce the computation time and increase the experiment's performance.Table 5 summarizes the software and hardware used in this research.
Figure 11 summarizes the overall steps of our experiments used in this research.In the beginning, the dataset was imported to the MATLAB namespace, and as mentioned, the dataset was stored in two folders: normal and anomaly.Then, the dataset was preprocessed.Therefore, all images were resized to fit the appropriate input size of each pre-trained network.Each pretrained model expected two class outcomes, anomaly and normal.The dataset was split into two parts: 70% of the images were used for training and 30% for testing and validation.The data size was 20,000 images; 10,000 was normal, and 10,000 was anomalous.A fivefold cross-validation technique was used to ensure the validity of our training process.Finally, the expected output of each model is either normal or anomalous.Normal means there is no attack and anomalous means there is an attack.

Result and evaluation
The detection outcomes of the eight pre-trained models used in this research were analyzed based on precision, recall, f1-score, and accuracy classification as follows [26]: (1) Precision = TP/(TP + FP)x100 (2) Recall = TP/(TP + FN )x100  TP is a true positive that calculates the number of normal images classified correctly.FP is a false positive that calculates the number of normal images classified incorrectly.FN refers to a false negative that calculates the number of abnormal images classified incorrectly.Finally, TN is a true negative that calculates the number of abnormal images classified correctly [54].Table 6 represents the calculation accuracy of the pre-trained models based on four metrics of accuracy analysis.While all pre-trained models have approximately comparable performance results, GoogLeNet has an outperformed performance compared with other pre-trained models.The precision parameter has an identical value of 100.00% because every pre-trained model could classify every "normal image" as normal.However, the value of the recall parameter range from 98.94% to 98.59% due to several "anomaly image" being classified as normal.F1 score depends on parameters precision and recall, as shown in Eq. 3, and accuracy classification was calculated using Eq. 4.
To validate our work, we compared the performance outcomes of this research with recent existent IDS methods used with AV systems.Overall, pre-trained CNNs such as research 1,6 and 7 scored better than others, such as ANN and Bayesian networks, as listed in Table 7.

Conclusions and remarks
In conclusion, this research proposed an intelligent intrusion detection system (IIDS) to detect cyberattacks targeting physical components of an AV through controller area network CAN.Firstly, the CAN was implemented and integrated into an AV simulation developed by MathWorks to apply the CPS concept.Therefore, the new system is called an autonomous vehicle cyber-physical system (AV-CPS).Secondly, the dataset was generated from the AV-CPS and preprocessed by converting signals into images to be fed to pre-trained CNNs.Thirdly, eight pre-trained networks were implemented: InceptionV3, ResNet-50, ShuffleNet, MobileNetV2, GoogLeNet, ResNet-18, SqueezeNet, and AlexNet, and the performance analysis of each network was discussed.Our experiment found that GoogLeNet performed best because it recorded 99.47% based on the F1-score parameter.This research's resilient security concept can be used and applied to any CPS framework because each component of the system (AV-CPS) was built based on block architecture.Therefore, each subsystem, i.e., controller, sensors, actuators, and communication nodes, is independent and can be a breakdown.For feature work, we recommend applying the architecture of our system (AV-CPS) with different CPS domains, such as smart grids and drones.

Limitations
Our work primarily focuses on developing an IIDS by employing a transfer learning approach, specifically using pre-trained neural networks.The IIDS developed in this research, which is expected to collaborate with the ACC, aims to detect the network traffic locally (within the AV) emanating from sensors assumed to be compromised by attackers.Therefore, the IIDS is designed to scan network traffic transferred by the CAN model that is entering the ACC model.The anomaly detection technique conducted by the IIDS is confined to the data read by sensors; however, cyberattacks may be executed through various other components, a topic warranting future research.Additionally, researchers might consider designing an intelligent intrusion prevention system (IIPS) to mitigate the effects of cyberattacks.

Algorithm 1
Convert 1-D data to a 2-D image

Figure 7
Figure7highlights the process of preparing images for each pre-trained neural network model used in this research.In the beginning, the images are imported into MATLAB.The original size of the images is 4 × 81, but to ensure compatibility with each model, each image is resized to match the input size of the specific model.This resizing is performed by MATLAB using the augmented-ImageDatastore function.Figure8illustrates the process of the proposed IIDS.After the data is received from the CAN protocol, it will be sent to the IIDS for scanning the traffic.The 1-dimensional data is converted to an image and transmitted to the pre-trained CNN model.The CNN is responsible for detecting whether there is an attack or not.

Fig. 11
Fig. 11 Overall steps of the experiment

Table 1
A very short summary of the key points in the related works

Table 2
CAN communication component CAN Pack Signals should be packed within a CAN message CAN Transmit Send a CAN message to a particular CAN device CAN Configuration Configure parameters of particular CAN devices for sending/receiving messages CAN Receive Receive CAN messages from the particular CAN device CAN Unpack Unpacked CAN message to signals Fig. 3 CAN send and receive messages

Table 4
Summary of pre-trained CNN models used in this research

Table 5
Summary of components used to perform the experiments

Table 6
Accuracy performance outcomes