Advances, Systems and Applications
| # | Objectives | Methodology/Algorithms | Experiments | Findings | Applications | Limitations |
|---|---|---|---|---|---|---|
| [15] | To develop a network-aware scheduling strategy for container-based applications in Smart City deployments. | A network-aware scheduling approach is proposed and implemented as an extension of the default Kubernetes scheduler. | Evaluated with container-based Smart City applications and validated on the Kubernetes platform. | Network latency is reduced by 80% compared to the default scheduling mechanism. | Can be used in Fog Computing environments for delay-sensitive and data-intensive services. | Further testing and deployment are needed to reveal limitations and guide future improvements. |
| [16] | Cost-aware container scheduling in the public cloud. | Stratus, a cluster scheduler specialized in orchestrating batch job execution on virtual clusters. | Simulation experiments driven by cluster workload traces from Google and TwoSigma. | Stratus reduces virtual cluster scheduling costs by 17–44% compared to state-of-the-art approaches. | Batch job execution on virtual clusters in the public cloud. | Limited to the context of batch job execution on virtual clusters in the public cloud. |
| [17] | To improve performance and ensure fairness among users of a shared cluster offering interchangeable hardware resources for deep learning frameworks. | Min-cost bipartite matching (AlloX). | Large-scale simulations and evaluations on a small-scale CPU-GPU hybrid cluster. | AlloX can drastically shorten average job completion time, eliminate starvation, and ensure fairness. | Scheduling jobs over interchangeable resources in a shared cluster. | Designed for two interchangeable resource types; going beyond two may cause many problems. |
| [18] | To optimize the initial placement of containers via task packing, adjust cluster size to changing workloads through autoscaling algorithms, and develop a rescheduling mechanism that shuts down underutilized VM instances for cost savings while preserving task progress. | Heterogeneous job configurations; autoscaling algorithms; rescheduling mechanism. | Validated using the Australian National Cloud Infrastructure (Nectar). | Compared to the standard Kubernetes framework, the proposed solution lowers overall costs by 23% to 32% for various cloud workload patterns. | Low-cost container orchestration on Kubernetes-powered cloud computing infrastructures. | Future work could also take VM types into consideration. |
| [19] | To develop a GPU-aware resource orchestration layer for datacenters; to improve resource utilization and reduce operational costs; to improve Quality of Service (QoS) for user-facing queries. | Presented Kube-Knots, a GPU-aware resource orchestration layer integrated with Kubernetes. Kube-Knots harvests spare computation cycles through dynamic container orchestration. Two GPU-based scheduling methods (CBP and PP) are built on Kube-Knots to schedule workloads at datacenter scale. | Evaluated CBP and PP on a ten-node GPU cluster; compared results with state-of-the-art schedulers. | For HPC applications, CBP and PP increase cluster-wide GPU utilization by up to 80% on average. Average job completion of deep learning workloads improved by up to 36%, with a 33% cluster-wide energy reduction. For latency-critical queries, PP ensures end-to-end QoS, lowering QoS violations by up to 53%. | To improve resource utilization and reduce operational costs in GPU-based datacenters. | – |
| [20] | To improve data center efficiency through holistic scheduling in Kubernetes that considers virtual and physical infrastructures as well as business processes. | Replaced the default Kubernetes scheduler with the proposed holistic scheduling framework, which takes both software and hardware models into account. | Deployed in a real data center. | Reductions in power consumption of 10% to 20% were observed; an intelligent scheduler can significantly increase data center effectiveness. | To improve the efficiency of data centers through software-based solutions. | Further research is needed in this area. |
| [21] | To introduce a new Kubernetes container scheduling strategy (KCSS) that improves scheduling efficiency for large numbers of containers submitted online. | Selects the best node for each newly submitted container based on multiple criteria, combining all criteria into a single rank with the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). | Conducted experiments on different scenarios using data from the cloud infrastructure and user needs. | KCSS improves performance compared to other container scheduling algorithms. | Can be used in industrial and academic container-orchestration systems. | Limited to the six key criteria used in the experiments; the criteria could be expanded to improve performance in future work. |
| [22] | To present a topology-based Kubernetes GPU scheduling mechanism that increases resource efficiency and load balancing in the GPU cluster. | Builds on the established Kubernetes GPU scheduling mechanism. The topology of the GPU cluster is reconstructed as a resource access cost tree, which is used to schedule and adapt to various GPU resource application scenarios. | GaiaGPU has been used by Tencent in actual production. | Improved resource utilization by about 10% and improved load balance. | Used in production at Tencent. | – |
| [23] | To develop a context-aware Kubernetes scheduler that takes physical, operational, and network parameters into account in order to improve service availability and performance in 5G edge computing. | Integrates real-time edge device data into the scheduler's decision algorithm. | Compared with the default Kubernetes scheduler. | The proposed scheduler offers increased fault tolerance along with advanced orchestration and management capabilities. | 5G edge computing. | – |
| [24] | To develop a policy-driven meta-scheduler for Kubernetes clusters that enables efficient and fair resource allocation for multiple users. | Dominant Resource Fairness (DRF) policy; additional fairness metrics based on task resource demand and average waiting time. | – | The proposed meta-scheduler improves fairness in multi-tenant Kubernetes clusters. | Kubernetes clusters. | – |
| [25] | To modify Kubernetes to better suit edge infrastructure, with a focus on network latency and self-healing capabilities. | Custom Kubernetes scheduler that considers applications' delay constraints and edge reliability. | – | The modified Kubernetes is better suited for edge infrastructure. | Edge computing. | – |
| [26] | To improve Kubernetes scheduling for performance-sensitive containerized workflows, particularly 5G edge applications. | NetMARKS, a novel approach to Kubernetes pod scheduling that makes use of dynamic network metrics collected with Istio Service Mesh. | Validated using different workloads and processing layouts. | NetMARKS can save up to 50% of inter-node bandwidth while reducing application response time by up to 37%. | Kubernetes in 5G edge computing and machine-to-machine communication. | – |
| [27] | To create a feedback-control approach for elastic container provisioning of Web systems on Kubernetes. | Combines a linear model with a varying-processing-rate queuing model to reduce output error. | Evaluated on a real Kubernetes cluster. | Compared to state-of-the-art algorithms, the proposed approach achieves the lowest percentage of SLA violations and the second-lowest cost. | Elastic container provisioning in Kubernetes-based systems. | – |
| [28] | To create a dynamic Kubernetes scheduler that helps a heterogeneous cluster deploy Docker containers more effectively, using historical container execution data to speed up task completion. | Developed the KubCG dynamic scheduling platform, with a new scheduler that takes into account historical container execution data as well as the schedule of Kubernetes Pods. | Conducted different tests to validate the new algorithm. | In experiments, KubCG reduced task completion time to as little as 64% of the original. | Deployment of cloud-based services that require GPUs, such as deep learning and video processing. | Further testing and validation are needed to determine the algorithm's effectiveness in a wider variety of scenarios. |
| [29] | To describe a new approach to scheduling workloads in a Kubernetes cluster. | Hybrid shared-state scheduling framework model; scheduling decisions are based on the cluster's overall state. | Tested the proposed scheduler's behavior under different scenarios, including failover/recovery, in a deployed Kubernetes cluster. | The proposed scheduler handles situations such as priority preemption and collocation interference, and combines features of both centralized and distributed scheduling frameworks. | Used in Kubernetes clusters to optimize resource utilization. | Further testing and implementation are needed to fully evaluate the proposed scheduler's effectiveness. |
| [30] | To design and implement KubeHICE, a container orchestrator for heterogeneous-ISA cloud-edge platforms, and to assess its efficiency and performance in handling heterogeneous-ISA clusters. | KubeHICE extends open-source Kubernetes with AIM and PAS. AIM automatically locates a node suited to the ISAs the containerized application supports; PAS schedules containers based on the computational capacity of cluster nodes. | KubeHICE was tested in several real-world scenarios. | KubeHICE is effective in performance estimation and resource scheduling while adding no overhead to container orchestration, and can improve CPU utilization by up to 40% when handling heterogeneity. | Containerized applications on heterogeneous cloud-edge platforms. | – |
| [31] | To make the Kubernetes scheduler more efficient by taking disk I/O load into account. | Proposed the Balanced-Disk-IO-Priority (BDI) dynamic scheduling algorithm to improve disk I/O balance between nodes, and the Balanced-CPU-Disk-IO-Priority (BCDI) algorithm to address unbalanced CPU and disk I/O load on a single node. | Experiments compared BDI and BCDI with the default Kubernetes scheduling algorithms. | BDI and BCDI outperform the default Kubernetes scheduling algorithms: they resolve the CPU/disk I/O load imbalance on a single node and enhance disk I/O balance between nodes. | Can be used to improve the performance of Kubernetes in managing containerized applications. | Further research may be needed to optimize the BDI and BCDI algorithms and evaluate them in different scenarios. |
| [32] | To investigate how Serverless frameworks built on Kubernetes can schedule pods more efficiently for large-scale concurrent applications. | A scheduling approach leveraging concurrent scheduling of the same pod is proposed to maximize the effectiveness of pod scheduling in Serverless cloud paradigms. | Preliminary verification of the proposed algorithm's effectiveness. | The proposed approach can significantly cut pod startup time while maintaining resource balance on each node. | Improving the efficiency of pod scheduling in Serverless cloud paradigms. | Effectiveness is verified only through preliminary experiments, and the algorithm applies only to Serverless frameworks. |
| [33] | To present a Kubernetes scheduler extension and resource rescheduling mechanism that incorporate QoE metrics into SLOs. | Uses the QoE metric proposed in the ITU-T P.1203 standard. | Evaluated the architecture using video streaming services co-located with other services. | The scheduler extension increases average QoE by 50%; resource rescheduling raises it by 135%. The proposed architecture completely removes over-provisioning. | Improving QoE in cloud environments. | Limited to the specific QoE metric used; further research is needed to evaluate the architecture with other QoE metrics. |
| [34] | To enable safe colocation of best-effort jobs and latency-sensitive services in Kubernetes clusters to increase resource utilization; to flexibly divide resources among workload categories; to improve hardware and software isolation for containers. | Zeus, built on Kubernetes extension mechanisms, schedules best-effort jobs based on actual server utilization and improves container isolation by coordinating hardware and software isolation mechanisms. | Zeus is assessed in a large-scale production setting using latency-sensitive services and best-effort jobs. | Zeus can increase average CPU utilization from 15% to 60% without violating SLOs, significantly improving how efficiently Kubernetes clusters use their resources. | Can be used to improve the resource utilization of Kubernetes clusters. | – |
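Several of the surveyed schedulers rank candidate nodes by fusing multiple criteria into one score; KCSS [21] does this with TOPSIS. The sketch below illustrates the generic TOPSIS procedure only, not the paper's actual implementation: the node names, the three criteria (free CPU, free RAM, network latency), and the weights are hypothetical placeholders, and KCSS itself uses six criteria.

```python
import math

def topsis_rank(nodes, weights, benefit):
    """Rank candidate nodes with TOPSIS.

    nodes:   dict of node name -> list of criterion values
    weights: per-criterion weights (should sum to 1)
    benefit: True where higher is better, False where lower is better
    """
    names = list(nodes)
    matrix = [nodes[n] for n in names]
    m = len(weights)
    # Vector-normalize each criterion column, then apply weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(m)]
    v = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    # Ideal (best) and anti-ideal (worst) points per criterion.
    best = [max(col) if benefit[j] else min(col)
            for j, col in enumerate(zip(*v))]
    worst = [min(col) if benefit[j] else max(col)
             for j, col in enumerate(zip(*v))]
    # Closeness to the ideal: higher score = better node.
    scores = {}
    for name, row in zip(names, v):
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores[name] = d_worst / (d_best + d_worst)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical node metrics: [free CPU %, free RAM %, network latency ms]
nodes = {
    "node-a": [60, 70, 12],
    "node-b": [80, 40, 5],
    "node-c": [30, 90, 30],
}
ranking = topsis_rank(nodes, weights=[0.4, 0.3, 0.3],
                      benefit=[True, True, False])
print(ranking)  # best node first; the scheduler would pick ranking[0]
```

The key property TOPSIS provides here is that heterogeneous criteria (utilization percentages, latencies) are collapsed into a single comparable rank without requiring all criteria to share units.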
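The meta-scheduler in [24] builds on Dominant Resource Fairness (DRF). The core DRF rule is: give the next task to the user whose dominant share (their largest fraction of any one cluster resource) is currently smallest. A minimal sketch of that rule, with a hypothetical two-resource cluster and per-user demands not taken from the paper:

```python
def drf_pick(users, capacity):
    """Pick the user who should receive the next task under DRF.

    users:    name -> {"demand": per-task resource vector,
                       "alloc":  resources already allocated}
    capacity: total cluster resources, same order as the vectors
    """
    def dominant_share(alloc):
        # Largest fraction of any single resource this user holds.
        return max(a / c for a, c in zip(alloc, capacity))

    # Only consider users whose next task still fits in the cluster.
    eligible = [
        name for name, u in users.items()
        if all(a + d <= c
               for a, d, c in zip(u["alloc"], u["demand"], capacity))
    ]
    if not eligible:
        return None
    # Schedule the user with the smallest dominant share.
    return min(eligible, key=lambda n: dominant_share(users[n]["alloc"]))

# Hypothetical two-tenant cluster: [CPUs, GiB RAM]
capacity = [9, 18]
users = {
    "alice": {"demand": [1, 4], "alloc": [2, 8]},  # dominant share 8/18
    "bob":   {"demand": [3, 1], "alloc": [3, 1]},  # dominant share 3/9
}
print(drf_pick(users, capacity))  # bob has the smaller dominant share
```

Note that the meta-scheduler in [24] layers further fairness metrics (task resource demand, average waiting time) on top of this basic rule; those are not modeled here.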