 Research
 Open Access
A price-aware congestion control protocol for cloud services
Journal of Cloud Computing volume 10, Article number: 55 (2021)
Abstract
In current infrastructure-as-a-service (IaaS) cloud services, customers are charged for the usage of computing/storage resources only, but not for the network resources. The difficulty lies in the fact that it is nontrivial to allocate network resources to individual customers effectively, especially for short-lived flows, in terms of both performance and cost, due to the highly dynamic environment created by the flows of all customers. To tackle this challenge, in this paper, we propose an end-to-end Price-Aware Congestion Control Protocol (PACCP) for cloud services. PACCP is a network utility maximization (NUM) based optimal congestion control protocol. It supports three different classes of services (CoSes), i.e., best effort service (BE), differentiated service (DS), and minimum rate guaranteed (MRG) service. In PACCP, the desired CoS or rate allocation for a given flow is enabled by properly setting a pair of control parameters, i.e., a minimum guaranteed rate and a utility weight, which in turn determines the price paid by the user of the flow. Two pricing models, i.e., a coarse-grained VM-Based Pricing model (VBP) and a fine-grained Flow-Based Pricing model (FBP), are proposed. The optimality of PACCP is verified by both large-scale simulation and a small testbed implementation. The price-performance consistency of PACCP is evaluated using real datacenter workloads. The results demonstrate that PACCP provides minimum rate guarantees, high bandwidth utilization and fair rate allocation, commensurate with the pricing models.
Introduction
An infrastructure-as-a-service (IaaS) cloud, such as Amazon EC2 and Alibaba Cloud, provides scalable, pay-as-you-go computing resources to its customers. However, to date, customers using such cloud services are charged based on the usage of computing-related resources only, e.g., various instances of virtual machines (VMs) with different reserved CPU and memory resources. This, however, is inadequate, as paying for a given VM instance provides no assurance of performance for a flow emitted from that instance. Measurements of bandwidth usage for VMs in public cloud platforms show that network performance is independent of the price of the VM [1]: VMs with the same price may have very different network performance, and a cheaper VM can have better network performance than an expensive one. The root cause of the status quo is that the network bandwidth is shared in a highly dynamic environment by flows emitted from all VM instances; hence, it is difficult to provide quantifiable flow rate allocation in a cost-effective fashion so that an effective pricing structure can be built around it. A direct consequence is that a customer may experience poor performance, especially at high network utilization, incommensurate with the price the customer has paid for the use of the computing resources [1]. Hence, a pricing model based only on computing resource usage may not be suitable for tenants who need guaranteed network performance. In fact, such tenants would be willing to pay a higher price to receive guaranteed network performance, to attract their own customers and increase their business profits.
To tackle the above challenges, network pricing solutions based on explicit bandwidth reservation have been proposed [2–14]. The price is usually dynamically generated either through an auction process [3, 5, 7, 8, 10–13] or a time-varying price table [2, 4, 6, 9, 14], adjusted based on current and/or historical statistics. However, these pricing solutions are only effective for long-lived flows, such as video on demand [15], not for the popular user-facing datacenter applications [16–20], which usually have small flow sizes and short durations.
User-facing datacenter applications, such as Web search [17] and social networking [19], are usually associated with a stringent tail-latency service level objective (SLO) [21]. Moreover, a job for such an application generally involves one or multiple stages of parallel task processing by (many) instances, which generate bursts of (massive) numbers of flows emitted from those instances. Such flows are usually short-lived, with sizes of less than 1 Mbytes [22, 23] and with tight flow completion time budgets, e.g., a few milliseconds, to meet a prescribed tail-latency SLO.
The pricing solutions mentioned earlier are based on centralized bandwidth reservation, which is either pre-configured or flow-driven, neither of which, however, can deal with the above workload effectively. On one hand, pre-configured bandwidth reservation, which allocates bandwidth for prospective flows in advance, is not scalable and cannot handle bursts of massive numbers of short-lived flows. Moreover, without knowing the flow start times and flow sizes, this approach may lead to either over- or under-provisioning of resources, causing violation of SLOs or low resource utilization, respectively. On the other hand, flow-driven bandwidth reservation, which reserves bandwidth upon a flow arrival, is generally too slow, due to centralized control, to meet the tight flow response time budgets of such flows, and it incurs excessive processing and communication overheads. Moreover, these solutions require significant modification/upgrading of core network switches, incurring high costs. Although a price-aware distributed scheduling solution, known as SoftBW [1], has been proposed to allow scalable bandwidth reservation with flow rate guarantees, it only works at very low network loads (less than 30%), as it reserves bandwidth for individual VM instances at each host port only, assuming that the datacenter network is congestion free. Any retransmission can result in a flow rate allocation inappropriate for the paid price.
A desirable solution for sharing network bandwidth in cloud services should meet the following widely accepted requirements [2]. First, it should be easy to implement: the solution should not require significant hardware/software modification in current datacenters. Second, it should be scalable and maintain high network bandwidth utilization. Third, it should provide different prices to meet different tenant demands. For example, some tenants need to meet strict flow deadlines, and hence may ask for minimum rate guaranteed services, while others may expect short flow completion times without strict deadline requirements, and hence may ask for differentiated services. Fourth and last, the rate allocation should be proportional to the price paid by a tenant.
To overcome the above shortcomings of the existing network pricing solutions and meet the widely accepted requirements for a desirable solution, in this paper, we propose a Price-Aware Congestion Control Protocol (PACCP) for IaaS cloud services. PACCP is a network utility maximization (NUM) based optimal congestion control protocol. It supports three different classes of services (CoSes), i.e., best effort service (BE), differentiated service (DS) and minimum rate guaranteed (MRG) service. The three types of services are enabled by properly setting the values of a pair of parameters, i.e., a minimum guaranteed rate and a utility weight, which are, in turn, determined by the flow price paid for the services.
In this paper, we propose two pricing models, i.e., a coarse-grained VM-Based Pricing model (VBP) ^{Footnote 1} and a fine-grained Flow-Based Pricing model (FBP). A tenant pays a price to buy a desired service, which is then mapped to given values of the pair of parameters in PACCP. PACCP possesses the following salient features:

It is an optimal solution in terms of network utility maximization (NUM). It uses the TCP utility function and hence is a TCP-friendly protocol.

It meets the widely accepted requirements for datacenter price-based rate allocation solutions mentioned above [2], i.e., easy implementation; providing minimum rate guarantees; achieving high network utilization; and allocating flow rates in proportion to the paid prices;

To the best of our knowledge, it is the first solution that seamlessly integrates pricing models with end-to-end congestion control protocols. Hence, it is highly scalable and can deal with bursts of unlimited numbers of short-lived flows. It allows flows to fully utilize all available bandwidth, thus improving bandwidth utilization. Moreover, it allows adjustment of pricing at runtime, adapting to resource demand changes and/or network load changes;

It only requires a software upgrade in end hosts and does not need any change in core network switches; hence, it is readily deployable in today’s datacenters.
The optimality and the price-performance consistency of PACCP are verified and evaluated by large-scale simulations as well as a small testbed implementation. The results demonstrate that PACCP can indeed provide soft minimum rate guarantees, high network utilization and rate allocation proportional to the prices paid, hence meeting all the requirements for datacenter network pricing solutions. A preliminary version of this paper appeared in [24].
Network utility maximization based rate allocation
Assume that a network has n active flows and U_{i}(x_{i}) is the user utility function of flow rate x_{i} for flow i (i=1,2,...,n). Then NUM is defined as the following,

\(V=\max_{x}\sum_{i=1}^{n}U_{i}(x_{i}),\)  (1)

subject to the link bandwidth constraint,

\(\sum_{i\in F(l)}x_{i}\le B_{l},\ \forall l,\)  (2)

where B_{l} is the link bandwidth for link l and F(l) is the set of flows traversing link l, and the flow rate constraint,

\(x_{i}\ge \theta_{i},\ i=1,2,...,n,\)  (3)

where θ_{i} is the required minimum flow rate for flow i; it is 0 if the flow has no minimum rate requirement.
The goal of NUM is to find distributed flow rate control laws that lead to a flow rate allocation, x={x_{1},x_{2},...,x_{n}}, at which the global design objective V is attained, or the collective user satisfaction with the services is maximized, as user utilities are meant to characterize to what degree users are satisfied with the services they receive. Different applications may have different user utility functions [25]. Clearly, the relative user utilities of the flows determine the rate allocation x, provided that the flow rate constraints are satisfied. In other words, in NUM, the fairness criterion is uniquely determined by the relative user utilities of the flows. While minimum flow rate requirements can be easily enforced as flow rate constraints in NUM, it is nontrivial to enable flexible and quantifiable fairness criteria among flows with different user utility functions, as traditional NUM usually works on a single user utility function only. Our recently developed NUM solution, called HOLENT NUM [26], can deal with different user utility functions by setting different utility weights on a base user utility function. We apply a similar idea, using the flow price to set the flow weight. In what follows, we propose our solution based on price-aware weighted user utilities.
We consider the following weighted utility function, i.e., U_{i}(x_{i})=ω_{i}U_{0}(x_{i}), where U_{0} is a base utility function shared by all the flows and ω_{i} is the weight of flow i (i=1,2,...,n). To be backward compatible with and friendly to TCP flows, we use the TCP utility function (U_{TCP}), which is concave, as the base utility function. The utility function for TCP Reno is derived in [27, 28] and is given as follows. In the slow start phase (SSP),

\(U_{TCP}(x)=x\log(1+\alpha/\beta),\)  (4)

and, in the congestion avoidance phase (CAP),

\(U_{TCP}(x)=\int_{0}^{x}\log\left(1+\frac{\mu}{\beta y}\right)dy,\)  (5)

where αx and βx are the multiplicative increase and decrease rates, respectively. To match SSP in TCP Reno, where the flow rate is doubled/halved every round trip time (RTT), we have α=2β=1/RTT, by approximating the increase and decrease rates to be constant within an RTT interval. μ is the additive-increase rate (i.e., the rate of one packet per RTT) in CAP. With this TCP utility, NUM can now be rewritten as,

\(V=\max_{x}\sum_{i=1}^{n}\omega_{i}U_{TCP}(x_{i}),\)  (6)
subject to the link bandwidth and minimum flow rate constraints.
Now the idea is to enable flexible fair flow rate allocation through weight assignment. Specifically, consider flows with different weights sharing a bottleneck link l. Define a Lagrangian function,

\(L(x)=\sum_{i=1}^{n}\omega_{i}U_{TCP}(x_{i})-\lambda\left(\sum_{i=1}^{n}x_{i}-B_{l}\right).\)  (7)

With the Lagrangian multiplier technique [29], the optimal solution satisfies ∇_{x}L(x)=0. Considering two flows i and j with weights ω_{i} and ω_{j}, respectively, among the n active flows, we have

\(\omega_{i}U'_{TCP}(x_{i})=\omega_{j}U'_{TCP}(x_{j})=\lambda.\)  (8)

From Eq. (8), we can easily show that the rate allocation has to meet the following condition^{Footnote 2},

\(\frac{x_{i}}{x_{j}}\approx\frac{\omega_{i}}{\omega_{j}},\)  (9)

for any pair of flows i and j bottlenecked at this link. Here we assume βx≫μ, i.e., the multiplicative decrease rate (half of the flow rate) is much larger than the additive increase rate (the rate of one packet per RTT), and hence log(1+μ/βx)≈μ/βx. Eq. (9) clearly indicates that the allocated flow rate ratio is proportional to the utility weight ratio for any two flows sharing a bottleneck link. Hence, the relative flow rates allocated to different flows can easily be adapted to the fairness criterion underlying any given pricing model, through proper setting of the corresponding relative weights.
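The weight-proportional allocation implied by Eq. (9) can be checked numerically. The following Python sketch is ours, not part of the paper; μ, β, the link capacity and the function names are illustrative assumptions. It solves the weighted single-bottleneck NUM by bisecting on the Lagrange multiplier λ, using the CAP utility derivative U₀'(x)=log(1+μ/(βx)):

```python
import math

def optimal_rates(weights, B, mu=1.0, beta=0.5):
    """Maximize sum_i w_i*U0(x_i) subject to sum_i x_i = B on one link,
    with U0'(x) = log(1 + mu/(beta*x)). At the optimum every flow sees
    the same link price: w_i*U0'(x_i) = lam, hence
    x_i(lam) = mu / (beta*(exp(lam/w_i) - 1)), decreasing in lam."""
    def rate(lam, w):
        return mu / (beta * (math.exp(lam / w) - 1.0))
    lo, hi = 1e-9, 100.0           # bracket for the link price lam
    for _ in range(200):
        lam = 0.5 * (lo + hi)
        if sum(rate(lam, w) for w in weights) > B:
            lo = lam               # rates overshoot the link: raise the price
        else:
            hi = lam               # rates undershoot: lower the price
    return [rate(lam, w) for w in weights]

# Three flows with weights 1:2:4 sharing a 100-unit link; since
# beta*x >> mu here, the rates come out nearly proportional to the weights.
rates = optimal_rates([1.0, 2.0, 4.0], B=100.0)
```

With βx≫μ holding at these rates, the computed ratios are within a few percent of 1:2:4, as Eq. (9) predicts.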
NUM based optimal congestion control laws
A family of optimal congestion control laws for NUM with concave user utilities is derived by Su et al. [30], which underpins PACCP. We first introduce this family of optimal congestion control laws, and then apply it to the NUM problem in Eq. (6) to derive PACCP.
For simplicity, the subscript i for flow i is omitted hereafter. For any flow with a concave utility function U(x), the family of optimal congestion control laws that attain V is,

\(\dot{x}=z(x,t,cg)\,g(x,cg),\)  (10)

with

\(g(x,cg)=\bar{cg}\,r(x)\left(e^{U'(x)}-1\right)-cg,\)  (11)

where z(x,t,cg) can be any piecewise continuous positive scalar function, resulting in an unlimited number of possible control laws in the family; it can be constructed to contain path congestion information such as RTT and explicit congestion notification (ECN); cg is the path congestion indicator, taking value 1 if the path is congested, and 0 otherwise; \(\bar {cg}\) is the logical negation of cg; and r(x) is a scalar parameter associated with the minimum rate requirement. Assume that a flow has a minimum rate requirement θ, i.e., x≥θ. Then r(x) is given as,

\(r(x)=\begin{cases}r_{cos}, & x<\theta,\\ 1, & x\ge\theta,\end{cases}\)  (12)

where r_{cos}>1 is a design parameter; we discuss how to select its value in a later section.
For example, it can easily be shown that by applying the above family of optimal congestion control laws to the TCP utility in Eqs. (4) and (5), and letting z(.)=z_{TCP}(x,t,cg), where

\(z_{TCP}(x,t,cg)=\beta x,\)  (13)

we arrive at the TCP Reno congestion control law [27].
Now, applying the above family of optimal congestion control laws to the NUM problem given in Eq. (6) (i.e., maximizing the total user utility for users with utility function U(x)=ωU_{TCP}(x)), we arrive at PACCP as follows. In the SSP,

\(\dot{x}=\bar{cg}\,r(x)\,\beta x\left((1+\alpha/\beta)^{\omega}-1\right)-cg\,\beta x,\)  (14)

and in the CAP,

\(\dot{x}=\bar{cg}\,r(x)\,\beta x\left(\left(1+\frac{\mu}{\beta x}\right)^{\omega}-1\right)-cg\,\beta x.\)  (15)
Clearly, flow rate allocation is determined by r(x) (or θ) and ω, a pair of parameters that uniquely determine PACCP. Equations (14) and (15) define a family of congestion control laws that includes TCP Reno, with ω=1 and θ=0 (i.e., r(x)=1), and MRG [27], with ω=1 and θ≠0, as special cases.
To be backward compatible with TCP window-based congestion control, we translate the fluid-flow based control laws in Eqs. (14) and (15) into window-based congestion control protocols. In window-based control, the flow rate stays unchanged during each RTT interval. Hence, the congestion window can be calculated as W_{c}=xRTT/MSS, where MSS is the maximum segment size. The flow rate change from one RTT epoch to the next is Δx=\(\dot {x}\)RTT, where \(\dot {x}\) is the flow rate derivative given by the fluid-flow control law. Hence, the window size change ΔW_{c} at the end of every RTT epoch is calculated as ΔW_{c}=ΔxRTT/MSS. The minimum congestion window size W_{min} corresponding to a minimum guaranteed rate θ is given as,

\(W_{min}=\theta\,RTT/MSS.\)  (16)

For a flow without a minimum rate requirement, W_{min}=0 (i.e., θ=0).
Define CD1 as { cg=0 & W_{c}<W_{min}} and CD2 as { cg=0 & W_{c}≥W_{min}}. Now the window-based protocol for congestion window size (W_{c}) update (i.e., W_{c}=W_{c}+ΔW_{c}) can be approximated (by assuming βx≫μ) as follows. In the SSP,

\(\Delta W_{c}=\begin{cases}r_{cos}\left(3^{\omega}-1\right)W_{c}/2, & \text{CD1},\\ \left(3^{\omega}-1\right)W_{c}/2, & \text{CD2},\\ -W_{c}/2, & cg=1,\end{cases}\)  (17)

and in the CAP,

\(\Delta W_{c}=\begin{cases}r_{cos}\,\omega, & \text{CD1},\\ \omega, & \text{CD2},\\ -W_{c}/2, & cg=1,\end{cases}\)  (18)

where ΔW_{c} in Eq. (18) is in units of MSS and the factor 3=1+α/β follows from α=2β.
Note that TCP Reno is a special case of the above control protocol with r_{cos}=1 (i.e., θ=0) and ω=1. A flow under the control of PACCP receives a minimum rate guarantee and a quantifiable fair share of the additional bandwidth. PACCP supports three broad CoSes based on specific settings of the pair of parameters (θ, ω): the best effort (BE) service with (0, 1) (i.e., TCP Reno); the differentiated service (DS) with (0, ω>1); and the minimum rate guaranteed (MRG) service with (θ>0, ω≥1). The three CoSes can be enabled by simply passing the pair of control parameters, i.e., (θ, ω), into PACCP. Hence, pricing models tied to this pair of control parameters may be developed to charge users in proportion to the (relative) network bandwidths allocated to them. For datacenters with both TCP and UDP traffic, the two parameters can also be applied to DCCP congestion control protocols [31, 32] to provide the three types of services for UDP traffic. The only difference is that there is no retransmission for UDP traffic in case of congestion.
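As a concrete illustration, the per-RTT window update under CD1/CD2 can be sketched in a few lines of Python. This is our own simplified rendering, not the paper's implementation: the slow-start increase factor (3^ω−1)/2 and the CAP increase of ω packets per RTT follow from α=2β and the βx≫μ approximation, and θ is assumed to be in bits per second when computing W_{min}.

```python
def min_window(theta_bps, rtt_s, mss_bytes):
    """W_min = theta*RTT/MSS; theta in bits/s, MSS in bytes (assumed units)."""
    return theta_bps * rtt_s / (8.0 * mss_bytes)

def paccp_window_update(W_c, W_min, omega, r_cos, cg, slow_start):
    """One PACCP congestion-window update per RTT epoch (in MSS units).
    CD1 (cg=0, W_c <  W_min): accelerate toward the guaranteed rate by r_cos.
    CD2 (cg=0, W_c >= W_min): normal weighted increase.
    cg=1: halve the window, as in TCP Reno."""
    if cg:
        return max(W_c / 2.0, 1.0)        # multiplicative decrease
    r = r_cos if W_c < W_min else 1.0      # the rate scaler r(x)
    if slow_start:
        return W_c + r * (3 ** omega - 1) / 2.0 * W_c  # multiplicative increase
    return W_c + r * omega                 # additive increase (packets per RTT)

# With r_cos = 1, omega = 1 and W_min = 0 (theta = 0), this reduces to
# TCP Reno: the window doubles per RTT in slow start and grows by one
# packet per RTT in congestion avoidance.
```

The BE/DS/MRG CoSes then differ only in the (θ, ω) values fed to these two functions.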
Assigning different weights to different flows is also done in Seawall [47] and NetShare [48] to allocate flow rates with max-min fairness. However, unlike PACCP, the flow weight used in Seawall and NetShare is not incorporated into congestion control, making it difficult for them to achieve work conservation (i.e., high network utilization). Moreover, PACCP is an optimal solution maximizing the total user utility, while Seawall and NetShare are empirical designs.
Pricing model
Different applications may have different flow rate demands. An application can buy a type of service for its generated flows by paying a corresponding price. For example, an application requiring a minimum rate can pay a price to buy a minimum guaranteed rate θ; for an application without a minimum rate requirement, θ is set to 0, but its flows can be given a larger utility weight ω to obtain more bandwidth when competing with other flows. Hence, a flow with a larger ω pays more than a flow with a smaller ω. In the following subsections, we discuss how to set the pair of parameters based on the flow prices in PACCP to support the three CoSes. Two pricing models, a coarse-grained VM-Based Pricing model (VBP) and a fine-grained Flow-Based Pricing model (FBP), are proposed to support the three CoSes.
In VBP, a user paying for the usage of a VM instance also pays a network usage fee per unit time for an aggregated bandwidth determined by a given pair, (θ, ω). In this model, in principle, each VM instance can support more than one CoS, as long as \(\sum _{i}^{n_{v}}\theta _{i}\le \theta \) and \(\sum _{i}^{n_{v}} \omega _{i}\le \omega \) (it can easily be shown that both parameters are additive), where n_{v} is the number of active flows emitted from the instance and (θ_{i},ω_{i}) is the pair of control parameters for flow i, for i=1,...,n_{v}. Namely, the only requirement is that the network bandwidth taken by all the flows emitted from this instance be upper bounded by the aggregated bandwidth allocated to the instance. However, as VBP is meant to be designed as a coarse-grained, easily implementable pricing model, we limit the scope of VBP to the case where each VM instance supports only a single CoS, whether it is BE, DS, or MRG. Moreover, all the flows emitted from the instance gain an equal share of the network bandwidth, i.e., (θ_{i},ω_{i})=(θ/n_{v},ω/n_{v}), ∀i. VBP is a static pricing model, allowing the network bandwidth to be purchased as an integral part of a VM instance. However, a major drawback of VBP is that the aggregated bandwidth is statically pre-allocated and cannot be quickly adjusted to respond to network resource demand changes.
To address the above drawback of VBP, we also propose FBP. In this model, a customer pays an initial purchase fee for the default BE CoS as an integral part of a VM instance, and then pays for the DS and MRG CoSes on a per-flow basis, on demand. FBP also allows dynamic runtime service upgrading or downgrading by changing the pair of parameters and the corresponding price. FBP is more flexible than VBP, but is harder to implement and manage. In what follows, we propose pricing structures for the two models separately.
VBP: VM-Based pricing model
We propose to use the following pricing structure for VBP corresponding to the three CoSes.
BE CoS: This is the default CoS with (θ, ω)=(0, 1). The price, P_{BE}, for this CoS may be set at, P_{BE}=P_{0}, where P_{0} is a fixed basic price per unit time.
DS CoS: For this CoS, (θ,ω)=(0,ω>1). The price, P_{DS}, for this CoS can be modeled as P_{DS}=P_{0}+P_{1}(ω−1), where P_{1} is a price per unit time. As we will show in the next section, the achieved rate of a DS flow with ω>1 is usually consistently smaller than the optimal one (i.e., ω times the BE flow rate), as will be explained later. All our results suggest that the average rate of DS flows with ω=2 is about 1.6 to 1.7 times the BE flow rate at high network load, and hence P_{1} may be set at 0.6P_{0} to ensure that the flow rate is indeed proportional to the price paid.
MRG CoS: The price, P_{MRG}, for the MRG CoS can be formulated as P_{MRG}=P_{0}+P_{1}(ω−1)+P_{2}θ, where P_{2} is a price per unit of data, in association with the minimum guaranteed rate.
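The three VBP price formulas can be combined into one small helper. The Python sketch below is illustrative only: the concrete values of P0, P1 and P2 are placeholders (the paper leaves their tuning to future work), with P1 = 0.6·P0 reflecting the empirically observed ~1.6× rate of ω=2 DS flows:

```python
def price(cos, omega=1.0, theta=0.0, P0=1.0, P1=0.6, P2=0.1):
    """Per-unit-time VBP price for a CoS; the minimum rate theta is
    priced per unit of data via P2. P0/P1/P2 are placeholder values."""
    if cos == "BE":                 # (theta, omega) = (0, 1)
        return P0
    if cos == "DS":                 # (0, omega > 1)
        return P0 + P1 * (omega - 1)
    if cos == "MRG":                # (theta > 0, omega >= 1)
        return P0 + P1 * (omega - 1) + P2 * theta
    raise ValueError("unknown CoS: " + cos)

# A DS flow with omega = 2 costs 1.6x the BE price, mirroring the
# ~1.6x average rate it receives at high load.
```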
Clearly, with VBP, in addition to CPU speed and memory size, a CoS with a specific (θ,ω) pair can now be included in the price tag of a VM instance. For example, a user may want to purchase VM instances with the MRG CoS. Based on the application characteristics and an expected number of concurrently active flows, VM instances with a pair of parameters (θ,ω) that matches the demand may be purchased, making the performance of VM instances proportional to the prices paid.
FBP: Flow-Based pricing model
In FBP, a customer is charged upfront for the use of the BE CoS. However, to use the DS or MRG CoS, the customer incurs an additional charge, according to the specific values of the pair of parameters (θ, ω) chosen for the flow. The additional charge, P^{s}, may follow a pricing structure similar to that of the MRG CoS in VBP, i.e., \(P^{s}=P^{s}_{0}+P^{s}_{1}(\omega-1)+P^{s}_{2}\theta \). Here \(P^{s}_{0}\) and \(P^{s}_{1}\) are prices per unit time and \(P^{s}_{2}\) is a price per unit of data sent.
Since our focus in this paper is on price-versus-performance consistency, how to determine the parameters in the pricing structures, i.e., \(P_{0} (P^{s}_{0})\), \(P_{1} (P^{s}_{1})\) and \(P_{2} (P^{s}_{2})\), to balance profit and user satisfaction is left for future investigation.
Rate allocation examples
Now we use the simple network topology presented in Fig. 1^{Footnote 3} to illustrate how PACCP allocates flow rates under the two pricing models. Data flows are sent between three source-destination pairs, i.e., {VM _{i}, VM _{i+3}} (i=1, 2 and 3), through a shared bottleneck link with bandwidth B. Assume that the propagation delays are the same for all the flows. In what follows, we discuss how the rates are allocated under the VBP model and the FBP model, respectively.
VBP: Assume that VM _{i} (i=1, 2 and 3) has n_{i} flows sending to VM _{i+3}, respectively. We first examine the case where all flows are of the BE CoS. In this case, the optimal rate allocation is such that each flow emitted from VM _{i} receives B/(3n_{i}) bandwidth. Specifically, the shared bandwidth is first equally allocated to the three VMs, B/3 each, which is then further equally allocated to each flow in a VM. For example, let n_{1}=1, n_{2}=2, and n_{3}=3. Then a BE flow from VM1, VM2, or VM3 is allocated B/3, B/6 or B/9 bandwidth, respectively.
Now we consider the case where there are two types of VMs running either BE or DS flows. Specifically, assume that both VM1 and VM2 run BE flows, and VM3 has DS flows with the pair of parameters (0, ω). In this case, the optimal rate allocation is such that each BE flow from VM1 and VM2 is allocated the bandwidth of B/((2+ω)n_{1}) and B/((2+ω)n_{2}), respectively, and each DS flow gets ωB/((2+ω)n_{3}). For example, suppose that n_{1}=1, n_{2}= n_{3}=2 and ω=2, then each BE flow from VM1 and VM2 is allocated B/4 and B/8, respectively, and each DS flow from VM3 gets B/4.
Finally, we examine the case with all three CoSes present. Specifically, assume that VM1, VM2, and VM3 have BE flows, DS flows with (0,ω_{1}), and MRG flows with (θ,ω_{2}), respectively. The optimal rate allocation is then to first allocate rate θ to VM3, and then proportionally allocate the remaining bandwidth to the three VMs to maximize the total utility. If θ=0, VM3 would be allocated a bandwidth of Bω_{2}/(1+ω_{1}+ω_{2}); otherwise, it receives at least θ. As a result, VM3 gets B_{3}=max{θ,Bω_{2}/(1+ω_{1}+ω_{2})}, and each MRG flow gets B_{3}/n_{3}, assuming that the minimum guaranteed rate is evenly divided among the MRG flows. The remaining bandwidth B_{BD}=B−B_{3} is then allocated to VM1 and VM2, which receive B_{1}=B_{BD}/(1+ω_{1}) and B_{2}=ω_{1}B_{BD}/(1+ω_{1}), respectively, with each BE flow getting B_{1}/n_{1} and each DS flow getting B_{2}/n_{2}.
Let us look at an example, assuming that n_{1}=1, n_{2}=n_{3}=2, ω_{1}=2 and ω_{2}=1. We first let θ=B/2. In this case, the optimal rates allocated to the three VMs are B/6, B/3 and B/2, respectively; i.e., B/6 for the BE flow in VM1, B/6 for a DS flow in VM2 and B/4 for an MRG flow in VM3. Now assume that θ=B/10. Then the optimal rate allocations to the three VMs are B/4, B/2 and B/4, respectively. Hence, the BE flow in VM1 gets B/4 bandwidth, each DS flow in VM2 gets B/4 bandwidth and each MRG flow in VM3 gets B/8 bandwidth.
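The three-VM allocations above are easy to verify programmatically. The sketch below is ours (the function name and the concrete value B = 36 are illustrative); it implements the max/remaining-bandwidth rule stated in the text:

```python
def vbp_allocate(B, omega1, omega2, theta, n1, n2, n3):
    """Per-flow rates for VM1 (BE), VM2 (DS, weight omega1) and
    VM3 (MRG, weight omega2, minimum rate theta) on one bottleneck."""
    B3 = max(theta, B * omega2 / (1 + omega1 + omega2))  # MRG VM served first
    B_bd = B - B3                                        # leftover bandwidth
    B1 = B_bd / (1 + omega1)                             # BE VM share
    B2 = omega1 * B_bd / (1 + omega1)                    # DS VM share
    return B1 / n1, B2 / n2, B3 / n3                     # per-flow rates

# theta = B/2 gives per-flow rates (B/6, B/6, B/4); theta = B/10 gives
# (B/4, B/4, B/8), matching the example in the text (here with B = 36).
rates_high = vbp_allocate(36.0, omega1=2, omega2=1, theta=18.0, n1=1, n2=2, n3=2)
rates_low = vbp_allocate(36.0, omega1=2, omega2=1, theta=3.6, n1=1, n2=2, n3=2)
```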
FBP: For this model, the optimal flow rate allocation for each flow is independent of the VM instance the flow originates from. More specifically, assume that there are \(n^{BE}_{i}\), \(n^{DS}_{i}\) and \(n^{MRG}_{i}\) BE, DS and MRG flows, respectively, emitted from VM _{i} (i=1, 2, and 3). Also assume that the pairs of parameters for all the flows belonging to the same CoS are the same; namely, for BE, DS and MRG flows, they are (0,1), (0,ω_{1}) and (θ,ω_{2}), respectively. Also let \(n^{BE}=\sum ^{3}_{i=1}n^{BE}_{i}\), \(n^{DS}=\sum ^{3}_{i=1}n^{DS}_{i}\) and \(n^{MRG}=\sum ^{3}_{i=1}n^{MRG}_{i}\). Then, the optimal flow rate allocation depends only on n^{BE}, n^{DS}, n^{MRG} and the CoSes, not on which VMs the flows come from.
The optimal rate allocation first satisfies the minimum rate, θ, for all n^{MRG} MRG flows, and then allocates the remaining bandwidth to BE, DS and MRG flows in proportion to their weight values. Specifically, an MRG flow gets B_{MRG}=max(θ,Bω_{2}/(n^{BE}+ω_{1}n^{DS}+ω_{2}n^{MRG})). The remaining bandwidth B_{BD}=B−n^{MRG}B_{MRG} is then allocated to the BE and DS flows, with each BE flow getting B_{BD}/(n^{BE}+ω_{1}n^{DS}) and each DS flow getting ω_{1}B_{BD}/(n^{BE}+ω_{1}n^{DS}).
For example, assume that \(n^{BE}_{i}\)=\(n^{DS}_{i}\)=\(n^{MRG}_{i}\)=1 (i=1, 2 and 3) and ω_{1}=ω_{2}=2. We first assume that θ=B/6. In this case, B_{MRG}=max(B/6,2B/15)=B/6 and B_{BD}=B/2, so the optimal rate allocation is B/18 for a BE flow, B/9 for a DS flow and B/6 for an MRG flow. Now assume that θ=B/20. In this case, B_{MRG}=max(B/20,2B/15)=2B/15 and B_{BD}=3B/5, so the optimal rate allocations are B/15 for a BE flow, 2B/15 for a DS flow and 2B/15 for an MRG flow.
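The FBP rule lends itself to the same kind of numerical check. Again, the sketch is ours (names and the concrete value B = 90 are illustrative):

```python
def fbp_allocate(B, n_be, n_ds, n_mrg, omega1, omega2, theta):
    """Per-flow optimal rates (BE, DS, MRG) under FBP: MRG flows get
    max(theta, weighted share); BE/DS flows split the rest by weight."""
    b_mrg = max(theta, B * omega2 / (n_be + omega1 * n_ds + omega2 * n_mrg))
    b_bd = B - n_mrg * b_mrg              # bandwidth left for BE/DS flows
    b_be = b_bd / (n_be + omega1 * n_ds)  # BE per-flow rate
    return b_be, omega1 * b_be, b_mrg

# With n_be = n_ds = n_mrg = 3, omega1 = omega2 = 2 and B = 90:
#   theta = B/6  -> (B/18, B/9, B/6)     = (5, 10, 15)
#   theta = B/20 -> (B/15, 2B/15, 2B/15) = (6, 12, 12)
rates = fbp_allocate(90.0, n_be=3, n_ds=3, n_mrg=3, omega1=2, omega2=2, theta=15.0)
```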
The power of PACCP lies in the fact that, with the right assignment of the pair (θ,ω) for each flow, the congestion control automatically translates users’ prices into rate allocation and leads to the optimal price-aware rate allocation for any network topology, without bandwidth reservation. Hence, PACCP is ready to be implemented in today’s datacenters for charging network resource usage.
Implementation issue
PACCP is a fully distributed price-aware congestion control protocol. To implement PACCP, one only needs to update the congestion control software on the sending-side hosts, without any core network involvement. The only modification in the sending-side host is to adjust the rate increase/decrease using the pair of price-oriented parameters (θ, ω). The complexity of PACCP is low, as it only needs to perform a simple comparison of the current window size against the minimum congestion window size W_{min}, i.e., whether the current window size is greater than W_{min} or not.
PACCP can easily be implemented in the Linux kernel on the sending hosts. The pair of parameters (θ, ω) can be passed from user space to kernel space through standard device control functions, e.g., ioctl(). PACCP can also be implemented in other operating systems (OSes) such as Windows, where the pair of parameters can be set through the TCP/IP configuration registry. If an OS is managed by the cloud service provider, charging can be carried out directly by setting and monitoring the pair of parameters passed from user space to the kernel. If an OS is administered by a tenant, the cloud service provider can move the congestion control from the data path to a congestion control plane [33, 34] or a virtual switch [35], which can be implemented in the network interface cards, and the price is charged based on the pair of parameters used in the congestion controllers.
The scalar parameter selection
The parameter r_{cos} is related to the minimum guaranteed flow rate in PACCP: r_{cos}>1 for a flow with a minimum guaranteed rate θ>0. The value of r_{cos} determines how fast a flow reaches its minimum guaranteed rate. A larger r_{cos} value allows a flow to reach its minimum guaranteed rate more quickly. However, aggressive flow rate acceleration may cause more congestion, larger flow rate oscillations, and a more negative impact on overall system performance. Hence, r_{cos} must be selected to strike a balance between acceleration speed and congestion control. In this subsection, we find the desired range of r_{cos} through simulation.
This study is carried out using an event-driven simulator written in C++. A 6×5 leaf-spine network topology with 40 hosts per rack^{Footnote 4} is used. The bandwidth/propagation delay is set at 10 Gbps/10 μs between a host and a leaf node and 40 Gbps/20 μs between a leaf node and a spine node. The queue sizes for the 10/40 Gbps links are set at 150/450 Kbytes (i.e., 100/400 packets).
In datacenters, flow sizes vary significantly [16, 23, 36]. While most flows are short-lived, with sizes of less than 100 Kbytes, a small percentage of long-lived, big flows consume most of the network bandwidth. Hence, we use real datacenter workloads, i.e., Datamining [36] and Websearch [16], as input to study the selection of r_{cos}. In the simulation, the flows are dynamically generated, following a Poisson process. The average flow arrival interval is used as a tuning knob to set the network load at desired levels. When a flow arrives, a source host is randomly selected, and then a destination host is randomly selected from a different rack. We consider a system with mixed BE and MRG flows (with ω=1). Suppose that in each rack, 20 hosts run BE flows and 20 run MRG flows. The flows are classified into small (size ≤ 100 Kbytes), medium (100 Kbytes < size ≤ 1 Mbytes) and big (size > 1 Mbytes) flows. The deadlines for MRG flows are set at 0.7 ms and 2 ms for small and medium flows, respectively, and at flow size/3 Gbps for big flows. We consider three performance metrics, i.e., the flow deadline miss rate, the average flow completion time (FCT) and the 99th percentile FCT (including both BE and MRG flows).
Figure 2 gives the results under three network loads (i.e., 30%, 50% and 70%) for the Datamining workload. We see that a value of r_{cos}>1 (note that BE and MRG flows behave the same with r_{cos}=1) can significantly reduce the flow deadline miss rate. The flow deadline miss rate decreases for all network loads as r_{cos} increases up to 3, and hardly reduces further as r_{cos} goes over 3. The average FCT at light and medium loads (i.e., 30% and 50%) decreases as r_{cos} increases from 1 to 3, and then increases as r_{cos} further increases. The 99th percentile FCT at heavy load (70%) increases as r_{cos} increases.
Figure 3 shows the results for the Websearch workload. The results are similar to those for the Datamining workload. Hence we conclude that r_{cos} around 3 (i.e., from 2.5 to 3.5) achieves the overall best performance for systems with mixed BE and MRG flows. In the rest of the case studies in this paper, we set r_{cos}=3.
Performance evaluation
In this section, we first examine the optimality of PACCP by simulation as well as a small testbed implementation, and then test the price-performance consistency of PACCP for both pricing models in a large datacenter based on real-world workloads.
Optimality test by simulation
We first test PACCP in terms of user utility maximization and optimal rate allocation by simulation. We use the same datacenter topology, i.e., a 6x5 leaf-spine network with 40 hosts in each rack. We first consider a simple case, i.e., each host has one outgoing flow and one incoming flow. So there are a total of 40 outgoing flows (20 BE, 10 DS and 10 MRG flows) in each rack, with 8 of them (i.e., 4 BE, 2 DS, and 2 MRG flows) going to each of the other 5 racks. To test the optimality of PACCP, we assume that all the flows are extremely long-lived with an unlimited amount of data to send. With this setup, the 40 flows from each rack share a total of 200 Gbps (i.e., five 40 Gbps links) of outgoing bandwidth. We first set the pairs of parameters to (0, 1), (0, 2) and (2 Gbps, 1) for BE, DS and MRG flows, respectively. In this case, the optimal flow rate allocation over each 40 Gbps leaf-spine link is 4 Gbps, 8 Gbps and 4 Gbps for each of the 4 BE, 2 DS and 2 MRG flows, respectively. We also consider another case whose only difference from the previous one is that the pair of parameters for MRG flows is changed to (7 Gbps, 2). In this case, the optimal flow rates for each of the BE, DS and MRG flows are 3.25, 6.5 and 7 Gbps, respectively. Since in both cases each VM sources only one flow, the flow rate allocations are the same for both VBP and FBP.
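The optimal allocations quoted above follow from weighted proportional fairness with minimum-rate floors: any flow whose weight-proportional share of the remaining capacity falls below its guaranteed rate θ is pinned at θ, and the rest of the capacity is split by weight. A small sketch (our own illustration, not the paper's solver) reproduces both cases:

```python
def allocate(capacity, flows):
    # flows: list of (theta, omega) pairs; returns per-flow rates.
    # Repeatedly pin any flow whose weight-proportional share of the
    # remaining capacity is below its minimum rate theta.
    rates = [None] * len(flows)
    active = list(range(len(flows)))
    cap = capacity
    while active:
        total_w = sum(flows[i][1] for i in active)
        share = cap / total_w
        pinned = [i for i in active if flows[i][1] * share < flows[i][0]]
        if not pinned:
            for i in active:
                rates[i] = flows[i][1] * share
            break
        for i in pinned:
            rates[i] = flows[i][0]
            cap -= flows[i][0]
            active.remove(i)
    return rates
```

With a 40 Gbps link shared by 4 BE (0, 1), 2 DS (0, 2) and 2 MRG (2 Gbps, 1) flows, this yields 4/8/4 Gbps; changing MRG to (7 Gbps, 2) yields 3.25/6.5/7 Gbps, matching the optima stated in the text.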
Figures 4 (a) and (c) show the sum of user utilities from simulation and the optimal one, V in Eq. 6, normalized to one. As we can see, the simulated value closely matches, but is slightly lower than, the optimal one in both cases. The reason it is always lower is that for any transport congestion control protocol, including PACCP, the aggregate flow rate cannot achieve 100% link utilization all the time, due to congestion feedback delay and finite control granularity.
The rates of the three CoS flows, each averaged over all the flows in the same CoS, are depicted in Figs. 4 (b) and (d). The average flow rates over time for BE, DS and MRG flows are 3.72/6.21/4.09 Gbps and 2.98/5.08/7.14 Gbps for the cases of θ=2 Gbps and 7 Gbps, respectively. The results indicate that the rates of MRG flows always stay above the minimum guaranteed rate θ. The rate ratio between DS and BE is about 1.67/1.71 for the two cases, smaller than the optimal ratio of 2. This is because the optimal ratios are obtained under the assumption that the PACCP controllers of BE and DS flows sharing the same bottleneck links sense congestion simultaneously. In practice, however, a flow of higher rate may sense more congestion events than a flow of lower rate, and hence DS flows incur more rate reductions. This explains why the flow rate ratio of DS to BE flows is less than the optimal one.
To further test VBP, we create two types of hosts by adding one additional outgoing flow to each of 10 BE, 5 DS and 5 MRG hosts in each rack, forming a second type of host and leaving the other half of the hosts in each rack unchanged. As a result, each host of the second type now has 2 outgoing flows. The flows generated by these hosts are called BE2, DS2 and MRG2 flows, and the flows generated by the other hosts are denoted as BE1, DS1 and MRG1 flows.
Now we consider the pairs of parameters (0, 1), (0, 2) and (7 Gbps, 2) for BE, DS and MRG hosts, respectively. This means that the pairs of parameters for BE1, DS1 and MRG1 flows are (0, 1), (0, 2) and (7 Gbps, 2), respectively, and the pairs of parameters for BE2, DS2 and MRG2 flows are (0, 0.5), (0, 1) and (3.5 Gbps, 1), respectively. As a result, the optimal flow rate allocation is 3.25 Gbps, 6.5 Gbps and 7 Gbps for BE1, DS1 and MRG1 flows, respectively, and 1.625 Gbps, 3.25 Gbps and 3.5 Gbps for BE2, DS2 and MRG2 flows, respectively.
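The per-flow pairs above follow from splitting the host-level VBP pair (θ, ω) evenly among the host's active flows; a hypothetical helper (our own illustration) makes this explicit:

```python
def vbp_flow_params(theta, omega, n_active):
    # Under VBP, a VM's minimum rate and utility weight are shared
    # equally by its n_active outgoing flows (per the values in the text).
    return (theta / n_active, omega / n_active)
```

For instance, an MRG host with (7 Gbps, 2) and two active flows gives each flow (3.5 Gbps, 1), as stated above.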
Figure 5 (a) shows the results for the normalized user utility. Again, the simulated value is very close to the optimal one. Figure 5 (b) presents the simulated flow rates of the individual types and CoSes. The average flow rates for BE1/2, DS1/2 and MRG1/2 are found to be 2.74/1.6, 4.65/2.66 and 7.16/3.85 Gbps, respectively. Similar to the previous case, the rates of MRG flows always stay above their required minimum rates, and the flow rate ratios between DS1 and BE1, and between DS2 and BE2, are about 1.66 and 1.7, lower than the optimal value of 2. The flow rate ratios between BE1 and BE2, DS1 and DS2, and MRG1 and MRG2 are 1.65, 1.78 and 1.86, respectively, also lower than the optimal value of 2, for the same reason explained earlier.
For the current case under VBP, flows of the same CoS from different VMs should be allocated the same total rate. For example, a DS1 flow originating from one host should receive the same rate as the sum of the two DS2 flows originating from another host. Figure 5 (c) shows the aggregate flow rates originating from the two types of hosts. We can see that the rates from the two types of hosts of the same CoS are reasonably close to each other, with the rates from type 2 slightly higher than those from type 1. This is because each of the two flows from a type 2 host has a smaller rate than a flow from a type 1 host, and hence senses fewer congestion signals, as explained earlier.
Optimality test in a real testbed
We implement PACCP in the Linux kernel by modifying the TCP Reno code. In PACCP, the minimum guaranteed rate θ and the flow utility weight ω are passed from user space to kernel space through the standard device control function, ioctl(). With the pair of values θ and ω, a minimum window size W_{min} is calculated using θ and the measured RTT (which is already available in TCP Reno). The only difference between PACCP and TCP Reno is that PACCP first compares the current congestion window size with W_{min}, and then computes the adjusted congestion window using Eqs. (17) and (18) for each received acknowledgement.
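Equations (17) and (18) are not reproduced in this section, but the W_{min} floor itself is straightforward: W_{min} is the window (in segments) needed to sustain rate θ over one RTT. A hedged sketch with hypothetical names (the kernel implementation works in its own units):

```python
def min_window_segments(theta_bps, rtt_seconds, mss_bytes=1460):
    # Sending W_min segments of mss_bytes per RTT yields rate theta:
    # theta = W_min * mss * 8 / rtt  =>  W_min = theta * rtt / (8 * mss)
    return theta_bps * rtt_seconds / (8 * mss_bytes)

def enforce_floor(cwnd, w_min):
    # The congestion window is never allowed below W_min,
    # which is what provides the soft minimum rate guarantee.
    return max(cwnd, w_min)
```

For example, with θ = 400 Mbps and a 100 μs RTT, W_{min} is roughly 3.4 segments.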
A 3x1 leaf-spine (i.e., 3 leaf nodes and 1 spine node) datacenter network topology, as shown in Fig. 6, is set up using four Dell N4032F switches. Each link has 1 Gbps of bandwidth. VMs 1, 4 and 7 are BE hosts, VMs 2, 5 and 8 are DS hosts and VMs 3, 6 and 9 are MRG hosts. Each VM originates 1 long-lived flow. Hence the flow rate allocation is the same regardless of whether VBP or FBP is in use. The destinations of VM i are (i+3)%9 (for i=1, 2,..., 9). The pairs of parameters are set at (0, 1), (0, 2) and (400 Mbps, 2) for BE, DS and MRG flows, respectively. With this setup, the optimal flow rates are 200 Mbps, 400 Mbps and 400 Mbps for BE, DS and MRG flows, respectively.
Figure 7 shows the average flow rates for the three CoSes. The average rate of MRG flows is about 410 Mbps, above the minimum guaranteed rate of 400 Mbps. The average rates of DS and BE flows are about 310 Mbps and 180 Mbps, respectively, resulting in a rate ratio of about 1.7, less than the optimal ratio of 2. These results agree with the simulation results.
In summary, both the simulation and testbed results indicate that with PACCP, MRG flows have a high chance of meeting their minimum guaranteed rates. Moreover, DS flows can indeed gain more bandwidth when competing with BE flows, although the gain is consistently lower than the optimal one. This implies that the pricing parameters for DS flows in both VBP and FBP need to be adjusted to reflect the biased relative flow rate. We found that a DS flow with ω=2 achieves about 1.6 times the flow rate of a BE flow at high load. Although a more detailed study at higher weight values is warranted, this observation does suggest that one may set P_{1} (\(P^{s}_{1}\)) at some value smaller than P_{0} (\(P^{s}_{0}\)), e.g., 0.6 P_{0} (\(P^{s}_{0}\)) for VBP (FBP). Note that DS is meant to outperform BE at high load; at low load, they offer similar performance. Hence, DS-related pricing must be determined at high load.
Both the simulation and testbed results show that PACCP can closely achieve the expected theoretical rate allocation. Hence the pair of parameters (θ, ω) provides an effective mapping between flow prices and rate allocations.
Performance evaluation with real datacenter workloads
Next, the flow rate allocations of PACCP under the two pricing models are tested using real datacenter traffic workloads, i.e., Websearch [16] and Datamining [36], as input. The focus is placed on testing price-performance consistency, i.e., whether the flow rate allocated by PACCP matches the expected rate allocation under the two pricing models.
Again we use the same network setup as before, i.e., a 6x5 leaf-spine network topology with 40 hosts per rack. The overall FCT and the FCTs for small, medium and big flows are measured as performance metrics. For flows with deadlines, the overall deadline meet rate (DMR) and the DMRs for small, medium and big flows are also measured. Although BE and DS flows are deadline unaware, we compare the DMRs for flows with deadlines using the BE and DS services against those using the MRG service to reveal how much MRG improves the DMR over the other two.
VBP
We first examine the performance of PACCP with VBP. Consider the case where there are 20, 10 and 10 hosts in each rack running BE, DS and MRG flows, respectively. The pairs of parameters are set at (0, 1), (0, 2) and (5 Gbps, 2) for BE, DS and MRG flows, respectively. The flow deadline for each of the n_{a} active outgoing MRG flows at a host is set at the flow size divided by 5/n_{a} Gbps. We lower bound the flow deadline at 1 ms to account for the PACCP connection setup time.
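The deadline rule used in this experiment (flow size over 5/n_{a} Gbps, floored at 1 ms) can be written as follows (our own illustration of the stated rule):

```python
def mrg_deadline_vbp(size_bytes, n_active, host_rate_bps=5e9, floor_s=1e-3):
    # Per-flow guaranteed rate: the host's 5 Gbps split among n_active flows.
    # The deadline is floored at 1 ms to absorb connection setup time.
    per_flow_rate = host_rate_bps / n_active
    return max(size_bytes * 8 / per_flow_rate, floor_s)
```

For example, with two active MRG flows, a 10 Kbyte flow would finish in 32 μs at 2.5 Gbps, so its deadline is floored to 1 ms, while a 10 Mbyte flow gets a 32 ms deadline.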
We first run PACCP using the Websearch workload [16]. Figures 8 (a) and (b) present the average FCTs for the overall, small, medium and big flows (normalized to the FCTs of the BE flows). We see that both DS and MRG flows indeed perform better than BE flows at all loads. For small and medium DS/MRG flows, the FCTs are less than 0.8 times (i.e., the flow rates are more than 1.25 times) those of BE flows at all loads. As these flows are short-lived and may complete before reaching their optimal allocated rates, their performance gains come from the faster rate increase enabled by ω and r_{cos} (i.e., θ). The results indicate that PACCP is effective in enabling price-based rate allocation even for short-lived flows. At light loads, the difference for big flows is small, about 0.9 times that of the BE flows. This is because at light loads, there is enough bandwidth to accommodate the desired user utilities of all individual flows of different CoSes, which hardly need to compete against one another for network resources. Hence long flows have enough time to fully explore the available bandwidth, leading to small performance gains.
The performance gains increase quickly as the network load increases. In the high load region (e.g., at 80%), the FCTs for the overall/small/medium/big DS and MRG flows are about 0.62/0.62/0.6/0.63 and 0.61/0.62/0.58/0.62 times those of BE flows, respectively. In other words, the average flow rates allocated to DS and MRG flows are about 1.6 and 1.7 times that of BE flows, lower than the optimal ratio of 2, which agrees with the findings for the earlier long-lived flow cases. MRG flows perform slightly better than DS flows in all cases. The performance gains are about 5% for small and medium flows, but very little for the overall and big flows. The close performance between DS and MRG flows arises because both have the same weight value of 2; hence, they are expected to receive equal flow rate allocations, provided that the minimum guaranteed rates for MRG flows are satisfied, which is indeed the case. The fact that MRG flows perform slightly better is because MRG flows open up their send windows faster than DS flows until their rates reach the minimum guaranteed rates.
Figures 8 (c) and (d) show the DMRs for the overall, small, medium and big flows. The DMRs for MRG flows are always higher than those for BE and DS flows. The overall DMR for MRG flows stays above 90% even at high load, whereas the DMR for BE flows drops below 50%. While the DMRs for medium and big MRG flows are above 90%, the corresponding DMRs for DS and BE flows drop below 80% and 50%, respectively. This clearly demonstrates the importance of MRG in providing a high probability of meeting flow deadlines.
Now let us examine the performance of PACCP with VBP using the Datamining workload [36] as input. Figure 9 depicts the results, which are similar to those for the Websearch workload. However, we note that the overall FCT gains are reduced. The FCT gain for small flows is almost constant across all loads for the following reason: most of the small flows in the Datamining workload consist of only a few packets (about 40% of flows have a single packet and about 80% have no more than 6 packets), which finish in just one or two RTTs, so the faster rate increase has limited effect on them. The DMRs for medium and big flows are similar to those for the Websearch case. However, the DMRs for small MRG, DS and BE flows are high, above 98%, 95% and 90%, respectively, due to the high chance for small flows to finish within the 1 ms deadline.
The above results further ascertain that PACCP with VBP can provide minimum rate guarantees with high probability and allow pricing in proportion to the flow rate allocation.
FBP
Finally, we evaluate the performance of PACCP with FBP. Under FBP, a customer can run flows of different CoSes in a single VM. In our simulation, an outgoing flow from a host has a 60% chance of being a BE flow, a 20% chance of being a DS flow, and a 20% chance of being an MRG flow. The pairs of parameters for BE, DS and MRG flows are set at (0, 1), (0, 2) and (2.5 Gbps, 2), respectively. Under FBP, a VM may carry all three types of flows at the same time.
Figures 10 and 11 show the results for the Websearch and Datamining workloads, respectively. Overall, the results are similar to those of PACCP with VBP for both workloads. However, the performance gains, in terms of both FCT and DMR, for DS and MRG flows at high loads are slightly smaller than under VBP. The reason is that under VBP, a host's utility weight ω is shared among all its sending flows, resulting in a smaller per-flow ω. A flow with a smaller ω increases its rate more slowly than a flow with a larger ω, and hence fewer congestion events occur under VBP than under FBP. As the average rates of DS and MRG flows are much higher than those of BE flows, DS and MRG flows sense more congestion events than BE flows do. This makes the normalized FCTs under VBP closer to the ideal case (i.e., 0.5), and hence the performance gains are greater than those of PACCP with FBP. The results indicate that PACCP with either pricing model can provide price-performance consistency when applied to real datacenters.
The above results demonstrate that the proposed PACCP can indeed enable price-aware flow rate allocation in the cloud, particularly for short-lived flows, including soft minimum rate guarantees and relative additional rate allocation, commensurate with the two pricing models. PACCP is a fully distributed control protocol that requires only a software upgrade in the end hosts and does not involve any core switches; hence, it can be readily deployed in current datacenters.
Related work
As the network plays an important role in the performance of cloud services, charging for cloud services based on network performance is both necessary and essential. Many research works [1–5, 7, 8, 11–14, 37–41] have studied network pricing models for cloud services. These pricing models provide bandwidth reservation through an auction process [3, 5, 7, 8, 10–13] or a dynamic time-varying price table [2, 4, 6, 9, 14]. Auction-based pricing is a market-oriented strategy and has been heavily studied. The auction mechanisms include a game theory approach [5], incentive-compatible auctions [7], price-incentive resource auctions [13], auctions based on chaotic equation updates [3], randomized auctions [11], double auctions [10] and deep reinforcement learning [12]. The price in auction approaches is dynamically adjusted based on market demand and can quickly and sufficiently reflect economic behavior, benefiting both cloud service providers and tenants. The pricing model based on a dynamically varying price table is a pay-as-you-go model, where the price is charged based on actual bandwidth usage. It benefits cloud tenants by preventing over-reservation of bandwidth, but it is hard for cloud service providers to set their prices accurately and quickly enough to maximize their business revenue.
Bandwidth reservation allows tenants to obtain their demanded bandwidth by paying the corresponding price. However, bandwidth reservation generally incurs significant overhead, and the bandwidth allocation can be delayed by up to seconds; hence it is not suitable for today's datacenter applications involving bursts of short-lived flows, which generally require flow completion times in milliseconds. Softbw [1], a distributed network pricing solution, can schedule flows at the source hosts and hence deal with short-lived flows, but it only works for congestion-free datacenter networks.
Although price-agnostic, many solutions have been proposed to improve datacenter flow rate allocation and provide fair bandwidth sharing and/or minimum rate guarantees. Since such solutions may lead to effective pricing models, we briefly review some of them below.
First, congestion control protocols with minimal assistance from in-network nodes [16, 18, 42, 43] have been proposed to improve the performance of datacenter applications. Some of them [16, 42, 43] focus on improving average throughput; their congestion controls are based on explicit congestion notification (ECN), queuing delay, or flowlet granularity. D^{2}TCP [18] provides deadline-aware flow rate allocation through ECN. However, these protocols cannot provide provable fairness, minimum flow rate or flow-deadline guarantees, or multiple CoSes, as PACCP does. Second, the Hose and Pipe models are widely used in the design of bandwidth allocation schemes [2, 44–49] for datacenter applications. In these schemes, all the VMs are connected to a central (virtual) switch by a dedicated link (hose) for traffic control and minimum bandwidth guarantees. Oktopus [44] and SecondNet [45] support bandwidth guarantees through static reservation. Seawall [47] and NetShare [48] use per-flow or per-tenant weights for TCP-like flows to achieve max-min fairness. Gatekeeper [50] uses a hypervisor-based mechanism for bandwidth reservation in full bisection-bandwidth networks. TAG [49] uses a tenant application graph to more accurately predict bandwidth demand and hence more effectively allocate bandwidth for applications with heterogeneous demands. NUMFabric [51] is a network utility maximization based weighted fair queuing scheduler that provides proportional fairness. A framework for inter-datacenter traffic pricing is proposed in [6] to achieve high revenue, but no fairness criterion is enforced.
As bandwidth reservation is not integrated with the congestion control protocol, these schemes cannot allocate flow rates in a work-conserving manner and hence may waste network resources. Moreover, the existing schemes focus either on bandwidth reservation or on proportional fairness, but rarely on both. The ability to support multiple CoSes and directly enforce flow rate allocation in congestion control makes PACCP a highly resource-effective, work-conserving solution. Moreover, to the best of our knowledge, it is the first price-aware transport congestion control protocol purposely designed for cloud applications, and it can be readily deployed in current datacenters.
Conclusions
In this paper, we propose PACCP, a price-aware congestion control protocol for cloud services. PACCP is a network utility maximization (NUM) based optimal congestion control protocol and supports multiple CoSes, including best effort (BE), differentiated service (DS) and minimum rate guaranteed (MRG) services. PACCP seamlessly integrates congestion control with two pricing models, a coarse-grained VM-Based Pricing model (VBP) and a fine-grained Flow-Based Pricing model (FBP). The flow rate allocated by PACCP is determined by a pair of parameters, i.e., a minimum guaranteed rate and a utility weight, which are, in turn, determined by the paid price. PACCP is evaluated by both large-scale simulation and a small testbed implementation. The experimental results demonstrate that PACCP can indeed provide minimum rate guarantees with high probability, high bandwidth utilization and proportional flow rate allocation, commensurate with the two pricing models. PACCP is highly scalable and can deal with bursts of unlimited numbers of short-lived flows. It only requires a software upgrade in end hosts, without any change to core network switches, and hence is readily deployable in today's datacenters.
Availability of data and materials
All the simulation and Linux implementation codes are available on https://github.com/zjwang68/PACCP.
Notes
 1.
As containers/dockers are widely used now, all techniques used for VMs also apply to containers; for simplicity, we just use VM in this paper.
 2.
Here we apply the CAP TCP utility, not the SSP TCP utility, because the TCP timeout is a rare event and TCP runs in the congestion avoidance phase most of the time.
 3.
This topology is different from the leaf-spine datacenter topology. However, if a datacenter has a single bottleneck link at a time, the leaf-spine topology can be modeled as this topology.
 4.
In datacenters, a physical host can run one or multiple VMs. For simplicity, we assume that each host runs a single VM. Hence, VM and host are used interchangeably in the rest of the paper.
References
 1
Guo J, Liu F, Wang T (2017) Pricing Intra-datacenter Networks with Over-Committed Bandwidth Guarantee. USENIX ATC, Santa Clara.
 2
Popa L, Kumar G, Chowdhury M, Krishnamurthy A, Ratnasamy S, Stoica I (2012) FairCloud: Sharing the Network in Cloud Computing. ACM SIGCOMM, Helsinki.
 3
Niu D, Feng C, Li B (2012) Pricing cloud bandwidth reservations under demand uncertainty. ACM SIGMETRICS Perform Eval Rev 40(1):151–162.
 4
Ballani H, Jang K, Karagiannis T, Kim C, Gunawardena D, O'Shea G (2013) Chatty tenants and the cloud network sharing problem. USENIX NSDI, Berkeley.
 5
Shen H, Li Z (2014) New Bandwidth Sharing and Pricing Policies to Achieve A WinWin Situation for Cloud Provider and Tenants. IEEE INFOCOM, Toronto.
 6
Jalaparti V, Bliznets I, Kandula S (2016) Dynamic pricing and traffic engineering for timely inter-datacenter transfers. ACM SIGCOMM, Florianopolis.
 7
Jin A, Song W, Wang P, Niyato D, Ju P (2016) Auction mechanisms toward efficient resource sharing for cloudlets in mobile cloud computing. IEEE Trans Serv Comput 9(6):895–909.
 8
Jin A, Song W, Zhuang W (2018) Auctionbased resource allocation for sharing cloudlets in mobile cloud computing. IEEE Trans Emerg Top Comput 6(1):45–57.
 9
Kansal S, Kumar H, Kaushal S, Sangaiah A. K (2020) Genetic algorithmbased cost minimization pricing model for ondemand IaaS cloud service. J Supercomput 76:1536–1561.
 10
Dibaj R, Miri A, Mostafavi S (2020) A cloud prioritybased dynamic online double auction mechanism (PBDODAM). J Cloud Comput 9(64):1–26.
 11
Le T, Tran N, Leanh T, Kim T, Ren S, Hong C (2020) Auction mechanism for dynamic bandwidth allocation in multitenant edge computing. IEEE Trans Veh Technol 69(12):162–176.
 12
Shi B, Huang L, Shi R (2020) Pricing in the competing auction based cloud market: A multiagent deep deterministic policy gradient approach. In: Kafeza E., Benatallah B., Martinelli F., Hacid H., Bouguettaya A., Motahari H (eds)ServiceOriented Computing. ICSOC 2020. Lecture Notes in Computer Science, vol 12571.. Springer, Cham. https://doi.org/10.1007/9783030653101_14.
 13
Li S, Huang J, Cheng B (2021) Resource Pricing and Demand Allocation for Revenue Maximization in IaaS Clouds: A MarketOriented Approach. IEEE Trans Netw Serv Manag 18(3):3460–3475.
 14
Chen L, Feng Y, Li B, Li B (2021) A Case for Pricing Bandwidth: Sharing Datacenter Networks With Cost Dominant Fairness. IEEE Trans Parallel Distrib Syst 32(5):1256–1269.
 15
Applegate D, Archer A, Gopalakrishnan V, Lee S, Ramakrishnan KK (2016) Optimal content placement for a largescale VoD system. IEEE/ACM Trans Netw 24(6):2114–2127.
 16
Alizadeh M, Greenberg A, Maltz DA, Padhye J, Patel P, Prabhakar B, Sengupta S, Sridharan M (2010) Data center TCP (DCTCP). ACM SIGCOMM, New Delhi.
 17
Andrews M (2012) Searching the Internet. IEEE Softw 29(2):13–16.
 18
Vamanan B, Hasan J, Vijaykumar TN (2012) Deadline-Aware Datacenter TCP (D^{2}TCP). ACM SIGCOMM, Helsinki.
 19
Roy A, Zeng H, Bagga J, Porter G, Snoeren A. C (2015) Inside the Social Network’s (Datacenter) Network. ACM SIGCOMM, London.
 20
Liu X, Zhang L, Lu K (2020) Image Quality Assessment Method Based on Vector Information and SVD of Quaternion Matrix under Cloud Computing Environments. IEEE Trans Cloud Comput 8(2):326–337. https://doi.org/10.1109/TCC.2015.2513397.
 21
Dean J, Barroso LD (2013) The Tail at Scale. Commun ACM 56:74–80.
 22
Wilson C, Ballani H, Karagiannis T, Rowstron A (2011) Better Never than Late: Meeting Deadlines in Datacenter Networks. ACM SIGCOMM.
 23
Benson T, Akella A, Maltz DA (2010) Network Traffic Characteristics of Data Centers in the Wild. ACM SIGCOMM Conference on Internet Measurement, Melbourne.
 24
Sun X, Wang Z, Wu Q, Che H, Jiang H (2020) PACCP: A PriceAware Congestion Control Protocol for Datacenters In: 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), 23–33, Leicester. https://doi.org/10.1109/UCC48980.2020.00022.
 25
Shenker S (1995) Fundamental Design Issues for the Future Internet. IEEE J Sel Areas Commun 13:1176–1188.
 26
Wang Z, Singhal A, Wu Y, Zhang C, Che H, Jiang H, Liu B, Lagoa CM (2020) HOLNET: A holistic traffic control framework for datacenter networks. IEEE ICNP, Madrid.
 27
Ye L, Wang Z, Che H, Lagoa CM (2011) TERSE: A Unified EndtoEnd Traffic Control Mechanism to Enable Elastic, Delay Adaptive, and Rate Adaptive Services. IEEE J Sel Areas Commun 29(5):938–950. https://doi.org/10.1109/JSAC.2011.110504.
 28
Ye L, Wang Z, Che H, Chan HBC, Lagoa CM, Constantino M (2009) Utility function of TCP, Vol. 32.
 29
Bertsekas D (1982) Constrained Optimization and Lagrange Multiplier Methods. Computer Science and Applied Mathematics. Academic Press, Boston. https://doi.org/10.1016/B9780120934805.500064.
 30
Movsichoff BA, Lagoa C, Che H (2007) EndtoEnd Optimal Algorithm for Integrated QoS, Traffic Engineering, and Failure Recovery. IEEE Trans Networking 15(4):813–823.
 31
Kohler E, Handley M, Floyd S (2006) Designing DCCP: congestion control without reliability. ACM SIGCOMM, Pisa.
 32
Ye L, Wang Z (2009) A QoS-aware congestion control mechanism for DCCP In: 2009 IEEE Symposium on Computers and Communications, 624–629. https://doi.org/10.1109/ISCC.2009.5202273.
 33
Narayan A, Cangialosi F, Goyal P, Narayan S, Alizadeh M, Balakrishnan H (2017) The Case for Moving Congestion Control Out of the Datapath. ACM HetNets, Palo Alto.
 34
Kaufmann A, Stamler T, Peter S, Sharma NK, Krishnamurthy A, Anderson T (2019) TAS: TCP Acceleration as an OS Service. EuroSys, Dresden.
 35
Jeyakumar V, Alizadeh M, Mazieres D, Prabhakar B, Kim C, Greenberg A (2014) Eyeq: practical network performance isolation at the edge. USENIX NSDI, Berkeley.
 36
Greenberg A, Hamilton JR, Jain N, Kandula S, Kim C, Lahiri P, Maltz DA, Patel P, Sengupta S (2011) VL2: A Scalable and Flexible Data Center Network. ACM SIGCOMM Comput Commun Rev 39(4):51–62.
 37
Niu D, Li B (2013) An efficient distributed algorithm for resource allocation in largescale coupled systems. IEEE INFOCOM, Turin.
 38
Wu X, Liu M, Dou W, Gao L, Yu S (2016) A scalable and automatic mechanism for resource allocation in selforganizing cloud. PeertoPeer Netw Appl 9(1):28–41.
 39
Niu D, Feng C, Li B (2012) Pricing cloud bandwidth reservations under demand uncertainty. ACM SIGMETRICS Perform Eval Rev 40(1):151–162.
 40
Sun X, Zhuo X, Wang Z (2020) A survey of Pricing Aware Traffic Engineering in Cloud Computing. J Internet Technol 21(2):357–364.
 41
Dimitri N (2020) Pricing cloud IaaS computing services. J Cloud Comput 9(1):1–11.
 42
Arun V, Hari B (2018) Copa: Practical delaybased congestion control for the internet. ANRW, Montreal.
 43
Perry J, Balakrishnan H, Shah D (2017) Flowtune: Flowlet control for datacenter networks. USENIX, Santa Clara.
 44
Ballani H, Costa P, Karagiannis T, Rowstron A (2011) Towards predictable datacenter networks. ACM SIGCOMM, Toronto.
 45
Guo C, Lu G, Wang HJ, Yang S, Kong C, Sun P, Wu W, Zhang Y (2015) SecondNet: a data center network virtualization architecture with bandwidth guarantees. ACM CoNext, Philadelphia.
 46
Popa L, Yalagandula P, Banerjee S, Mogul JC, Turner Y, Santos JR (2013) ElasticSwitch: Practical WorkConserving Bandwidth Guarantees for Cloud Computing. ACM SIGCOMM, Hong Kong.
 47
Shieh A, Kandula S, Greenberg A, Kim C, Saha B (2011) Sharing the Data Center Network. USENIX NSDI, Boston.
 48
Lam VT, Radhakrishnan S, Pan R, Vahdat A, Varghese G (2012) NetShare and stochastic NetShare: predictable bandwidth allocation for data centers. ACM SIGCOMM Comput Commun Rev 42(3):5–11.
 49
Lee J, Turner Y, Lee M, Popa L, Banerjee S, Kang J, Sharma P (2014) Applicationdriven bandwidth guarantees in datacenters. ACM SIGCOMM Comput Commun Rev 44(4):467–478.
 50
Rodrigues H, Santos JR, Turner Y, Soares P, Guedes D (2011) Gatekeeper: Supporting Bandwidth Guarantees for Multitenant Datacenter Networks. USENIX WIOV, Boston.
 51
Nagaraj K, Bharadia D, Mao H, Chinchali S, Alizadeh M, Katti S (2016) NUMFabric: Fast and Flexible Bandwidth Allocation in Datacenters. ACM SIGCOMM, Florianopolis.
Acknowledgement
We would like to thank the participants of the UCC conference for their observations.
Funding
This work was supported by Alibaba Group through Alibaba Innovative Research (AIR) Program, the US NSF under Grant No. CCF XPS1629625 and CCF SHF1704504, Science and Technology Projects in Guangzhou No.202102080300 and Guangdong Province Youth Innovation Talent Project No.2017KQNCX108.
Author information
Affiliations
Contributions
The authors had worked equally during all this paper’s stages. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sun, X., Wang, Z., Wu, Y. et al. A priceaware congestion control protocol for cloud services. J Cloud Comp 10, 55 (2021). https://doi.org/10.1186/s13677021002715
Keywords
 Pricing model
 Cloud computing
 Congestion control
 Network utility maximization