Advances, Systems and Applications
From: Predictive mobility and cost-aware flow placement in SDN-based IoT networks: a Q-learning approach
Parameter | Description |
---|---|
wt(i) | weight of the ith training example at the tth iteration |
H(x) | final strong classifier |
acc | accuracy of classification |
c(yi,yj) | cost function, which assigns a cost to the event of predicting class yj when the true class is yi |
N | total number of samples or instances in the dataset |
p | number of negative samples |
L(x,y) | real-valued loss associated with a prediction for a given class y when the input is x |
τ | index or identifier for the weak learners in the ensemble that the AdaBoost algorithm generates |
ατ | weight assigned to the τ-th classifier in the ensemble |
hτ(x) | hypothesis or prediction made by the τ-th classifier for the sample x |
L | set of all possible states in the environment |
A | set of all possible actions that the agent can take in a given state |
R | reward received after transitioning from one state to another due to an action taken by the agent |
P | probability of transitioning from one state to another |
H | dataset of data points that encapsulate the historical movement of the end-device |
li | ith position of the device in a sequence of positions |
ti | arrival time of the device corresponding to ith position |
wij | weight of visits from li to lj |
hi | ith basic classifier |
Pij | transition probability from li to lj, i ≠ j |
α | learning rate |
γ | discount factor |
m | total number of training examples |
t | current iteration or round of the boosting process |
Q(s,a) | expected cumulative reward for taking action a in state s |
Q′(s′,a′) | estimated maximum reward for the next state s′ over all possible actions a′ |
R(s,a) | reward received after taking action a in state s |
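To make the mobility and reinforcement-learning symbols above concrete, the following is a minimal sketch (not the paper's implementation) of two pieces of this notation: the transition probability Pij obtained by normalising the visit weights wij, and one tabular Q-learning update using the learning rate α and discount factor γ. All names, positions, and values here are illustrative assumptions.

```python
from collections import defaultdict

def transition_probs(w, i):
    """Pij = wij / sum_k wik: normalise the visit weights out of position li."""
    total = sum(w[i].values())
    return {j: wij / total for j, wij in w[i].items()}

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One step of Q(s,a) <- Q(s,a) + alpha * (R(s,a) + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)  # max over a' of Q'(s',a')
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]

# Illustrative example: visit weights out of position l1, then a single
# Q-learning step after the device moves from l1 to l2.
w = {"l1": {"l2": 3.0, "l3": 1.0}}
P = transition_probs(w, "l1")  # P12 = 0.75, P13 = 0.25
Q = defaultdict(float)         # Q-table initialised to zero
q_update(Q, "l1", "place_flow", 1.0, "l2", ["place_flow", "keep"])
```

With an all-zero table, the first update moves Q(l1, place_flow) to α·R = 0.1, since the max over next-state actions is still zero.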