Advances, Systems and Applications
Algorithm: DQN-based joint channel-power allocation

1  Initialize scene parameters and algorithm parameters
2  Obtain channel-assignment information, user-distribution information, and new-user service information
3  for episode = 1 : max_episode
4      Initialize the state \({s}_{t}\)
5      Reconstruct the state as \({s}_{t}^{*}\)
6      for t = 1, 2, ..., T-1
7          Select an action \({a}_{t}\) with the \(\varepsilon\)-greedy algorithm
8          Execute \({a}_{t}\), obtain the reward \({r}_{t}\), and observe the next state \({s}_{t+1}\)
9          Reconstruct \({s}_{t+1}\) as \({s}_{t+1}^{*}\), and store the experience tuple \(\left({s}_{t}^{*}, {a}_{t}, {r}_{t}, {s}_{t+1}^{*}\right)\) in the experience replay pool
10         Randomly sample a mini-batch from the experience replay pool
11         Compute the loss function
12         Update the Q-network parameters \(\omega\) by gradient descent
13         Update the target Q-network parameters \({\omega}^{-}\)
14     end for
15 end for
16 Obtain the trained deep reinforcement learning network parameters
17 Output the channel and power assigned to each new user
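The training loop above can be sketched in Python. This is a minimal illustration, not the paper's implementation: the environment (`env_step`), the state and action sizes, and all hyperparameters are hypothetical stand-ins, and a linear Q-function in NumPy replaces the deep Q-network so the sketch stays self-contained. The \(\varepsilon\)-greedy selection, experience replay, gradient-descent update of \(\omega\), and periodic sync of the target parameters \(\omega^{-}\) mirror steps 7-13.

```python
import random
import numpy as np

# Hypothetical toy setting: a state vector of size N_STATE and N_ACTIONS
# discrete (channel, power) choices. All names and numbers are illustrative.
N_STATE, N_ACTIONS = 4, 3
GAMMA, EPS, LR = 0.9, 0.1, 0.01
rng = np.random.default_rng(0)

def env_step(state, action):
    """Toy environment: reward 1 when the action matches a channel
    encoded in the state, then draw a random next state."""
    reward = 1.0 if action == int(state.argmax()) % N_ACTIONS else 0.0
    return reward, rng.random(N_STATE)

# Linear Q-function (stand-in for the DQN) and its target copy
w = rng.normal(scale=0.1, size=(N_STATE, N_ACTIONS))   # omega
w_target = w.copy()                                    # omega^-

replay = []  # experience replay pool

for episode in range(50):
    state = rng.random(N_STATE)          # initialize / reconstruct the state
    for t in range(20):
        # Step 7: epsilon-greedy action selection
        if random.random() < EPS:
            action = random.randrange(N_ACTIONS)
        else:
            action = int((state @ w).argmax())
        # Step 8: execute the action, observe reward and next state
        reward, next_state = env_step(state, action)
        # Step 9: store the experience tuple in the replay pool
        replay.append((state, action, reward, next_state))
        # Steps 10-12: sample a mini-batch and take one gradient step on omega
        batch = random.sample(replay, min(8, len(replay)))
        for s, a, r, s2 in batch:
            target = r + GAMMA * (s2 @ w_target).max()
            td_error = (s @ w)[a] - target
            w[:, a] -= LR * td_error * s   # gradient of the squared TD error
        state = next_state
    # Step 13: periodically update the target network parameters
    if episode % 5 == 0:
        w_target = w.copy()

# The trained parameters w would then score each (channel, power) action
# for a new user's state, and the argmax gives the assignment to output.
```

The target copy `w_target` is held fixed between syncs, which is what stabilizes the bootstrapped target in step 13 of the algorithm.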