Advances, Systems and Applications
Description | Parameter | Value |
---|---|---|
Learning rate | α | 0.01 |
Discount factor | γ | 0.9 |
Trace decay rate | λ | 0.5 |
Initial temperature | θ | 0.9 |
Number of hidden layers | | 2 |
Number of nodes for the first hidden layer | | 20 |
Number of nodes for the second hidden layer | | 20 |
Activation function for hidden layers | | ReLU |
Maximum replay memory size | \|D\| | 500 |
Minibatch size | | 300 |
Parameter updating frequency for target DQN | ι | 10 |
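
As an illustration only, the settings in the table could be collected into a single configuration object along the following lines. This is a minimal sketch, not the authors' implementation: the class name `DQNHyperparameters`, its field names, and the helper `should_sync_target` are hypothetical, and the sketch assumes that the target-DQN updating frequency ι is counted in training steps, which the table does not state.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DQNHyperparameters:
    """Hypothetical container for the values listed in the table."""

    learning_rate: float = 0.01                   # alpha
    discount_factor: float = 0.9                  # gamma
    trace_decay_rate: float = 0.5                 # lambda
    initial_temperature: float = 0.9              # theta
    hidden_layer_sizes: Tuple[int, ...] = (20, 20)  # two hidden layers, 20 nodes each
    hidden_activation: str = "relu"               # ReLU for the hidden layers
    replay_memory_size: int = 500                 # |D|, maximum number of stored transitions
    minibatch_size: int = 300                     # transitions sampled per update
    target_update_frequency: int = 10             # iota

    def should_sync_target(self, step: int) -> bool:
        # Assumption: iota is measured in training steps; the table
        # does not say whether the unit is steps or episodes.
        return step > 0 and step % self.target_update_frequency == 0


config = DQNHyperparameters()
```

Freezing the dataclass keeps the values immutable during a run, so an experiment cannot silently drift away from the tabulated settings.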