

Table 2 Optimization algorithm

From: Lightweight image classifier using dilated and depthwise separable convolutions

Algorithm: The optimization with the improved loss function
Input: A minibatch of m examples \(\{x^{(1)},\dots,x^{(m)}\}\) sampled from the training set, with corresponding targets \(y^{(i)}\).
Initialization: Step size \(\varepsilon = 0.001\), exponential decay rates for the moment estimates \(\rho_{1} = 0.9\) and \(\rho_{2} = 0.999\), and a small constant \(\delta = 10^{-8}\) for numerical stabilization.
Output: Network parameters θ.
1. Initialize: Network parameters θ, 1st and 2nd moment variables s=0, r=0 and time step t=0.
2. While stopping criterion not met do.
Compute gradient: \(g\leftarrow \frac {1}{m}\nabla _{\theta } \sum _{i} L(f(x^{(i)};\theta),y^{(i)})\).
\(t \leftarrow t+1\).
Update biased first moment estimate:
\(s\leftarrow \rho _{1} s+(1-\rho _{1})g\).
Update biased second moment estimate:
\(r\leftarrow \rho _{2} r+(1-\rho _{2})\,g\odot g\).
Correct bias in first moment: \({\overline {s}}\leftarrow {\frac {s}{(1-\rho _{1}^{t})}}\).
Correct bias in second moment: \({\overline {r}}\leftarrow {\frac {r}{(1-\rho _{2}^{t})}}\).
Compute update: \(\Delta \theta =-\varepsilon \frac {\overline {s}}{\sqrt {\overline {r}}+\delta }\).
Apply update: \(\theta \leftarrow \theta + \Delta \theta\).
3. end while.
4. Return θ.
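The update rule in this table follows the Adam scheme with bias-corrected moment estimates. The sketch below is a minimal NumPy illustration of a single parameter update implementing the steps above; the function name adam_step and its calling convention are illustrative assumptions, not part of the paper, and the gradient g is assumed to come from the improved loss function described in the text.

```python
import numpy as np

def adam_step(theta, g, s, r, t,
              eps=1e-3, rho1=0.9, rho2=0.999, delta=1e-8):
    """One update following Table 2 (Adam-style moment estimates)."""
    t += 1                                    # advance time step
    s = rho1 * s + (1.0 - rho1) * g           # biased first moment estimate
    r = rho2 * r + (1.0 - rho2) * (g * g)     # biased second moment estimate (elementwise)
    s_hat = s / (1.0 - rho1 ** t)             # bias-corrected first moment
    r_hat = r / (1.0 - rho2 ** t)             # bias-corrected second moment
    theta = theta - eps * s_hat / (np.sqrt(r_hat) + delta)  # apply update
    return theta, s, r, t
```

Here theta, g, s, and r are NumPy arrays of the same shape; s, r, and the time step t are initialized to zero and carried across minibatches until the stopping criterion is met.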