Lightweight image classifier using dilated and depthwise separable convolutions

Journal of Cloud Computing: Advances, Systems and Applications

Table 2 Optimization algorithm

Algorithm: The optimization with the improved loss function
Input: A minibatch of m examples {x⁽¹⁾,...,x⁽ᵐ⁾} sampled from the training set, with corresponding targets y⁽ⁱ⁾.
Initialization: Step size ε=0.001; exponential decay rates for the moment estimates ρ₁=0.9 and ρ₂=0.999; small constant for numerical stabilization δ=10⁻⁸.
Output: Network parameters θ.
1. Initialize: Network parameters θ, 1st and 2nd moment variables s=0, r=0 and time step t=0.
2. While the stopping criterion is not met, do:
Compute gradient: \(g\leftarrow \frac {1}{m}\nabla _{\theta } \sum _{i} L(f(x^{(i)};\theta),y^{(i)})\).
t←t+1.
Update biased first moment estimate:
s←ρ₁s+(1−ρ₁)g.
Update biased second moment estimate:
\(r\leftarrow \rho _{2} r+(1-\rho _{2})\, g\odot g\).
Correct bias in first moment: \({\overline {s}}\leftarrow {\frac {s}{(1-\rho _{1}^{t})}}\).
Correct bias in second moment: \({\overline {r}}\leftarrow {\frac {r}{(1-\rho _{2}^{t})}}\).
Compute update: \(\Delta \theta =-\varepsilon \frac {\overline {s}}{\sqrt {\overline {r}}+\delta }\) (operations applied element-wise).
Apply update: θ←θ+Δθ.
3. end while.
4. Return θ.
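
The steps in Table 2 match the standard Adam update rule, and can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation. The function name `adam_minimize`, the callback `grad_fn`, and the fixed step budget `max_steps` used as the stopping criterion are all assumptions. The paper's improved loss function is not defined in this table, so a toy quadratic loss stands in for it.

```python
import math


def adam_minimize(grad_fn, theta, step_size=0.001, rho1=0.9, rho2=0.999,
                  delta=1e-8, max_steps=1000):
    """Apply the update rule of Table 2 to a list of float parameters.

    grad_fn is a hypothetical callback: it takes the current parameters and
    returns the minibatch-averaged gradient g as a list of the same length.
    """
    theta = list(theta)
    s = [0.0] * len(theta)  # first moment variable
    r = [0.0] * len(theta)  # second moment variable
    t = 0                   # time step
    # Assumed stopping criterion: a fixed number of steps.
    while t < max_steps:
        g = grad_fn(theta)
        t += 1
        # Biased first and second moment estimates.
        s = [rho1 * si + (1 - rho1) * gi for si, gi in zip(s, g)]
        r = [rho2 * ri + (1 - rho2) * gi * gi for ri, gi in zip(r, g)]
        # Bias correction.
        s_bar = [si / (1 - rho1 ** t) for si in s]
        r_bar = [ri / (1 - rho2 ** t) for ri in r]
        # Element-wise update, then apply it.
        d_theta = [-step_size * sb / (math.sqrt(rb) + delta)
                   for sb, rb in zip(s_bar, r_bar)]
        theta = [th + d for th, d in zip(theta, d_theta)]
    return theta


# Toy stand-in loss (assumption): L(theta) = (theta0 - 3)^2 + (theta1 + 1)^2.
def quadratic_loss(theta):
    return (theta[0] - 3.0) ** 2 + (theta[1] + 1.0) ** 2


def quadratic_grad(theta):
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]


start = [0.0, 0.0]
result = adam_minimize(quadratic_grad, start)
print(quadratic_loss(start), quadratic_loss(result))
```

At t=1 the bias corrections give \(\overline{s}=g\) and \(\overline{r}=g\odot g\), so the first update moves each parameter by approximately ε in the direction opposite to the sign of its gradient, regardless of the gradient's magnitude.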