
Table 2 Optimization algorithm

From: Lightweight image classifier using dilated and depthwise separable convolutions

Algorithm: The optimization with the improved loss function

Input: Sample a minibatch of m examples from the training set \(\{x^{(1)},\dots,x^{(m)}\}\) with corresponding targets \(y^{(i)}\).

Initialization: Step size \(\varepsilon=0.001\), exponential decay rates for the moment estimates \(\rho_{1}=0.9\) and \(\rho_{2}=0.999\), and a small constant \(\delta=10^{-8}\) used for numerical stabilization.

Output: Network parameters θ.

1. Initialize: Network parameters θ, first- and second-moment variables s=0 and r=0, and time step t=0.

2. While stopping criterion not met do.

Compute gradient: \(g\leftarrow \frac {1}{m}\nabla _{\theta } \sum _{i} L(f(x^{(i)};\theta),y^{(i)})\).

\(t\leftarrow t+1\).

Update biased first moment estimate:

\(s\leftarrow \rho _{1} s+(1-\rho _{1})g\).

Update biased second moment estimate:

\(r\leftarrow \rho _{2} r+(1-\rho _{2})\,g\odot g\).

Correct bias in first moment: \({\overline {s}}\leftarrow {\frac {s}{(1-\rho _{1}^{t})}}\).

Correct bias in second moment: \({\overline {r}}\leftarrow {\frac {r}{(1-\rho _{2}^{t})}}\).

Compute update: \(\Delta \theta \leftarrow -\varepsilon \frac {\overline {s}}{\sqrt {\overline {r}}+\delta }\).

Apply update: \(\theta \leftarrow \theta +\Delta \theta\).

3. end while.

4. Return θ.
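
For reference, the update rule in Table 2 can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name adam_update, the placeholder grad_fn (standing in for the gradient of the improved loss function L), and the fixed iteration count steps (standing in for the unspecified stopping criterion) are assumptions introduced here for clarity.

    import numpy as np

    def adam_update(theta, grad_fn, steps,
                    eps=0.001, rho1=0.9, rho2=0.999, delta=1e-8):
        """Sketch of the Table 2 update rule (Adam-style optimizer).

        theta   : np.ndarray of network parameters
        grad_fn : callable returning the minibatch gradient of the loss
                  w.r.t. theta (placeholder for the paper's improved loss)
        steps   : iteration count used here as the stopping criterion
        """
        s = np.zeros_like(theta)   # biased first-moment estimate
        r = np.zeros_like(theta)   # biased second-moment estimate
        t = 0

        for _ in range(steps):                       # while stopping criterion not met
            g = grad_fn(theta)                       # minibatch gradient
            t += 1
            s = rho1 * s + (1.0 - rho1) * g          # update biased first moment
            r = rho2 * r + (1.0 - rho2) * (g * g)    # update biased second moment (elementwise)
            s_hat = s / (1.0 - rho1 ** t)            # bias-corrected first moment
            r_hat = r / (1.0 - rho2 ** t)            # bias-corrected second moment
            delta_theta = -eps * s_hat / (np.sqrt(r_hat) + delta)
            theta = theta + delta_theta              # apply update
        return theta

    # Example with a toy quadratic loss L(theta) = ||theta||^2 (gradient 2*theta),
    # not the paper's loss; the iterates converge toward the zero vector.
    theta0 = np.array([5.0, -3.0])
    theta_opt = adam_update(theta0, grad_fn=lambda th: 2.0 * th, steps=2000)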