Lookahead is a type of stochastic optimizer that iteratively updates two sets of weights: "fast" and "slow". Intuitively, the algorithm chooses a search direction by looking ahead at the sequence of fast weights generated by another optimizer.
Algorithm 1: Lookahead Optimizer
Require: Initial parameters $\phi_0$, objective function $L$
Require: Synchronization period $k$, slow weights step size $\alpha$, optimizer $A$
for $t=1, 2, \dots$ do
  Synchronize parameters $\theta_{t,0} \gets \phi_{t-1}$
  for $i=1, 2, \dots, k$ do
    sample minibatch of data $d \sim \mathcal{D}$
    $\theta_{t,i} \gets \theta_{t,i-1} + A(L, \theta_{t,i-1}, d)$
  end for
  Perform outer update $\phi_t \gets \phi_{t-1} + \alpha (\theta_{t,k} - \phi_{t-1})$
end for
return parameters $\phi$
Source: Lookahead Optimizer: k steps forward, 1 step back
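
As a rough illustration of Algorithm 1, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes plain SGD as the inner optimizer $A$ and a toy noisy quadratic objective. The names `lookahead`, `sgd_step`, `grad_fn`, and `sample_batch` are hypothetical, and for simplicity the inner optimizer receives a gradient function rather than the objective $L$ itself.

```python
import random


def sgd_step(grad_fn, theta, batch, lr=0.1):
    """Inner optimizer A (assumed here to be plain SGD).

    Returns the update that is added to theta, matching
    theta_{t,i} <- theta_{t,i-1} + A(L, theta_{t,i-1}, d).
    """
    return [-lr * g for g in grad_fn(theta, batch)]


def lookahead(phi0, grad_fn, sample_batch, inner_step,
              k=5, alpha=0.5, outer_steps=100):
    """Run `outer_steps` rounds of Lookahead and return the slow weights."""
    phi = list(phi0)  # slow weights
    for _ in range(outer_steps):
        theta = list(phi)  # synchronize fast weights with slow weights
        for _ in range(k):
            d = sample_batch()  # sample a minibatch
            step = inner_step(grad_fn, theta, d)
            theta = [th + s for th, s in zip(theta, step)]
        # outer update: move slow weights a fraction alpha toward theta_{t,k}
        phi = [p + alpha * (th - p) for p, th in zip(phi, theta)]
    return phi


# Toy objective (an assumption for this example):
# L(theta; d) = 0.5 * ||theta - d||^2, so the gradient is theta - d.
def grad_fn(theta, d):
    return [th - di for th, di in zip(theta, d)]


rng = random.Random(0)
target = [3.0, -2.0]


def sample_batch():
    """Return a noisy observation of the target (stands in for a minibatch)."""
    return [t + rng.gauss(0.0, 0.1) for t in target]


phi = lookahead([0.0, 0.0], grad_fn, sample_batch, sgd_step,
                k=5, alpha=0.5, outer_steps=200)
print(phi)  # slow weights, expected to lie close to `target`
```

The outer update can equivalently be read as the interpolation $\phi_t = (1 - \alpha)\,\phi_{t-1} + \alpha\,\theta_{t,k}$, so $\alpha = 1$ reduces to running the inner optimizer alone. Any optimizer that returns an additive update could be passed as `inner_step` in this sketch.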
Task | Papers | Share |
---|---|---|
Machine Translation | 2 | 12.50% |
Translation | 2 | 12.50% |
Code Generation | 1 | 6.25% |
Instruction Following | 1 | 6.25% |
Language Modelling | 1 | 6.25% |
Mathematical Reasoning | 1 | 6.25% |
Domain Generalization | 1 | 6.25% |
Semantic Segmentation | 1 | 6.25% |
Text Generation | 1 | 6.25% |