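1. Train ten independent models and ensemble them.
2. Keep good snapshots of a single model during training and ensemble them.
3. A fancier version of the snapshot ensemble: use a learning rate schedule that reaches very high learning rates.
4. Keep a moving average of the parameter vector and use it at test time.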
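
The tricks listed above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names (`ensemble_predict`, `cyclic_lr`, `update_moving_average`), the cosine shape of the schedule, and the `decay` value are hypothetical choices, not something these notes specify. The "models" are stand-ins represented by their predicted class probabilities.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average class probabilities from several models or snapshots."""
    return np.mean(np.stack(prob_list, axis=0), axis=0)

def cyclic_lr(step, cycle_len, lr_max):
    """Assumed cosine schedule: restart at lr_max every cycle_len steps.

    A snapshot would typically be saved at the end of each cycle,
    when the learning rate is at its lowest.
    """
    t = (step % cycle_len) / cycle_len
    return 0.5 * lr_max * (1.0 + np.cos(np.pi * t))

def update_moving_average(avg_params, params, decay=0.999):
    """One exponential-moving-average step over a parameter vector."""
    return decay * avg_params + (1.0 - decay) * params

# Toy example: three "models" predicting 3 classes for 2 samples.
probs = [
    np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]),
    np.array([[0.5, 0.4, 0.1], [0.1, 0.7, 0.2]]),
    np.array([[0.7, 0.2, 0.1], [0.3, 0.3, 0.4]]),
]
avg_probs = ensemble_predict(probs)
labels = avg_probs.argmax(axis=1)  # predicted class per sample
```

In a real training loop, `update_moving_average` would be called after every parameter update, and the averaged vector (not the latest one) would be loaded into the model for evaluation.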