• /AllReduce: ResNet-50 models trained with AllReduce SGD

    • Training Details:
      • Seeds: 810975, 810976, 810977
      • Epochs: 90
      • Max LR: 1.0
      • LR scheduler: Cosine annealing with a linear warm-up in the first 5 epochs
      • Batch size: 1024
      • Momentum: 0.875
    • Results:
      • Top-1 Accuracy: 77.5327 ± 0.1685
      • Top-5 Accuracy: 93.6127 ± 0.0998
      • Val Loss: 1.9389 ± 0.0094
  • /DSGDm-8-ring: ResNet-50 models trained with decentralized SGD with momentum

    • Training Details:
      • Seeds: 810975, 810976, 810977
      • Epochs: 90
      • Max LR: 1.0
      • LR scheduler: Cosine annealing with a linear warm-up in the first 5 epochs
      • Batch size: 1024
      • Momentum: 0.875
      • Number of workers: 8
      • Communication topology: one-peer ring (time-varying topology)
    • Results:
      • Top-1 Accuracy: 77.4233 ± 0.1227
      • Top-5 Accuracy: 93.5407 ± 0.0546
      • Val Loss: 1.9332 ± 0.0031
  • /DSGDm-8-complete: ResNet-50 models trained with decentralized SGD with momentum

    • Training Details:
      • Seeds: 810975, 810976, 810977
      • Epochs: 90
      • Max LR: 1.0
      • LR scheduler: Cosine annealing with a linear warm-up in the first 5 epochs
      • Batch size: 1024
      • Momentum: 0.875
      • Number of workers: 8
      • Communication topology: complete
    • Results:
      • Top-1 Accuracy: 77.4440 ± 0.0694
      • Top-5 Accuracy: 93.6567 ± 0.0197
      • Val Loss: 1.9361 ± 0.0040
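
All three configurations above share the same learning-rate schedule: cosine annealing with a linear warm-up over the first 5 epochs, a max LR of 1.0, and 90 epochs in total. The following is a minimal sketch of such a schedule, not the training code used for these checkpoints. The function name `lr_at_epoch` is hypothetical, and the sketch assumes that the warm-up starts at 0, that the cosine phase decays to 0 at the final epoch, and that the schedule can be evaluated at fractional epochs (the actual runs may update the LR per step and may use a different start or end value).

```python
import math


def lr_at_epoch(epoch, max_lr=1.0, total_epochs=90, warmup_epochs=5):
    """Learning rate at a (possibly fractional) epoch.

    Assumptions (not confirmed by the model card): warm-up starts at 0
    and the cosine phase decays to 0 at `total_epochs`.
    """
    if epoch < warmup_epochs:
        # Linear warm-up from 0 to max_lr over the first `warmup_epochs`.
        return max_lr * epoch / warmup_epochs
    # Fraction of the cosine phase completed, in [0, 1].
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    # Half-cosine decay from max_lr down to 0.
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the schedule at a few epochs for a quick visual check.
    for e in (0, 1, 5, 30, 60, 90):
        print(f"epoch {e:2d}: lr = {lr_at_epoch(e):.4f}")
```

Under these assumptions the LR peaks at 1.0 at the end of epoch 5 and falls to half that value midway through the cosine phase (epoch 47.5).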