Description
Hey! I'm a researcher at OpenAI looking into trends in compute used by models. I'm excited to have found this repo, since it's the only EfficientNet implementation I've seen that claims to approximately reproduce the original performance.
I've got two runs going, each on a machine with 8 P100s:
./distributed_train.sh 8 /tmp/imagenet-extracted/ --model efficientnet_b0 --lr 0.035 -b 64 --drop 0.2 --img-size 224 --sched step --epochs 550 --decay-epochs 2 --decay-rate 0.975 --opt rmsproptf -j 8 --warmup-epochs 5 --warmup-lr 1e-6 --weight-decay 1e-5 --opt-eps .001 --model-ema
./distributed_train.sh 8 /tmp/imagenet-extracted/ --model efficientnet_b2 --lr 0.0175 -b 32 --drop 0.2 --img-size 224 --sched step --epochs 550 --decay-epochs 2 --decay-rate 0.975 --opt rmsproptf -j 8 --warmup-epochs 5 --warmup-lr 1e-6 --weight-decay 1e-5 --opt-eps .001 --model-ema
The only change I made from the parameters recommended here was scaling the learning rate you used (.27) based on the difference in batch size.
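For clarity, here's the scaling rule I applied as a minimal Python sketch; the `scale_lr` helper is purely illustrative (not part of this repo), and the assumption is plain linear scaling of the learning rate with global batch size:

```python
def scale_lr(reference_lr: float, reference_batch: int, n_gpus: int, per_gpu_batch: int) -> float:
    """Scale a reference learning rate linearly with the global (effective) batch size."""
    global_batch = n_gpus * per_gpu_batch
    return reference_lr * global_batch / reference_batch

# Using the two runs above as an example: the b0 run sees 8 * 64 = 512 images
# per step and the b2 run sees 8 * 32 = 256, so the b2 learning rate is half
# the b0 one.
b2_lr = scale_lr(reference_lr=0.035, reference_batch=512, n_gpus=8, per_gpu_batch=32)
print(b2_lr)  # 0.0175 -- half the global batch, half the learning rate
```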
I'd be very interested to know the specific learning rate and other hyperparameters you used for the EfficientNet-B2 run referenced in the README, and what has worked best in your B0 runs, since the learning rate above was given for the model family rather than for B0 specifically.