--finetune_forecast overwrites models from pre-training #258
Comments
What do you think @kacpnowak @clessig?
The intended solution is to have a new run_id when you fine-tune, i.e. whenever any settings in the config are changed. This should be changed to true by default. In general, we will pre-train a model and fine-tune it in different directions, which is only possible if each "stage" has a unique run_id. It might seem more complicated than necessary, but I think it is the only solution in the longer term. The lineage can then be tracked through a list/DAG of run_ids.
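As a rough illustration of that last point, here is a minimal sketch of tracking run_id lineage as a parent map (a simple special case of a DAG). The structure, the helper name, and all run_ids except bqcywx9m are hypothetical and not part of the current code base.

```python
# Minimal sketch (not the project's actual config format): each run_id records
# the run it was fine-tuned from, so the full lineage can be reconstructed.

lineage = {
    "bqcywx9m": None,        # pre-training run, no parent
    "k3f9a2xq": "bqcywx9m",  # forecast fine-tune branched off the pre-training run
    "p7d1m4zu": "bqcywx9m",  # a second fine-tune branched off the same parent
}

def ancestry(run_id, lineage):
    """Return the chain of run_ids from a run back to its original pre-training run."""
    chain = [run_id]
    while lineage.get(chain[-1]) is not None:
        chain.append(lineage[chain[-1]])
    return chain

print(ancestry("k3f9a2xq", lineage))  # ['k3f9a2xq', 'bqcywx9m']
```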
OK, I will use new run_ids then. Thanks for the clarification.
Can you open a PR to fix the default value?
I am in the process of writing a small design document so that we agree on the semantics of the run_ids.
Is your feature request related to a problem? Please describe.
When I run forecast fine-tuning with `--finetune_forecast`, training starts again from epoch 0 and overwrites all the checkpoints that were saved during pre-training, so we effectively lose the pre-trained models.
If fine-tuning runs for more epochs than pre-training did, we even lose the last pre-training checkpoint.
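To make the collision concrete, here is a hedged sketch that assumes checkpoints are currently written as /models/<run_id>/<run_id>_epochNNNNN.chkpt (inferred from the filenames quoted in the proposal below); the helper name is hypothetical and the real naming code may differ.

```python
# Sketch only: the path scheme is inferred, not taken from the code base.
from pathlib import Path

def checkpoint_path(models_dir, run_id, epoch):
    """Assumed current layout: /models/<run_id>/<run_id>_epochNNNNN.chkpt."""
    return models_dir / run_id / f"{run_id}_epoch{epoch:05d}.chkpt"

models = Path("/models")
pretrain = {checkpoint_path(models, "bqcywx9m", e) for e in range(55)}  # epochs 0..54
finetune = {checkpoint_path(models, "bqcywx9m", e) for e in range(22)}  # counter restarts at 0
print(len(pretrain & finetune))  # 22 -> every fine-tuning epoch clobbers a pre-training checkpoint
```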
Describe the solution you'd like
Either
/models/bqcywx9m/mtm_training/bqcywx9m_epoch00054.chkpt
/models/bqcywx9m/finetuning/bqcywx9m_epoch00021.chkpt
...
Ideally I would prefer option 2, then 1, then 3.
We can call these stage1, stage2, etc. as well...
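A minimal sketch of the per-stage layout shown in the example paths above; the function and parameter names are hypothetical and not the project's API.

```python
# Sketch of a per-stage checkpoint layout so fine-tuning never overwrites pre-training.
from pathlib import Path

def stage_checkpoint_path(models_dir, run_id, stage, epoch):
    """Build /models/<run_id>/<stage>/<run_id>_epochNNNNN.chkpt so stages never collide."""
    return models_dir / run_id / stage / f"{run_id}_epoch{epoch:05d}.chkpt"

print(stage_checkpoint_path(Path("/models"), "bqcywx9m", "mtm_training", 54))
# /models/bqcywx9m/mtm_training/bqcywx9m_epoch00054.chkpt
print(stage_checkpoint_path(Path("/models"), "bqcywx9m", "finetuning", 21))
# /models/bqcywx9m/finetuning/bqcywx9m_epoch00021.chkpt
```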
Describe alternatives you've considered
No response
Additional context
No response
Organisation
No response