-
-
Notifications
You must be signed in to change notification settings - Fork 6
Compute datetime features on-the-fly in nowcasting_dataloader. Remove datetime features from the on-disk batches. #208
Comments
It would be nice, probably with every data_source, to have a 'compute-on-the-file' option. I realise that doing this general method is a little be harder than just changin DateTime to be on the fly, but it might be worth putting that extra effort in |
I must admit that I think we should lean towards dictating one way of doing things (at least, for now), as it keeps the code that much simpler. So, for example, in this instance, I think we should - at some point, if we all agree - switch entirely from saving datetime features to disk to computing them on the fly. Supporting both options does make the code a bit more complex and harder to maintain and, especially while we're doing ML R&D, I think it's fine to limit code complexity by dictating "this is the one way this works" |
I agree with @JackKelly, I think we should try to keep this simpler if we can. There already will be a lot of different config options for all the different modalites, etc. that reducing them when we can I think is a good idea, and would make our process a bit faster. Especially for something like these datetime features that should be quick and easy to compute on the fly, dictating that we won't save them to disk I think is fine. |
IIUC, datetime encoding has been implemented in nowcasting_dataloader by Jacob in PR openclimatefix/nowcasting_dataloader#7 All that's left to do is to remove sin and cos datetime features from nowcasting_dataset :) |
Yeah, once that is merged, it'll compute the datetime features as part of the position encoding |
It's possible that we might want to remove datetime features from the on-disk batches, and instead compute datetime features on-the-fly, not least because we'll want to experiment with a variety of different ways of encoding position when we really start experimenting with the perceiver IO models (see openclimatefix/nowcasting_utils#30)
These datetime features should be really fast to compute. And, if we start saving different 'modalities' to disk as different files, then it will almost certainly be faster to compute these on-the-fly rather than saving them into tiny files on disk.
The text was updated successfully, but these errors were encountered: