faster s2i container start idea #220
Comments
It would save time. On the other hand, the image could be really big and that would slow deployments down. But it's the user's choice, so why not.
You mean re-deployments, where the data directory is already initialized? Well, in that case you still have the sql dump file baked into the image, and that is itself large. If we baked the binary datadir into the image instead of the sql file, we would likely get an even smaller image...
I might be missing something, but is the "sql restore" something we are supporting right now? Or is it a future use case using the new hooks that we could possibly make easier for users to achieve?
Discussion-only topic :-)! I would mark this with the question label, if I could. I'm trying to find a use case for myself for #208. Doing development of a python+postgresql project/app, my use case would be:
I don't know whether (a) I can do that right now with the supported container, (b) #208 is required, or (c) some other pull request is needed. I think (c) is right, but I'm not sure.
I was thinking mainly about storing data in the image. I don't know the details, but I think the image is transferred over the network several times during the app lifetime (pushed into the registry, pulled into each node where the image runs, ...). So the bigger the image is, the slower that process is. On the other hand, if data are stored in a persistent network volume, the data are transferred over the network anyway. So maybe storing initial data in the image isn't much slower :-) Also, OpenShift (Online) restricts the size of persistent storage. I haven't found a note about restricting image size, so maybe this is even an advantage :-D
In my opinion this does not sound like something the image should be taking care of. More like work for OpenShift itself (project backup?).
If you have a plain-text sql file with default data baked into the image, the space requirements are asymptotically equivalent. Of course, the db scenario might be that you fetch the data from the internet after db initialization, but that's no longer a task for s2i.
Hmm, maybe. Do you have a link?
Nope, I'm not sure if such a feature exists yet. It was just an idea of how it should ideally work.
Quick search revealed: ... But that does not seem like something we would want (it backs up only project configuration).
A full project snapshot would be nice, but that doesn't help with the use case I described -- because even though I want to have the initial state of the database "baked", the rest of the project moves forward during development... My thought on this is that we shouldn't support this directly, but it would be nice if we allowed users to implement this themselves (via s2i, once merged)... that is, it should be doable without ugly "workarounds".
The task for s2i could be to process the sql with the right postgresql version and put the database "to the internet" (volume, ...)
@omron93, can you elaborate on the use case more concretely? I'm not sure I follow.
I can imagine this scenario (nothing detailed, only the way I understand your goal):
Every build would do:
And the image, during start, could allow an option to obtain the database files from somewhere (for example copy ...). The benefit of s2i usage is that it will automatically create the right binary initial database when the image or sql data change! (What is wrong with this is that I think OpenShift doesn't support using persistent volumes during a build... and reusing them in deployments.) On the other hand ...
@omron93 so basically, instead of the result of an s2i build being just the image, it would be an image plus an initial DB living somewhere, and the DB would get re-initialized every time the image or input sql changes? That seems like an unnecessarily difficult way to achieve an always-initialized database. I would rather go with Pavel's original proposal of baking the data directly into the image, since that would work everywhere without too much hassle. Generally, we could provide users with some hooks that would be called during the assemble process if present, and leave them the freedom to do whatever they need to do.
Seems like the idea is pretty complicated; the assemble script is now too trivial. So to make this happen, we would have to have (a) a way to run ...
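The hook mechanism being discussed could be sketched roughly like this -- a minimal illustration, assuming a run script that executes any user-provided pre-start scripts found in a conventional directory (the function name and directory layout here are assumptions, not the container's actual API):

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical sketch: before starting postgres, execute every
# user-provided hook script found in a well-known directory.
# A missing directory or non-executable files are silently skipped.
run_pre_start_hooks() {
    local hook_dir="$1" hook
    [ -d "$hook_dir" ] || return 0
    for hook in "$hook_dir"/*; do
        if [ -f "$hook" ] && [ -x "$hook" ]; then
            "$hook"
        fi
    done
    return 0
}
```

Users could then drop arbitrary pre-start logic (such as restoring a data directory) into that directory without the image having to know about it.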
So can we vote on whether this makes sense? (likes/dislikes) I could then follow up with a PR adding the hook support and preparing an example leveraging this feature.
See #251 with WIP example. |
postgresql-pre-start hook example and test
- run 'initdb' from 'assemble', and bake the datadir into the image
- install hook which extracts the tarball when the data directory is not initialized
Fixes: sclorg#220
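The second bullet -- unpacking the baked tarball when the data directory is not yet initialized -- might be sketched like this. The paths and template filename are illustrative assumptions; a common way to detect an initialized datadir is the presence of the PG_VERSION file:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical pre-start hook sketch: if the data directory has not
# been initialized yet (no PG_VERSION file), unpack the data directory
# that was baked into the image at assemble time, instead of running
# the costly initdb.
restore_pgdata_template() {
    local pgdata="$1" template="$2"
    if [ ! -f "$pgdata/PG_VERSION" ]; then
        mkdir -p "$pgdata"
        tar -xzf "$template" -C "$pgdata" --strip-components=1
        chmod 0700 "$pgdata"  # postgres refuses to start on looser permissions
    fi
}
```

On re-deployments with an already-initialized persistent volume, the check makes the hook a no-op, so existing data is never overwritten.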
The current s2i proposal in #208 suffers from one pain point: even if the user provides the initial database state in the "sql" dump to be restored after "initdb", it takes more than several seconds to get the database initialized.
I'm curious whether we could run initdb also during the run of the assemble script, and then copy the data directory somewhere within the image -- IOW, whether we could have the binary data directory baked into the built s2i image. Then we could skip the initdb, just copy the baked directory under $PGDATA, and save a lot of time. WDYT?
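A minimal sketch of the assemble-time half of this idea. Since initdb needs a postgres installation, the initialization and dump-restore commands are shown as comments; only the archiving step is executable here, and all names and paths are illustrative assumptions:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical assemble-time sketch: initialize the data directory once
# during the s2i build and archive it into the image, so container
# start-up can skip initdb and just unpack the archive.
bake_pgdata() {
    local scratch="$1" template="$2"
    # In a real 'assemble' run, the scratch datadir would be produced by:
    #   initdb --pgdata="$scratch"
    #   ...then restore the user's sql dump into it...
    # Here only the baking step is performed:
    tar -C "$(dirname "$scratch")" -czf "$template" "$(basename "$scratch")"
}
```

The resulting tarball would live inside the built image, trading image size for start-up time, as discussed above.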