Skip to content

Is there any way to disable archiving? #3432

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mzwettler2 opened this issue Oct 24, 2022 · 25 comments
Closed

Is there any way to disable archiving? #3432

mzwettler2 opened this issue Oct 24, 2022 · 25 comments

Comments

@mzwettler2
Copy link

I found PGO making archiving mandatory:

        // Send WAL files to all configured repositories when not in recovery.
        // - https://pgbackrest.org/user-guide.html#quickstart/configure-archiving
        // - https://pgbackrest.org/command.html#command-archive-push
        // - https://www.postgresql.org/docs/current/runtime-config-wal.html
        archive := `pgbackrest --stanza=` + DefaultStanzaName + ` archive-push "%p"`
        outParameters.Mandatory.Add("archive_mode", "on")
        outParameters.Mandatory.Add("archive_command", archive)

This also requires having a repo retention and at least on scheduled backup to get automatic retention management working.

There might be cases where I do not want archiving, eg. loading a large amount of data.
There might be even cases where I do not want archiving nor backups, eg. some kind of temporary test systems.

QUESTION: Is there any way to disable archiving?

Environment

Please provide the following details:

  • Platform: Anthos
  • Platform Version: 1.10 (afaik)
  • PGO Image Tag: postgres-operator:ubi8-5.1.1-0
  • Postgres Version: 14
  • Storage: vm ware csi
  • Number of Postgres clusters: several
@laurivosandi
Copy link

I also find it quite odd that I have to enable backups

@nfzv
Copy link

nfzv commented Nov 8, 2022

Any solutions found?

@laurivosandi
Copy link

I mangled the CRD definition and set minIndex: 0 in the relevant section :D

@mzwettler2
Copy link
Author

@laurivosandi

What did you do exactly? Hacking the PGO sources?

@mzwettler2
Copy link
Author

@cbandy

Maybe this would be an enhancement request?

@laurivosandi
Copy link

JFYI: After hacking the CRD-s looks like WAL keeps growing forever :D

So yeah authors should address this issue themselves

@benjaminjb
Copy link
Contributor

benjaminjb commented Jan 3, 2023

Just dropping a line here to note that we have a ticket in our backlog for disabling backups/archiving (for dev envs only, please!)

(This is a feature that has been requested in issues a few times, see also
#2531
#3152
I'm adding these links to help consolidate this issue in case I close any of these tickets.)

@Pluggi
Copy link

Pluggi commented Jan 25, 2023

What is the timeline for this feature? Do you have an estimate on when this will be available?

@sigwinch28
Copy link

Any update on this?

@bendilley
Copy link

I've found a way to do this temporarily:

kubectl exec hippo-data-ljc7-0 -c database -it -- bash
vi /pgdata/pg14/postgresql.conf # and set archive_mode = 'off'
pg_ctl restart

It's very useful when my initial datasource is a big pg_dump file.

After turning archive_mode back on, kicking-off a full backup seems like a good idea:

kubectl create job --from=cronjob/hippo-repo1-full backup-post-import

@laurivosandi
Copy link

Just move on to https://cloudnative-pg.io/

@mzwettler2
Copy link
Author

@bendilley

I also tried this but it did not work for me. PGO quickly noticed the change and restored the original state with archive_mode='on'. Maybe this might be deferred with the new "pause" flag of PGO 5.2.

Anyway, it is not useful to trick this outside of PGO or Kubernetes.

@bendilley
Copy link

@mzwettler2 this was with PGO 5.3.0 but I didn't explicitly pause anything.

I do think it's surprisingly difficult to initialise a cluster from a non-PGO datasource. I've even tried restoring from a pgbackrest backup in S3, but because the original source wasn't PGO-managed I still couldn't get it to work as a dataSource. pg_restore does work, but a large dataset will fill the disk with transaction logs way faster than they can be streamed by pgbackrest, at least to S3.

@laurivosandi thanks for the link - I'm open to alternatives!

@bendilley
Copy link

Now that I think about it, I did also merge the below patch of the patroni dynamicConfiguration in an earlier experiment - it didn't appear to have any effect, but it was still in place when I manually turned off archive_mode, which might explain why I got away with it 🤷🏻

spec:
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          archive_mode: 'off'

@mzwettler2
Copy link
Author

@bendilley

At the top of this thread you see that "archive_mode: on" is hardcoded within PGO. It was probably just a coincidence for you.

@bendilley
Copy link

bendilley commented May 9, 2023

Perhaps I should ask this in a different thread, but how are you supposed to initialise a PGO database from a non-PGO datasource??

@mzwettler2
Copy link
Author

@bendilley

Yes, somewhat off topic here. Anyway, I would go with pg_dump/pg_restore. Seems to be the cleanest way for me.

@bendilley
Copy link

pg_restore works fine for smaller datasets, but migrating a large database will fill up the disk with WAL faster than it can be backed-up, at least that's my experience.

@mzwettler2
Copy link
Author

@bendilley

Yep. The only "stable" dirty hack I found to prevent for a filesystem full error has been soft-linking a subdirectory of the "/pgbackrest/repo1/archive/db" on the repo pod to "/dev/null". PostgreSQL generates archives and saves them to trash. PGO does not recognize nor undo this filesystem "change".

@johannes-engler-mw
Copy link

Any news regarding this issue?

@mzwettler2
Copy link
Author

@benjaminjb

We are hopeing and praying for this feature. When will it come?

@bendilley
Copy link

Further to my earlier observation, I've found this combination of steps to be consistently effective:

kubectl patch postgrescluster hippo --type merge --patch-file patch-patroni-archive-mode.yml
kubectl exec hippo-data-ljc7-0 -c database -it -- bash
vi /pgdata/pg14/postgresql.conf # and set archive_mode = 'off'
pg_ctl restart
exit

where patch-patroni-archive-mode.yml is

spec:
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          archive_mode: 'off'

I've only ever done it while pg_restoreing a dump file. Presumably the setting wouldn't persist if the container was restarted.

After the restore is complete, I just reverse the changes and kick-off a full backup:

kubectl exec hippo-data-ljc7-0 -c database -it -- bash
vi /pgdata/pg14/postgresql.conf # and set archive_mode = 'on'
pg_ctl restart
exit

kubectl patch postgrescluster hippo --type json -p '[{"op": "remove", "path": "/spec/patroni/dynamicConfiguration/postgresql/parameters/archive_mode"}]'
kubectl create job --from=cronjob/hippo-repo1-full backup-post-import

@andrewlecuyer
Copy link
Collaborator

Just dropping a quick update on this thread to note that we are currently taking a look at some upcoming plans and changes to CPK to allow allow additional flexibility around disabling archiving (as requested in this issue).

Therefore, I simply wanted to note that this issue is definitely on our radar, and we will be in touch with additional information, details, etc. once available.

@dsessler7
Copy link
Contributor

I'm pleased to announce that support for optional backups has been released with Crunchy Postgres for Kubernetes v5.7.0. Please see the docs for more info. If you get a chance to try it out, please let us know what you think!

@mzwettler2
Copy link
Author

@dsessler7

Thanks for the feature. I will try it out when I find enough time.

I quickly skimmed the documentation. I would have preferred a simple spec.backups.status switch with a three-way logic.

on: archives + backups (default)
off: no archives + backups
delete: no archives + backups; delete all existing backup artifacts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants