Simple PostgreSQL to Backblaze B2 Backup Kubernetes CronJob
A database without a backup is not much better than no database at all. You never know when your server might fail or when you accidentally push schema changes that delete all existing data (ask me how I know 😅). This article explores a simple yet robust solution for automating PostgreSQL backups: a Kubernetes CronJob that dumps a complete Postgres instance and uploads the dump to Backblaze B2 cloud storage.
The same approach works with other storage options like AWS S3 or MinIO (if you want to stay fully open source). B2 just happens to be quite cheap, and object storage in general works great for this kind of backup.
CronJobs
Kubernetes CronJobs provide a flexible way to schedule tasks at fixed intervals, making them ideal for automated processes like backups. They build upon Kubernetes Jobs and add the ability to run them at regular intervals. For scheduling, CronJobs use the beloved cron format to define when the job should run. This format is so great that I always have to refer to crontab.guru to understand what I’m doing…
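For quick reference, the five fields of a cron expression are, from left to right, minute, hour, day of month, month, and day of week:

# minute  hour  day-of-month  month  day-of-week
"* * * * *"   # every minute
"0 2 * * *"   # every day at 02:00
"0 3 * * 0"   # every Sunday at 03:00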
Anyway, let’s look at a simple CronJob manifest.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: test-cron-job
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: test
              image: busybox:latest
              imagePullPolicy: IfNotPresent
              command: ["/bin/sh", "-c"]
              args:
                - |
                  date
                  echo "Hello, World!"
          restartPolicy: OnFailure
This CronJob runs every minute and prints the current date and “Hello, World!” to the logs. If you’re already familiar with 'normal' Jobs, you’ll notice that the CronJob only adds the schedule field and the jobTemplate field (which in turn contains the Job spec).
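If you want to try it out, the following should be enough to apply the manifest and watch the Jobs it spawns (the file name test-cron-job.yaml is just an example, use whatever you saved the manifest as):

kubectl apply -f test-cron-job.yaml
kubectl get cronjob test-cron-job   # shows the schedule, last run and active Jobs
kubectl get jobs --watch            # a new Job should appear every minute
kubectl logs job/<job-name>         # prints the date and "Hello, World!"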
CronJob for Postgres Backups
As you’ll shortly see, the CronJob for PostgreSQL backups is a bit more complex than the simple example above. To make it easier to digest, I’ll go over the most important parts before showing the full manifest.
- Schedule: The CronJob is scheduled to run at 2 AM every day. That’s often enough for me and at a time when my database usually doesn’t experience any load. You might want to adjust this to your needs.

  schedule: "0 2 * * *"
- Dumping Postgres: This container uses the latest postgres image (which contains all the Postgres tools we need) to dump the database. Setting the environment variable PGPASSWORD allows our default user postgres to connect to the database. In this simple example, I’ve hardcoded the user and the name of the database service in our cluster. You might want to extract those values into a Secret or ConfigMap; the password is already sourced from a Secret (a sketch of both Secrets follows after this list). If you look closely, you’ll notice that this container only dumps the databases, runs the dump through gzip, and stores it in a volume mounted at /mnt/backup. No uploading to B2 yet.

  containers:
    - name: postgres-backup
      image: postgres:latest
      env:
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: POSTGRES_PASSWORD_ROOT
      command: ["/bin/sh", "-c"]
      args:
        - |
          BACKUP_FILE="pg_backup_$(date +%Y%m%d_%H%M%S).sql.gz"
          echo "Creating backup $BACKUP_FILE..."
          pg_dumpall -h postgres.default.svc.cluster.local -U postgres | gzip > /mnt/backup/$BACKUP_FILE
      volumeMounts:
        - name: backup-storage
          mountPath: /mnt/backup
- Uploading the dump: Since uploading to B2 is easiest with the b2 (or b2v4) CLI tool, which is not present in the postgres image, I added a second container that takes the dump and uploads it. This container is configured with an application key, its key ID, and the name of the bucket to upload to. All of these values are sourced from a Secret. The actual script is quite simple: it takes the latest dump from the shared backup volume and pushes it to B2 using the CLI.

  containers:
    ...
    - name: b2-uploader
      image: backblazeit/b2:latest
      env:
        - name: B2_APPLICATION_KEY_ID
          valueFrom:
            secretKeyRef:
              name: db-backup-secret
              key: access_key_id
        - name: B2_APPLICATION_KEY
          valueFrom:
            secretKeyRef:
              name: db-backup-secret
              key: application_key
        - name: B2_BUCKET_NAME
          valueFrom:
            secretKeyRef:
              name: db-backup-secret
              key: bucket_name
      command: ["/bin/sh", "-c"]
      args:
        - |
          BACKUP_FILE=$(ls /mnt/backup/pg_backup_*.sql.gz | sort -r | head -n1)
          b2v4 file upload $B2_BUCKET_NAME "$BACKUP_FILE" "$(basename $BACKUP_FILE)"
      volumeMounts:
        - name: backup-storage
          mountPath: /mnt/backup
Putting it all together
One crucial piece is still missing to make this work: we need to coordinate the two containers so that the uploader only starts once the dump is finished. The simplest solution I could think of was to write a special file to the shared volume once the dump is completed and have the uploader wait for this file to appear using a simple while loop.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: postgres-backup
              image: postgres:latest
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-secret
                      key: POSTGRES_PASSWORD_ROOT
              command: ["/bin/sh", "-c"]
              args:
                - |
                  BACKUP_FILE="pg_backup_$(date +%Y%m%d_%H%M%S).sql.gz"
                  echo "Creating backup $BACKUP_FILE..."
                  pg_dumpall -h postgres.default.svc.cluster.local -U postgres | gzip > /mnt/backup/$BACKUP_FILE
                  echo "Backup completed."
                  # signal the next container that the dump is completed
                  touch /mnt/backup/.done
              volumeMounts:
                - name: backup-storage
                  mountPath: /mnt/backup
            - name: b2-uploader
              image: backblazeit/b2:latest
              env:
                - name: B2_APPLICATION_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: db-backup-secret
                      key: access_key_id
                - name: B2_APPLICATION_KEY
                  valueFrom:
                    secretKeyRef:
                      name: db-backup-secret
                      key: application_key
                - name: B2_BUCKET_NAME
                  valueFrom:
                    secretKeyRef:
                      name: db-backup-secret
                      key: bucket_name
              command: ["/bin/sh", "-c"]
              args:
                - |
                  # Check every 5 seconds if the dump is completed
                  while [ ! -f /mnt/backup/.done ]; do
                    sleep 5
                  done
                  BACKUP_FILE=$(ls /mnt/backup/pg_backup_*.sql.gz | sort -r | head -n1)
                  b2v4 file upload $B2_BUCKET_NAME "$BACKUP_FILE" "$(basename $BACKUP_FILE)"
              volumeMounts:
                - name: backup-storage
                  mountPath: /mnt/backup
          restartPolicy: OnFailure
          volumes:
            - name: backup-storage
              emptyDir: {}
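Instead of waiting for 2 AM, you can trigger a run manually to verify that everything works. Kubernetes can create a one-off Job from a CronJob (the Job name postgres-backup-manual is arbitrary):

kubectl create job --from=cronjob/postgres-backup postgres-backup-manual
kubectl logs job/postgres-backup-manual -c postgres-backup
kubectl logs job/postgres-backup-manual -c b2-uploader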
Conclusion
Implementing an automated backup solution with CronJobs, as shown above, increases resilience and allows for quick recovery in case of data loss.
Building upon this foundation, it’s not rocket science to add an init script to your database that checks for backup availability and restores the latest dump if necessary. You could also move restoring to an init container (which is what I’d recommend, since you can again use the b2 image to download the dump); a rough sketch of such a restore script follows below.
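This is only a sketch under a few assumptions: the b2v4 download syntax and the example file name are placeholders to adjust to your CLI version and bucket contents, and PGPASSWORD plus the B2 credentials would be provided via environment variables just like in the backup job.

# download a dump from B2 (assumed b2v4 syntax, file name is only an example)
b2v4 file download "b2://$B2_BUCKET_NAME/pg_backup_20240101_020000.sql.gz" /tmp/restore.sql.gz

# pg_dumpall produces plain SQL, so the dump can simply be replayed with psql
gunzip -c /tmp/restore.sql.gz | psql -h postgres.default.svc.cluster.local -U postgres -d postgres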
Or you could add features like only keeping the last n backups or sending notifications if backups fail.