PostgreSQL High Availability with Crunchy Data PGO: Enterprise Kubernetes Deployment
Running PostgreSQL in production on Kubernetes demands more than a StatefulSet with a persistent volume. You need automated failover, continuous backup and point-in-time recovery, connection pooling, TLS encryption, monitoring, and the ability to deploy consistently across cloud providers and bare metal. Crunchy Data PGO (Postgres Operator) v5 delivers all of this through a single Kubernetes-native custom resource — the PostgresCluster CRD — backed by battle-tested components: Patroni for consensus-based HA, pgBackRest for enterprise backup, PgBouncer for connection pooling, and pgMonitor for Prometheus-compatible observability.
This guide covers everything required to take a PGO-managed PostgreSQL cluster from initial deployment to production-grade operation across AWS EKS, Azure AKS, GCP GKE, and bare metal k3s with Rancher. Every section includes concrete YAML, Helm values, and operational procedures you can adapt to your environment.
Crunchy Data PGO v5 Architecture
PGO v5 is a complete rewrite of the Crunchy Postgres Operator. It replaces the previous pgcluster/pgreplica/pgpolicy CRDs with a single PostgresCluster resource that declaratively describes every aspect of a PostgreSQL deployment. The operator watches for changes to this resource and reconciles the underlying Kubernetes objects — StatefulSets, Services, ConfigMaps, Secrets, Jobs — to match the desired state.
The architecture is built on four pillars. Patroni runs as a sidecar in every PostgreSQL pod and manages leader election, replication topology, and automatic failover using Kubernetes-native distributed consensus. pgBackRest handles full, differential, and incremental backups plus continuous WAL archiving to object storage (S3, GCS, Azure Blob) or local PVCs. PgBouncer provides lightweight connection pooling that shields PostgreSQL from connection storms. pgMonitor exposes PostgreSQL metrics through a Prometheus exporter sidecar for integration with your existing observability stack.
The operator itself is stateless — all persistent state lives in the PostgreSQL cluster and its backup repository. This means you can upgrade or restart the operator without affecting running databases. The operator reconciliation loop is idempotent: applying the same PostgresCluster spec multiple times produces the same set of Kubernetes objects.
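The reconcile pattern can be sketched as a toy model in Python (illustrative only — PGO itself is written in Go, and the object names below are simplified placeholders, not PGO's actual resources): derive the desired child objects from the spec, then apply only the difference. Running it twice with the same spec changes nothing, which is exactly what idempotence means here.

```python
def desired_objects(spec):
    """Derive the child objects a cluster spec implies (toy model)."""
    objs = {("Service", f"{spec['name']}-primary"): {"role": "primary"}}
    for inst in spec["instances"]:
        for i in range(inst["replicas"]):
            objs[("StatefulSet", f"{spec['name']}-{inst['name']}-{i}")] = {
                "image": spec["image"]
            }
    return objs

def reconcile(spec, cluster_state):
    """Mutate cluster_state toward the desired state; return changed keys."""
    desired = desired_objects(spec)
    changed = []
    for key, obj in desired.items():
        if cluster_state.get(key) != obj:
            cluster_state[key] = obj
            changed.append(key)
    # Garbage-collect objects the spec no longer calls for
    for key in list(cluster_state):
        if key not in desired:
            del cluster_state[key]
            changed.append(key)
    return changed

spec = {"name": "production-db", "image": "postgres:16",
        "instances": [{"name": "pgha1", "replicas": 3}]}
state = {}
first = reconcile(spec, state)   # creates 1 Service + 3 StatefulSets
second = reconcile(spec, state)  # no-op: same spec, same state
print(len(first), len(second))   # 4 0
```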
Installing PGO v5
PGO v5 can be installed via Helm or direct kubectl manifests. The Helm approach is preferred for production because it integrates cleanly with GitOps workflows and provides version-controlled upgrades.
# Add the Crunchy Data Helm repository
helm repo add crunchydata https://charts.crunchydata.com
helm repo update
# Install PGO into its own namespace
helm install pgo crunchydata/pgo \
--namespace postgres-operator \
--create-namespace \
--set controllerImages.cluster=registry.crunchydata.com/crunchydata/postgres-operator:ubi8-5.7.2-0 \
--set relatedImages.postgres_16.image=registry.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.6-1 \
--set relatedImages.pgbackrest.image=registry.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.54.1-0 \
--set relatedImages.pgbouncer.image=registry.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.23.1-0 \
--set relatedImages.pgexporter.image=registry.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-0.16.0-0
For air-gapped environments, mirror the required images to your internal registry and update the relatedImages values accordingly. PGO will only pull images specified in the Helm values or the PostgresCluster spec — it never reaches out to external registries at runtime unless explicitly configured to do so.
# Verify the operator is running
kubectl get pods -n postgres-operator
NAME READY STATUS RESTARTS AGE
pgo-controller-manager-7d8f9b6c4-xk2pt 1/1 Running 0 45s
PostgresCluster CRD Specification
The PostgresCluster CRD is the single declarative surface for your entire PostgreSQL deployment. Below is a production-ready specification that demonstrates the key sections. Each section is discussed in detail in subsequent parts of this guide.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: production-db
namespace: databases
spec:
postgresVersion: 16
image: registry.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.6-1
instances:
- name: pgha1
replicas: 3
resources:
requests:
cpu: "2"
memory: 8Gi
limits:
cpu: "4"
memory: 16Gi
dataVolumeClaimSpec:
storageClassName: gp3-csi
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 100Gi
walVolumeClaimSpec:
storageClassName: gp3-csi
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 20Gi
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- topologyKey: kubernetes.io/hostname
labelSelector:
matchLabels:
postgres-operator.crunchydata.com/cluster: production-db
tolerations:
- key: "workload"
operator: "Equal"
value: "database"
effect: "NoSchedule"
backups:
pgbackrest:
image: registry.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.54.1-0
configuration:
- secret:
name: pgbackrest-s3-creds
global:
repo1-retention-full: "7"
repo1-retention-diff: "14"
repo1-path: /pgbackrest/production-db/repo1
repo1-s3-uri-style: path
repos:
- name: repo1
schedules:
full: "0 2 * * 0" # Sunday 2am
differential: "0 2 * * 1-6" # Mon-Sat 2am
incremental: "0 */4 * * *" # Every 4 hours
s3:
bucket: mycompany-pg-backups
endpoint: s3.eu-west-1.amazonaws.com
region: eu-west-1
proxy:
pgBouncer:
image: registry.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.23.1-0
replicas: 2
resources:
requests:
cpu: 500m
memory: 256Mi
limits:
cpu: "1"
memory: 512Mi
config:
global:
pool_mode: transaction
max_client_conn: "1000"
default_pool_size: "25"
min_pool_size: "5"
reserve_pool_size: "5"
reserve_pool_timeout: "3"
monitoring:
pgmonitor:
exporter:
image: registry.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-0.16.0-0
resources:
requests:
cpu: 100m
memory: 128Mi
patroni:
dynamicConfiguration:
synchronous_mode: true
postgresql:
parameters:
shared_buffers: 2GB
effective_cache_size: 6GB
work_mem: 32MB
maintenance_work_mem: 512MB
max_connections: 200
wal_buffers: 64MB
checkpoint_completion_target: 0.9
random_page_cost: 1.1
effective_io_concurrency: 200
min_wal_size: 1GB
max_wal_size: 4GB
max_worker_processes: 8
max_parallel_workers_per_gather: 4
max_parallel_workers: 8
log_min_duration_statement: 500
log_checkpoints: "on"
log_connections: "on"
log_disconnections: "on"
log_lock_waits: "on"
users:
- name: appuser
databases: ["appdb"]
options: "NOSUPERUSER"
- name: readonly
databases: ["appdb"]
options: "NOSUPERUSER"
databaseInitSQL:
name: init-sql-configmap
key: init.sql
This specification creates a three-instance PostgreSQL 16 cluster with separate data and WAL volumes, S3-backed backups on a full/differential/incremental schedule, PgBouncer connection pooling in transaction mode, Prometheus metrics export, and Patroni synchronous replication. The pod anti-affinity rule ensures all three instances land on different Kubernetes nodes.
Patroni-Based HA and Automatic Failover
Patroni is the HA framework embedded in every PGO-managed PostgreSQL pod. It uses the Kubernetes API as its distributed configuration store (DCS) — no separate etcd or ZooKeeper cluster is required. Patroni continuously monitors the health of the PostgreSQL primary and replicas. When the primary becomes unresponsive, Patroni initiates an automatic failover: it promotes the most up-to-date replica to primary and reconfigures the remaining replicas to follow the new leader.
The failover process in PGO works as follows. Patroni on each pod competes for a leader lock in Kubernetes (via Endpoints or ConfigMap objects). The current primary must renew this lock before its TTL expires (Patroni defaults: a 30 second TTL with a 10 second loop wait). If the primary fails to renew — because it crashed, the node died, or the network partitioned it — the replica closest to the primary's WAL position acquires the lock and promotes itself. The entire process typically completes well under a minute.
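The lease mechanics behind the leader lock can be modeled in a few lines (an illustrative Python toy, not Patroni's implementation): a member may take the lock only when no holder has renewed it within the TTL.

```python
class LeaderLock:
    """Toy model of a DCS leader lease with a TTL (illustrative only)."""
    def __init__(self, ttl):
        self.ttl = ttl
        self.holder = None
        self.renewed_at = None

    def acquire(self, member, now):
        """Renew if we hold the lock; otherwise take it only if expired."""
        expired = (self.holder is None or
                   now - self.renewed_at >= self.ttl)
        if expired or self.holder == member:
            self.holder, self.renewed_at = member, now
            return True
        return False

lock = LeaderLock(ttl=30)
assert lock.acquire("pgha1-0", now=0)        # primary takes the lock
assert not lock.acquire("pgha1-1", now=20)   # lease still valid, replica waits
# primary crashes and stops renewing; the lease lapses at t=30
assert lock.acquire("pgha1-1", now=35)       # replica acquires and promotes
```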
Synchronous replication mode, enabled via synchronous_mode: true in the Patroni configuration, guarantees zero data loss (RPO = 0) at the cost of slightly higher write latency. In synchronous mode, a transaction is not acknowledged to the client until at least one replica has confirmed receiving the WAL. If no synchronous replica is available, Patroni temporarily disables synchronous mode to maintain availability — you can override this with synchronous_mode_strict: true if you prefer to sacrifice availability for consistency.
# Check Patroni cluster status
kubectl exec -it production-db-pgha1-0 -n databases -- patronictl list
+----------------------------+-------------------+---------+---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+----------------------------+-------------------+---------+---------+----+-----------+
| production-db-pgha1-0 | 10.244.0.15 | Leader | running | 3 | |
| production-db-pgha1-1 | 10.244.1.22 | Replica | running | 3 | 0 |
| production-db-pgha1-2 | 10.244.2.18 | Replica | running | 3 | 0 |
+----------------------------+-------------------+---------+---------+----+-----------+
# Manual switchover (planned, zero downtime)
kubectl exec -it production-db-pgha1-0 -n databases -- \
patronictl switchover --leader production-db-pgha1-0 --candidate production-db-pgha1-1 --force
# Manual failover (forced, for emergencies)
kubectl exec -it production-db-pgha1-1 -n databases -- \
patronictl failover --candidate production-db-pgha1-1 --force
PGO automatically configures the Kubernetes Services to track the Patroni leader. The production-db-primary service always points to whichever pod currently holds the leader lock, so applications connecting through this service experience seamless failover with only a brief connection reset.
pgBackRest Backup and Restore
pgBackRest is the backup engine integrated into PGO. It supports three backup types — full, differential, and incremental — plus continuous WAL archiving for point-in-time recovery (PITR). Understanding how these fit together is essential for designing a backup strategy that balances storage cost, backup speed, and recovery time.
A full backup copies the entire PostgreSQL data directory and is the baseline for all other backup types. A differential backup copies only files that changed since the last full backup. An incremental backup copies only files that changed since the last backup of any type. During restore, pgBackRest automatically chains the required backups together — for example, restoring from an incremental requires the incremental plus the prior differential (or full) plus the full backup, plus any WAL segments needed to reach the target recovery point.
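The chaining rule can be made concrete with a short sketch (my own illustration of the dependency rules described above, not pgBackRest internals): walk backward from the chosen backup, following each backup's dependency until a full is reached.

```python
def restore_chain(backups, target_label):
    """Return the ordered list of backup labels needed to restore target.

    `backups` is chronological; each entry has a 'type' of
    'full', 'diff', or 'incr' (sketch of the dependency rules).
    """
    idx = {b["label"]: i for i, b in enumerate(backups)}
    chain = []
    i = idx[target_label]
    while True:
        b = backups[i]
        chain.append(b["label"])
        if b["type"] == "full":
            break
        if b["type"] == "diff":
            # a differential depends on the most recent prior full
            i = max(j for j in range(i) if backups[j]["type"] == "full")
        else:
            # an incremental depends on the most recent prior backup of any type
            i -= 1
    return list(reversed(chain))

backups = [
    {"label": "F1", "type": "full"},
    {"label": "D1", "type": "diff"},
    {"label": "I1", "type": "incr"},
]
print(restore_chain(backups, "I1"))  # ['F1', 'D1', 'I1']
```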
Backup Schedule Configuration
The backup schedule is defined in the repos section of the pgBackRest configuration within the PostgresCluster spec. A solid production schedule typically runs a weekly full, daily differential, and more frequent incremental backups.
backups:
pgbackrest:
global:
repo1-retention-full: "4" # Keep 4 full backups
repo1-retention-full-type: count
repo1-retention-diff: "14" # Keep 14 differentials
repos:
- name: repo1
schedules:
full: "0 2 * * 0" # Weekly full: Sunday 2am
differential: "0 2 * * 1-6" # Daily diff: Mon-Sat 2am
incremental: "0 */6 * * *" # Every 6 hours
s3:
bucket: mycompany-pg-backups
endpoint: s3.eu-west-1.amazonaws.com
region: eu-west-1
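Count-based retention can be illustrated with a small sketch (assumed semantics based on the settings above, not pgBackRest's code): keep the newest N full backups, and expire everything older than the oldest retained full, since differentials and incrementals that depend on an expired full are no longer restorable.

```python
def expire(backups, retention_full):
    """Keep the newest `retention_full` full backups; drop older fulls
    and any diff/incr taken before the oldest retained full (sketch of
    count-based retention, not pgBackRest's actual implementation)."""
    fulls = [i for i, b in enumerate(backups) if b["type"] == "full"]
    if len(fulls) <= retention_full:
        return backups
    cutoff = fulls[-retention_full]  # index of the oldest full we keep
    return backups[cutoff:]

history = [
    {"label": "F1", "type": "full"}, {"label": "D1", "type": "diff"},
    {"label": "F2", "type": "full"}, {"label": "I1", "type": "incr"},
    {"label": "F3", "type": "full"},
]
kept = expire(history, retention_full=2)
print([b["label"] for b in kept])  # ['F2', 'I1', 'F3']
```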
Manual Backup and Restore
# Trigger a manual full backup
kubectl annotate postgrescluster production-db -n databases \
postgres-operator.crunchydata.com/pgbackrest-backup="$(date +%s)" \
--overwrite
# Check backup status
kubectl exec -it production-db-pgha1-0 -n databases -- \
pgbackrest info --stanza=db
# Restore to a specific point in time
# First, update the PostgresCluster spec:
spec:
backups:
pgbackrest:
restore:
enabled: true
repoName: repo1
options:
- --type=time
- --target="2026-04-12 08:30:00+00"
- --target-action=promote
# Apply the restore annotation
kubectl annotate postgrescluster production-db -n databases \
postgres-operator.crunchydata.com/pgbackrest-restore="$(date +%s)" \
--overwrite
During a PITR restore, PGO shuts down all PostgreSQL instances, restores from the nearest full or differential backup, replays WAL segments up to the target time, and then starts the cluster. The entire operation is orchestrated by the operator — no manual intervention on individual pods is needed.
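The base-backup selection step can be sketched like this (an illustration of the rule, not pgBackRest's selection code): pick the newest backup that completed at or before the recovery target, and let WAL replay cover the remaining distance.

```python
from datetime import datetime

def pick_base_backup(backups, target):
    """Choose the newest backup that finished at or before the PITR
    target; WAL replay bridges the gap to the exact target time."""
    candidates = [b for b in backups if b["finished"] <= target]
    return max(candidates, key=lambda b: b["finished"])["label"]

backups = [
    {"label": "full-sun", "finished": datetime(2026, 4, 12, 2, 30)},
    {"label": "incr-0400", "finished": datetime(2026, 4, 12, 4, 10)},
    {"label": "incr-0800", "finished": datetime(2026, 4, 12, 8, 10)},
]
# Target 08:30 -> restore the incr-0800 chain, then replay WAL to 08:30
print(pick_base_backup(backups, datetime(2026, 4, 12, 8, 30)))  # incr-0800
```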
PgBouncer Connection Pooling
PostgreSQL creates a new process for every client connection. At scale — hundreds or thousands of application pods each maintaining connection pools — this model breaks down. PgBouncer sits between your application and PostgreSQL, multiplexing many client connections over a smaller number of server connections.
PGO deploys PgBouncer as a separate Deployment with its own Service. The production-db-pgbouncer service is what your applications should connect to, not the PostgreSQL primary service directly. PgBouncer supports three pool modes:
- Session pooling — a server connection is assigned to a client for the lifetime of the client connection. Safest, but least efficient.
- Transaction pooling — a server connection is assigned only for the duration of a transaction. Most efficient for web workloads. This is the recommended default.
- Statement pooling — a server connection is assigned for a single statement. Only works for simple, non-transactional workloads.
proxy:
pgBouncer:
replicas: 3
config:
global:
pool_mode: transaction
max_client_conn: "2000"
default_pool_size: "30"
min_pool_size: "10"
reserve_pool_size: "10"
reserve_pool_timeout: "3"
server_idle_timeout: "300"
server_lifetime: "3600"
client_idle_timeout: "600"
tcp_keepalive: "1"
tcp_keepidle: "30"
tcp_keepintvl: "10"
tcp_keepcnt: "3"
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
topologyKey: kubernetes.io/hostname
labelSelector:
matchLabels:
postgres-operator.crunchydata.com/role: pgbouncer
Note the tcp_keepalive settings — these are critical when PgBouncer runs behind a cloud load balancer or Kubernetes Service. Without aggressive keepalives, idle connections may be silently dropped by intermediate network components, causing application errors on the next query attempt.
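These pool settings also translate into a concrete server-connection budget. A rough sizing sketch (back-of-the-envelope arithmetic, not PgBouncer source): each PgBouncer replica maintains its own pools, so the worst case scales with the replica count.

```python
def backend_connections(pgbouncer_replicas, default_pool_size,
                        reserve_pool_size, pools_per_replica=1):
    """Worst-case server connections PgBouncer may open toward PostgreSQL.

    Pools are per database/user pair and per replica, so the budget is
    replicas x pools x (default + reserve) in the worst case
    (rough sizing arithmetic, not an official formula).
    """
    per_replica = pools_per_replica * (default_pool_size + reserve_pool_size)
    return pgbouncer_replicas * per_replica

# 3 replicas x (30 + 10) connections for a single db/user pool:
needed = backend_connections(3, default_pool_size=30, reserve_pool_size=10)
print(needed)  # 120 -- must fit within max_connections minus system slots
```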
pgMonitor and Prometheus/Grafana Monitoring
PGO's monitoring integration deploys a crunchy-postgres-exporter sidecar in every PostgreSQL pod. This exporter scrapes PostgreSQL's internal statistics views and exposes them as Prometheus metrics on port 9187. The exporter covers over 150 metrics out of the box, including connections, replication lag, transaction rates, cache hit ratios, table and index statistics, lock contention, and WAL generation rates.
# ServiceMonitor for Prometheus Operator auto-discovery
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: postgres-monitoring
namespace: databases
labels:
release: kube-prometheus-stack
spec:
selector:
matchLabels:
postgres-operator.crunchydata.com/cluster: production-db
postgres-operator.crunchydata.com/crunchy-postgres-exporter: "true"
endpoints:
- port: exporter
interval: 15s
scrapeTimeout: 10s
Crunchy Data provides a set of pre-built Grafana dashboards that you can import directly. These cover PostgreSQL overview, replication status, pgBackRest backup status, PgBouncer statistics, and pod-level resource usage. Import them via Grafana's dashboard provisioning or manually from the Crunchy Data examples repository.
# Install Grafana dashboards as ConfigMaps
kubectl create configmap grafana-pg-overview \
--from-file=pg-overview.json=dashboards/pg-overview.json \
-n monitoring
kubectl label configmap grafana-pg-overview grafana_dashboard=1 -n monitoring
Critical Alert Rules for PostgreSQL
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: postgres-alerts
namespace: databases
labels:
release: kube-prometheus-stack
spec:
groups:
- name: postgresql-health
rules:
- alert: PostgresReplicationLagHigh
expr: ccp_replication_lag_size_bytes > 104857600
for: 5m
labels:
severity: warning
annotations:
summary: "Replica {{ $labels.pod }} lag exceeds 100MB"
- alert: PostgresConnectionsExhausted
expr: ccp_connection_stats_active / ccp_connection_stats_max_connections > 0.85
for: 2m
labels:
severity: critical
annotations:
summary: "PostgreSQL connections above 85% capacity"
- alert: PostgresDeadlockDetected
expr: rate(ccp_stat_database_deadlocks[5m]) > 0
for: 1m
labels:
severity: warning
annotations:
summary: "Deadlocks detected in {{ $labels.datname }}"
- alert: PostgresCacheHitRatioLow
expr: ccp_stat_database_blks_hit / (ccp_stat_database_blks_hit + ccp_stat_database_blks_read) < 0.95
for: 10m
labels:
severity: warning
annotations:
summary: "Cache hit ratio below 95% for {{ $labels.datname }}"
- alert: PgBackRestStaleBackup
expr: ccp_backrest_last_full_backup_time_since_completion_seconds > 604800
for: 1h
labels:
severity: critical
annotations:
summary: "No full backup in the last 7 days"
AWS EKS Deployment
Deploying PGO on AWS EKS requires configuring the EBS CSI driver for persistent volumes and IAM Roles for Service Accounts (IRSA) for S3 backup access. This approach avoids storing long-lived AWS credentials in Kubernetes secrets.
# Install the EBS CSI driver (required for gp3 volumes)
eksctl create addon --name aws-ebs-csi-driver --cluster my-cluster --region eu-west-1 \
--service-account-role-arn arn:aws:iam::111122223333:role/AmazonEKS_EBS_CSI_DriverRole
# Create a gp3 StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: gp3-csi
provisioner: ebs.csi.aws.com
parameters:
type: gp3
iops: "3000"
throughput: "125"
encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
# IAM policy for pgBackRest S3 access
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::mycompany-pg-backups",
"arn:aws:s3:::mycompany-pg-backups/*"
]
}
]
}
# Create an IRSA-enabled service account
eksctl create iamserviceaccount \
--name pgbackrest-sa \
--namespace databases \
--cluster my-cluster \
--attach-policy-arn arn:aws:iam::111122223333:policy/PGBackRestS3Policy \
--approve
In the PostgresCluster spec, reference the S3 bucket and region and set repo1-s3-key-type to web-id so that pgBackRest authenticates with the pod's service account token (via IRSA) — no explicit access keys needed.
backups:
  pgbackrest:
    global:
      repo1-s3-key-type: web-id
    repos:
    - name: repo1
      s3:
        bucket: mycompany-pg-backups
        endpoint: s3.eu-west-1.amazonaws.com
        region: eu-west-1
For multi-AZ resilience, ensure your EKS node groups span at least three availability zones and add an anti-affinity rule or topology spread constraint on topology.kubernetes.io/zone (analogous to the hostname-based rule shown earlier) to distribute PostgreSQL pods across them.
Azure AKS Deployment
Azure AKS uses managed disks for persistent storage and Azure Blob Storage for pgBackRest backups. The recommended storage class uses Premium SSD v2 or Premium LRS for database workloads.
# StorageClass for Azure Premium SSD
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: azure-premium-ssd
provisioner: disk.csi.azure.com
parameters:
skuName: Premium_LRS
kind: Managed
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# pgBackRest with Azure Blob Storage
backups:
pgbackrest:
configuration:
- secret:
name: pgbackrest-azure-creds
global:
repo1-azure-account: mystorageaccount
repo1-retention-full: "7"
repos:
- name: repo1
schedules:
full: "0 2 * * 0"
differential: "0 2 * * 1-6"
azure:
container: pg-backups
---
# Secret with Azure credentials
apiVersion: v1
kind: Secret
metadata:
name: pgbackrest-azure-creds
namespace: databases
stringData:
azure.conf: |
[global]
repo1-azure-key=BASE64_STORAGE_ACCOUNT_KEY
For Workload Identity (the Azure equivalent of AWS IRSA), configure the AKS cluster with OIDC issuer and create a federated identity credential for the pgBackRest service account. This eliminates the need for storage account keys in secrets.
GCP GKE Deployment
GKE uses Persistent Disk (pd-ssd) for storage and GCS for pgBackRest backups. GKE Workload Identity maps Kubernetes service accounts to Google Cloud service accounts for secure, keyless authentication.
# StorageClass for GKE SSD Persistent Disk
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: pd-ssd
provisioner: pd.csi.storage.gke.io
parameters:
type: pd-ssd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# pgBackRest with GCS
backups:
pgbackrest:
configuration:
- secret:
name: pgbackrest-gcs-creds
repos:
- name: repo1
gcs:
bucket: mycompany-pg-backups
---
# GCS service account key secret
apiVersion: v1
kind: Secret
metadata:
name: pgbackrest-gcs-creds
namespace: databases
stringData:
gcs-key.json: |
{
"type": "service_account",
"project_id": "my-project",
...
}
# Enable Workload Identity on GKE
gcloud container clusters update my-cluster \
--workload-pool=my-project.svc.id.goog
# Create and bind IAM service account
gcloud iam service-accounts create pgbackrest-gcs \
--display-name="pgBackRest GCS Access"
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:pgbackrest-gcs@my-project.iam.gserviceaccount.com" \
--role="roles/storage.objectAdmin"
gcloud iam service-accounts add-iam-policy-binding \
pgbackrest-gcs@my-project.iam.gserviceaccount.com \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:my-project.svc.id.goog[databases/production-db-pgbackrest]"
Multi-Region Disaster Recovery
PGO supports standby clusters for cross-region disaster recovery. A standby cluster continuously replays WAL from the primary cluster's pgBackRest repository, maintaining a warm copy that can be promoted to an independent primary in the event of a regional failure.
# Standby cluster in Region B
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: production-db-standby
namespace: databases
spec:
postgresVersion: 16
image: registry.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.6-1
standby:
enabled: true
repoName: repo1
instances:
- name: pgha1
replicas: 2
dataVolumeClaimSpec:
storageClassName: gp3-csi
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 100Gi
backups:
pgbackrest:
image: registry.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.54.1-0
configuration:
- secret:
name: pgbackrest-s3-creds
repos:
- name: repo1
s3:
bucket: mycompany-pg-backups
endpoint: s3.us-east-1.amazonaws.com
region: us-east-1
To promote the standby cluster during a disaster, set standby.enabled: false in the spec and apply the change. PGO promotes the standby to an independent primary cluster. This promotion is irreversible — you will need to rebuild the replication relationship from scratch after the original primary region recovers.
Bare Metal k3s/Rancher Deployment
Running PGO on bare metal k3s with Rancher management requires careful attention to storage and networking, since you do not have cloud-provider managed disks or load balancers.
Longhorn Storage
Longhorn is a lightweight, distributed block storage system for Kubernetes that is ideal for bare metal environments. It replicates volumes across multiple nodes for durability and supports snapshots, backups, and volume expansion.
# Install Longhorn via Helm
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn \
--namespace longhorn-system \
--create-namespace \
--set defaultSettings.defaultReplicaCount=3 \
--set defaultSettings.storageOverProvisioningPercentage=100 \
--set defaultSettings.storageMinimalAvailablePercentage=15
---
# Longhorn StorageClass for PostgreSQL
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: longhorn-pg
provisioner: driver.longhorn.io
parameters:
numberOfReplicas: "3"
staleReplicaTimeout: "2880"
fsType: ext4
dataLocality: best-effort
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain
MetalLB for Load Balancing
# Install MetalLB
helm repo add metallb https://metallb.github.io/metallb
helm install metallb metallb/metallb --namespace metallb-system --create-namespace
---
# Configure IP address pool
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: pg-pool
namespace: metallb-system
spec:
addresses:
- 192.168.1.200-192.168.1.210
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: pg-l2
namespace: metallb-system
spec:
ipAddressPools:
- pg-pool
With MetalLB configured, PGO's Services of type LoadBalancer will receive IP addresses from your pool, making the PostgreSQL primary and PgBouncer endpoints directly reachable from your network without manual port forwarding.
pgBackRest with Local PVC or NFS for Bare Metal
# For environments without S3/GCS/Azure Blob, use a PVC-based repo
backups:
pgbackrest:
repos:
- name: repo1
schedules:
full: "0 2 * * 0"
differential: "0 2 * * 1-6"
volume:
volumeClaimSpec:
storageClassName: longhorn-pg
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 200Gi
For air-gapped environments, you can also configure a secondary pgBackRest repository that ships backups to an NFS share or a MinIO instance running on-premises, providing an off-cluster backup copy without cloud dependencies.
TLS/SSL Encryption Configuration
PGO generates self-signed TLS certificates for all internal communication by default — PostgreSQL instances, replication, pgBackRest, and PgBouncer all communicate over encrypted channels without any manual certificate management. However, for production environments you typically want to use certificates signed by your organization's CA.
# Create a TLS secret with your CA-signed certificates
kubectl create secret tls production-db-tls \
--cert=server.crt \
--key=server.key \
-n databases
kubectl create secret generic production-db-ca \
--from-file=ca.crt=ca.crt \
-n databases
---
# Reference in PostgresCluster spec
spec:
customTLSSecret:
name: production-db-tls
customReplicationTLSSecret:
name: production-db-replication-tls
PGO configures PostgreSQL with ssl = on and sets the appropriate ssl_cert_file, ssl_key_file, and ssl_ca_file parameters. PgBouncer is similarly configured to require TLS for client connections and to use TLS when connecting to PostgreSQL backends.
# Enforce TLS for all client connections via pg_hba.conf
spec:
patroni:
dynamicConfiguration:
postgresql:
pg_hba:
- hostssl all all 0.0.0.0/0 scram-sha-256
- hostssl replication all 0.0.0.0/0 scram-sha-256
Custom PostgreSQL Configuration
PGO exposes PostgreSQL configuration through the Patroni dynamic configuration section. This is the recommended approach because Patroni ensures all instances maintain consistent configuration and handles restarts when required.
patroni:
dynamicConfiguration:
postgresql:
parameters:
# Memory
shared_buffers: 4GB
effective_cache_size: 12GB
work_mem: 64MB
maintenance_work_mem: 1GB
huge_pages: "try"
# WAL
wal_buffers: 128MB
min_wal_size: 2GB
max_wal_size: 8GB
checkpoint_completion_target: 0.9
wal_compression: lz4
# Query Planning
random_page_cost: 1.1
effective_io_concurrency: 200
default_statistics_target: 200
# Parallelism
max_worker_processes: 16
max_parallel_workers_per_gather: 4
max_parallel_workers: 8
max_parallel_maintenance_workers: 4
# Connections
max_connections: 200
idle_in_transaction_session_timeout: 60000
statement_timeout: 300000
# Logging
log_min_duration_statement: 250
log_checkpoints: "on"
log_connections: "on"
log_disconnections: "on"
log_lock_waits: "on"
log_temp_files: 0
log_autovacuum_min_duration: 0
# Autovacuum Tuning
autovacuum_max_workers: 5
autovacuum_naptime: 30
autovacuum_vacuum_cost_limit: 800
autovacuum_vacuum_scale_factor: 0.02
autovacuum_analyze_scale_factor: 0.01
These parameters assume a node with 16 CPU cores and 32 GB RAM dedicated to PostgreSQL. Adjust shared_buffers to roughly 25% of available RAM and effective_cache_size to roughly 75%. The WAL and checkpoint settings are tuned for write-heavy workloads — reduce max_wal_size for read-heavy systems where checkpoint frequency matters less.
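The sizing rules of thumb in this paragraph can be expressed as a small calculator (a starting-point heuristic only — the function name and the maintenance_work_mem cap are my own choices; validate against your actual workload):

```python
def tune(ram_gb, cores):
    """Starting-point parameters from the rules of thumb above:
    shared_buffers ~25% of RAM, effective_cache_size ~75%,
    maintenance_work_mem ~RAM/32 capped at 2GB (assumed cap)."""
    return {
        "shared_buffers": f"{ram_gb // 4}GB",
        "effective_cache_size": f"{ram_gb * 3 // 4}GB",
        "maintenance_work_mem": f"{min(ram_gb * 1024 // 32, 2048)}MB",
        "max_worker_processes": cores,
        "max_parallel_workers": cores // 2,
    }

# A 32 GB / 16-core node yields shared_buffers=8GB, effective_cache_size=24GB
print(tune(32, 16))
```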
User and Database Management
PGO manages PostgreSQL users and databases declaratively through the users section of the PostgresCluster spec. When you add a user, PGO creates the role in PostgreSQL, generates a random password, and stores the credentials in a Kubernetes Secret.
users:
- name: appuser
databases: ["appdb"]
options: "NOSUPERUSER NOCREATEDB NOCREATEROLE"
- name: readonly
databases: ["appdb"]
options: "NOSUPERUSER NOCREATEDB NOCREATEROLE"
- name: migrations
databases: ["appdb"]
options: "NOSUPERUSER CREATEDB"
- name: monitoring
databases: ["postgres"]
options: "NOSUPERUSER LOGIN"
---
# Access credentials from the generated secret
kubectl get secret production-db-pguser-appuser -n databases -o jsonpath='{.data.password}' | base64 -d
# The secret also contains a full connection URI
kubectl get secret production-db-pguser-appuser -n databases -o jsonpath='{.data.uri}' | base64 -d
# postgresql://appuser:PASSWORD@production-db-primary.databases.svc:5432/appdb
For granting read-only access, use the databaseInitSQL feature to run SQL at cluster creation that creates a read-only role and grants it SELECT on all tables.
# init.sql ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: init-sql-configmap
namespace: databases
data:
init.sql: |
-- Read-only role for reporting users
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly;
GRANT CONNECT ON DATABASE appdb TO readonly;
GRANT USAGE ON SCHEMA public TO readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;
-- Extensions
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
Rolling Updates and Major Version Upgrades
PGO handles minor version upgrades through rolling updates. When you change the image tag in the PostgresCluster spec to a newer patch release, PGO updates the instances one at a time, starting with replicas and finishing with the primary (which triggers a Patroni switchover to minimize downtime).
# Minor version upgrade: change the image tag
spec:
image: registry.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.7-0
Major version upgrades (e.g., PostgreSQL 15 to 16) require pg_upgrade, which PGO orchestrates through a separate PGUpgrade CRD.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PGUpgrade
metadata:
name: production-db-upgrade
namespace: databases
spec:
postgresClusterName: production-db
fromPostgresVersion: 15
toPostgresVersion: 16
image: registry.crunchydata.com/crunchydata/crunchy-upgrade:ubi8-5.7.2-0
The upgrade process shuts down the existing cluster, runs pg_upgrade --link to perform an in-place upgrade using hard links (minimizing data copying), verifies the upgrade, and starts the cluster on the new version. Always take a full backup before starting a major version upgrade and test the procedure on a non-production cluster first.
# Pre-upgrade checklist
# 1. Take a full backup
kubectl annotate postgrescluster production-db -n databases \
postgres-operator.crunchydata.com/pgbackrest-backup="$(date +%s)" \
--overwrite
# 2. Verify backup completed
kubectl exec -it production-db-pgha1-0 -n databases -- pgbackrest info --stanza=db
# 3. Shut down the cluster
kubectl patch postgrescluster production-db -n databases \
  --type merge -p '{"spec":{"shutdown":true}}'
# 4. Allow the upgrade, then apply the PGUpgrade resource
kubectl annotate postgrescluster production-db -n databases \
  postgres-operator.crunchydata.com/allow-upgrade=production-db-upgrade
kubectl apply -f pg-upgrade.yaml
# 5. Update the PostgresCluster spec with the new postgresVersion and image
# 6. Set spec.shutdown back to false to start the upgraded cluster
kubectl patch postgrescluster production-db -n databases \
  --type merge -p '{"spec":{"shutdown":false}}'
Production Tuning Recommendations
Running PGO in production requires attention to several operational areas beyond the initial deployment.
Storage Performance
- Separate data and WAL volumes: Always use
walVolumeClaimSpecto place WAL on a dedicated PVC. This prevents write-heavy WAL activity from contending with data I/O. - Use high-IOPS storage: gp3 (AWS), Premium SSD v2 (Azure), pd-ssd (GCP), or NVMe-backed Longhorn on bare metal. PostgreSQL's random I/O patterns demand low-latency storage.
- Volume expansion: Ensure your StorageClass has allowVolumeExpansion: true. PGO can expand PVCs non-disruptively on supported storage providers.
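Both recommendations combined, as a sketch for AWS (the class name, provisioner parameters, and sizes are illustrative; adapt the StorageClass to your CSI driver):

```yaml
# Illustrative expandable gp3 StorageClass (AWS EBS CSI driver)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-csi
provisioner: ebs.csi.aws.com
allowVolumeExpansion: true
parameters:
  type: gp3
---
# Instance fragment placing WAL on its own dedicated PVC
spec:
  instances:
  - name: pgha1
    walVolumeClaimSpec:
      storageClassName: gp3-csi
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
```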
Pod Disruption Budgets
PGO automatically creates PDBs for your PostgreSQL instances, but verify they are appropriate for your HA requirements.
# Check PDBs
kubectl get pdb -n databases
NAME                  MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
production-db-pgha1   1               N/A               2                     7d
Resource Requests and Limits
- Set memory requests equal to limits for database pods to avoid memory overcommit and OOM kills. (Note that the fully Guaranteed QoS class also requires CPU requests to equal limits, which trades away the bursting headroom recommended below.)
- Set CPU requests conservatively and limits higher to allow bursting during vacuum or maintenance operations.
- Monitor actual usage via pgMonitor metrics and adjust quarterly.
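The sizing advice above, expressed as an instance resources block (all values are illustrative starting points, not recommendations for your workload):

```yaml
# Memory request equals limit so memory is never overcommitted;
# CPU limit exceeds the request to allow maintenance bursts.
spec:
  instances:
  - name: pgha1
    resources:
      requests:
        memory: 8Gi   # equal to the limit: prevents OOM kills from overcommit
        cpu: "2"      # conservative steady-state request
      limits:
        memory: 8Gi
        cpu: "4"      # headroom for vacuum and maintenance operations
```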
Connection Management
- Always connect through PgBouncer, not directly to PostgreSQL.
- Set max_connections in PostgreSQL to a value that accounts for PgBouncer's pool size plus system connections (replication, monitoring, superuser).
- Use transaction pooling mode for web applications. Switch to session pooling only if your application uses prepared statements or session-level state (e.g., SET commands, temporary tables).
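The pooling mode comes down to a single PgBouncer setting, configurable through the PostgresCluster proxy section. A sketch (field layout per PGO v5; pool sizes are illustrative, so verify the config keys against your CRD version):

```yaml
# Transaction pooling configured via the PGO-managed PgBouncer
spec:
  proxy:
    pgBouncer:
      replicas: 2
      config:
        global:
          pool_mode: transaction   # switch to "session" only if the app needs session state
          default_pool_size: "20"
          max_client_conn: "500"
```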
Backup Validation
- Regularly test restores by creating a clone cluster from backup and running application-level smoke tests.
- Monitor the PgBackRestStaleBackup alert to ensure backups are completing on schedule.
- Validate PITR by restoring to specific timestamps and verifying data consistency.
# Clone cluster for backup validation
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: backup-validation
  namespace: databases
spec:
  postgresVersion: 16
  image: registry.crunchydata.com/crunchydata/crunchy-postgres:ubi8-16.6-1
  dataSource:
    postgresCluster:
      clusterName: production-db
      repoName: repo1
      options:
      - --type=time
      - --target="2026-04-12 06:00:00+00"
  instances:
  - name: pgha1
    replicas: 1
    dataVolumeClaimSpec:
      storageClassName: gp3-csi
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 100Gi
  backups:
    pgbackrest:
      repos:
      - name: repo1
        volume:
          volumeClaimSpec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 50Gi
Network Policies
# Restrict PostgreSQL traffic to known namespaces
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-network-policy
  namespace: databases
spec:
  podSelector:
    matchLabels:
      postgres-operator.crunchydata.com/cluster: production-db
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          app-access: database
    - podSelector:
        matchLabels:
          postgres-operator.crunchydata.com/cluster: production-db
    ports:
    - port: 5432
      protocol: TCP
    - port: 9187
      protocol: TCP
Monitoring Checklist
- Replication lag: Alert when any replica exceeds 100MB or 60 seconds of lag.
- Connection saturation: Alert when active connections exceed 80% of max_connections.
- Transaction rate anomalies: Baseline your TPS and alert on significant deviations.
- Disk usage: Alert at 70% and 85% thresholds with volume expansion runbooks.
- Backup freshness: Alert when the last full backup is older than your RPO window.
- Lock contention: Alert on long-held locks (> 30 seconds) that may indicate application bugs.
- Autovacuum health: Alert when tables have not been vacuumed in over 24 hours.
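Two of these checks sketched as Prometheus alerting rules. The metric names are assumptions based on pgMonitor's ccp_* naming conventions; verify them against the exporter output in your deployment before relying on these rules:

```yaml
# Illustrative Prometheus rules for replication lag and backup freshness
groups:
- name: postgres-alerts
  rules:
  - alert: PGReplicationLagBytes
    # assumed pgMonitor metric name -- confirm against your exporter
    expr: ccp_replication_lag_size_bytes > 100 * 1024 * 1024
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Replica {{ $labels.pod }} is more than 100MB behind the primary"
  - alert: PgBackRestStaleBackup
    # assumed pgMonitor metric name -- confirm against your exporter
    expr: ccp_backrest_last_full_backup_time_since_completion_seconds > 7 * 86400
    for: 30m
    labels:
      severity: critical
    annotations:
      summary: "Last full backup for {{ $labels.pod }} is older than 7 days"
```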
Disaster Recovery Runbook
A disaster recovery plan is only useful if it has been tested. The following runbook outlines the key procedures for recovering from common failure scenarios.
Single Pod Failure
Patroni and Kubernetes handle this automatically. If the primary pod crashes, Patroni promotes a replica within 10-30 seconds. Kubernetes restarts the failed pod, which rejoins as a replica.
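That 10-30 second window is governed by Patroni's lease timing, which PGO surfaces on the cluster spec. A sketch of the defaults (field names per the PGO v5 PostgresCluster CRD; verify against your installed version before tuning):

```yaml
# Shorter values detect a dead primary faster but increase the risk
# of false failovers under transient network or API-server latency.
spec:
  patroni:
    leaderLeaseDurationSeconds: 30   # how long a failed primary retains the leader lease
    syncPeriodSeconds: 10            # how often Patroni re-evaluates cluster state
```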
Single Node Failure
If a node running a PostgreSQL pod dies, Kubernetes reschedules the pod on a healthy node. The pod attaches to its existing PVC (if the storage is network-attached) or restores from backup (if local storage was used). Pod anti-affinity rules ensure the remaining instances continue serving traffic.
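The anti-affinity this relies on can be made explicit with a required rule, so the scheduler never co-locates two instances of the cluster on one node (a sketch; with a required rule, scheduling fails outright if too few nodes are available, so some operators prefer the preferred variant):

```yaml
# Spread instances across nodes: one node failure takes out at most one pod
spec:
  instances:
  - name: pgha1
    replicas: 3
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              postgres-operator.crunchydata.com/cluster: production-db
```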
Complete Cluster Loss
If the entire Kubernetes cluster is lost, deploy a new cluster, install PGO, and create a new PostgresCluster with a dataSource pointing to the backup repository. PGO restores the latest backup and replays WAL to the most recent available point.
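A bootstrap-from-repository spec looks roughly like this, using PGO's cloud-based data source. The bucket, secret name, endpoint, and repo path are illustrative placeholders; the credentials secret must contain your pgBackRest S3 keys:

```yaml
# Rebuild from the surviving backup repository after total cluster loss
spec:
  dataSource:
    pgbackrest:
      stanza: db
      configuration:
      - secret:
          name: pgbackrest-s3-creds   # placeholder: your repo credentials secret
      global:
        repo1-path: /pgbackrest/production-db/repo1
      repo:
        name: repo1
        s3:
          bucket: my-backup-bucket    # placeholder bucket
          endpoint: s3.us-east-1.amazonaws.com
          region: us-east-1
```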
Regional Failover
If the primary region is lost, promote the standby cluster by setting standby.enabled: false. Update your DNS or load balancer to point to the new primary region. Once the original region recovers, you can rebuild the standby relationship in reverse.
# Promote standby cluster
kubectl patch postgrescluster production-db-standby -n databases --type merge \
  -p '{"spec":{"standby":{"enabled":false}}}'
# Verify the cluster is now an independent primary
kubectl exec -it production-db-standby-pgha1-0 -n databases -- patronictl list
Operational Commands Quick Reference
# Cluster status
kubectl get postgrescluster -n databases
kubectl describe postgrescluster production-db -n databases
# Patroni status
kubectl exec -it production-db-pgha1-0 -n databases -- patronictl list
kubectl exec -it production-db-pgha1-0 -n databases -- patronictl history
# Connect to PostgreSQL
kubectl exec -it production-db-pgha1-0 -n databases -- psql -U postgres
# Backup info
kubectl exec -it production-db-pgha1-0 -n databases -- pgbackrest info --stanza=db
# Manual backup
kubectl annotate postgrescluster production-db -n databases \
  postgres-operator.crunchydata.com/pgbackrest-backup="$(date +%s)" --overwrite
# Restart PostgreSQL (rolling)
kubectl annotate postgrescluster production-db -n databases \
  postgres-operator.crunchydata.com/restart="$(date +%s)" --overwrite
# Logs
kubectl logs production-db-pgha1-0 -n databases -c database
kubectl logs production-db-pgha1-0 -n databases -c pgbackrest
kubectl logs production-db-pgha1-0 -n databases -c exporter
# PgBouncer status
kubectl exec -it $(kubectl get pod -l postgres-operator.crunchydata.com/role=pgbouncer -n databases -o name | head -1) -n databases -- psql -U pgbouncer pgbouncer -c "SHOW POOLS;"
# Scale replicas
kubectl patch postgrescluster production-db -n databases --type merge \
  -p '{"spec":{"instances":[{"name":"pgha1","replicas":5}]}}'
Conclusion
Crunchy Data PGO transforms PostgreSQL on Kubernetes from an operational burden into a manageable, declarative system. The PostgresCluster CRD captures the entire deployment — instances, replication, backup, pooling, monitoring, TLS — in a single version-controlled resource. Patroni provides battle-tested automatic failover. pgBackRest delivers enterprise-grade backup with point-in-time recovery to any cloud object store. PgBouncer handles the connection pooling that PostgreSQL's process-per-connection model demands at scale. And pgMonitor feeds the metrics you need into Prometheus and Grafana for operational visibility.
The deployment patterns across AWS EKS, Azure AKS, GCP GKE, and bare metal k3s share the same core PostgresCluster spec — what changes is the storage class, backup repository configuration, and networking layer. This consistency is the real value of an operator-based approach: your team learns one tool, one operational model, and one set of runbooks that work everywhere.
Start with a three-instance cluster, PgBouncer in transaction pooling mode, a weekly full plus daily differential backup schedule, and the core Prometheus alerts. Validate your backup restore procedure on day one — not when you first need it. Expand to multi-region standby clusters, synchronous replication, and advanced tuning as your availability requirements and operational maturity grow. The operator handles the mechanics; your responsibility is to understand the architecture well enough to make the right tradeoffs for your workload.