CloudNative-PG: running PostgreSQL in K8s

Sergio Rua · Digitalis.io Blog · May 12, 2023

Introduction

I’ve been working for the last 6 months on a pretty cool project requiring huge amounts of data running across several Apache Cassandra clusters with AxonOps. Some of the databases are running in Kubernetes and others outside of it, whilst all the clients are in K8s.

There is one thing distributed databases such as Apache Cassandra don’t do well: counters. Imagine things like click counts, votes, views, etc. Apache Cassandra does support counters, but they’re not free of problems. My recommendation: avoid them if you can.

For this reason, I was tasked with running a PostgreSQL database in Kubernetes.

But running data applications in Kubernetes is not always easy. You usually need a good operator to help manage the database by performing critical operations such as failover and backups.

I have already used the CrunchyData and StackGres operators. They’re both good but not perfect. One of my colleagues suggested having a look at CloudNative-PG, so I put it to the test.

CloudNativePG is an open source operator designed to manage PostgreSQL workloads on any supported Kubernetes cluster running in private, public, hybrid, or multi-cloud environments. CloudNativePG adheres to DevOps principles and concepts such as declarative configuration and immutable infrastructure.

What’s to like

The first thing I noticed is that they use immutable containers. I really like this feature because, when dealing with data, security must be the number one priority. Also, TLS is used by default with self-generated certificates, and you can bring your own with cert-manager.
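
A rough sketch of what bringing your own certificates looks like, based on my reading of the docs (the secret names are made up and would typically be maintained by cert-manager):

spec:
  certificates:
    # Server certificate and key for the PostgreSQL endpoint
    serverTLSSecret: pg001-server-tls
    # CA that clients use to verify the server certificate
    serverCASecret: pg001-server-ca
    # CA used to verify client certificates (e.g. streaming replication)
    clientCASecret: pg001-client-ca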

It uses Barman for backups and restores, and it supports PITR (point-in-time recovery). The configuration is very easy: it just needs object storage to store the backups (e.g. S3, Google Cloud Storage, Azure, MinIO, etc.).

  backup:
    barmanObjectStore:
      destinationPath: s3://cloudnative-pg-clusters/pg001
      s3Credentials:
        accessKeyId:
          name: backups-s3
          key: AWS_ACCESS_KEY_ID
        secretAccessKey:
          name: backups-s3
          key: AWS_SECRET_ACCESS_KEY
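
For PITR, recovery is itself declarative: you bootstrap a new cluster from the object store and give it a target time. This is only a sketch based on my reading of the docs, reusing the names from the backup section (the timestamp is made up):

spec:
  bootstrap:
    recovery:
      source: pg001
      recoveryTarget:
        # Recover up to this point in time
        targetTime: "2023-05-11 08:00:00+00"
  externalClusters:
    - name: pg001
      barmanObjectStore:
        destinationPath: s3://cloudnative-pg-clusters/pg001
        # s3Credentials as in the backup section above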

You can create replica clusters for Disaster Recovery with just a couple of configuration lines.
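
As an illustration, a replica cluster in a second Kubernetes cluster is a normal Cluster resource with a replica stanza pointing at the primary’s backup object store. A sketch only, carrying over the names from the example above:

spec:
  instances: 3
  replica:
    # This cluster follows the designated primary instead of accepting writes
    enabled: true
    source: pg001
  bootstrap:
    recovery:
      source: pg001
  externalClusters:
    - name: pg001
      barmanObjectStore:
        destinationPath: s3://cloudnative-pg-clusters/pg001
        # s3Credentials as in the backup section above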

When talking about data, you always need to make sure it is replicated to at least one more location. If you are in the cloud, the last thing you need is for a region to disappear (in fact, I experienced this first-hand during the Great Fire of London, when we had to fail over to Europe).

https://www.reuters.com/technology/google-cloud-data-center-london-faces-outage-uks-hottest-day-2022-07-19/

The CloudNativePG operator allows you to create synchronous replication constraints to ensure the synchronous replicas are spread across different availability zones (AZs).

spec:
  instances: 3
  postgresql:
    syncReplicaElectionConstraint:
      enabled: true
      nodeLabelsAntiAffinity:
        - topology.kubernetes.io/zone

The default is to use asynchronous replication but you can also enable synchronous replication if this is what you need.
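
Enabling synchronous replication is, as far as I can tell, a couple of fields on the cluster spec (a minimal sketch; the values are just an example):

spec:
  instances: 3
  # Wait for at least one, and at most one, synchronous standby on commit
  minSyncReplicas: 1
  maxSyncReplicas: 1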

I really like the functionality to import databases. This is very useful when you have throwaway environments such as development and you want to quickly create new ones with all the data required in place.
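
The import is declared at bootstrap time from an external cluster. Something along these lines (a sketch; the host, secret names and database name are made up):

spec:
  bootstrap:
    initdb:
      import:
        # Import a single application database and its owner from the source
        type: microservice
        databases:
          - app
        source:
          externalCluster: production-db
  externalClusters:
    - name: production-db
      connectionParameters:
        host: prod-postgres.example.com
        user: postgres
        dbname: postgres
      password:
        name: production-db-superuser
        key: password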

The cnpg plugin for kubectl is the thing that brings all this great functionality together, allowing you to perform operations such as backups, fencing and rolling restarts.

~$ kubectl cnpg backup -n pg001 pg001
~$ kubectl get backup/pg001-20230511084800 -n pg001
NAME                   AGE   CLUSTER   PHASE       ERROR
pg001-20230511084800   1m    pg001     completed
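
Fencing and rolling restarts are plugin subcommands too. Roughly like this, from memory (check kubectl cnpg --help for the exact syntax):

~$ kubectl cnpg status -n pg001 pg001
~$ kubectl cnpg fencing on -n pg001 pg001 1
~$ kubectl cnpg fencing off -n pg001 pg001 1
~$ kubectl cnpg restart -n pg001 pg001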

Once you have your cluster running, you need to be alerted when there is an issue. This means setting up some sort of monitoring. I have always used postgres_exporter for this but, happily, CloudNativePG comes with an all-singing, all-dancing configuration built in and you don’t need anything additional.
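
Each instance exposes a metrics endpoint out of the box; if you run the Prometheus operator, scraping it is a one-liner in the cluster spec (a sketch, assuming the PodMonitor CRD is available):

spec:
  monitoring:
    # Ask the operator to create a PodMonitor for this cluster
    enablePodMonitor: true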

The project provides a nicely done Grafana dashboard and some sample Prometheus rules, all you need to get started.

What’s missing

I only found one missing feature I’d like to have, though it’s not a big issue. The operator does not have the functionality to create an ingress endpoint to allow access to the DB from outside K8s.

This is required, for example, if you have two Kubernetes clusters, one running the primary DB and the other the replica.

The documentation provides some clues as to how to achieve this using the Nginx ingress. I’ve done it instead using Traefik with SNI routing, similar to what I do with Kafka, and it works well.
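
For the curious, the idea is TLS passthrough routed on SNI. A trimmed-down sketch of what such a setup can look like, not my exact config (the entry point, hostname and CRD API group are examples and may differ for your Traefik version):

apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: pg001-external
  namespace: pg001
spec:
  entryPoints:
    - postgres              # a TCP entry point defined in Traefik
  routes:
    - match: HostSNI(`pg001.example.com`)
      services:
        - name: pg001-rw    # read-write service created by the operator
          port: 5432
  tls:
    # Hand the TLS session straight through to PostgreSQL
    passthrough: true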

It would be good, though, if this were part of the operator, as Strimzi does for Kafka.

Conclusion

It’s very early days for me: I’ve only been running the cluster for a few days, but so far I haven’t found any major issues apart from a bug in the Prometheus replication lag metrics.

I ran through a basic OAT (Operational Acceptance Test), similar to what we do for our customers before we complete a deployment, and it passed all the tests.

Overall, it looks like a powerful yet easy-to-use operator. I’m impressed.


I’m Principal DevOps at Digitalis, working with many customers, managing and advising on Kubernetes, Kafka, Cassandra, Elasticsearch and other cool technologies.