
Commit 2be3bb2

Merge pull request #21 from gocardless/lawrence-improve-readme
Update README before open-sourcing
2 parents 2a7070f + 2cdea1d commit 2be3bb2

3 files changed: +311 -14 lines changed

README.md

Lines changed: 303 additions & 9 deletions

# pgsql-cluster-manager [![CircleCI](https://circleci.com/gh/gocardless/pgsql-cluster-manager.svg?style=svg&circle-token=38c8f4dc817216aa6a02b3bf67435fe2f1d72189)](https://circleci.com/gh/gocardless/pgsql-cluster-manager)

`pgsql-cluster-manager` extends a standard highly-available Postgres setup
(managed by [Corosync](http://corosync.github.io/) and
[Pacemaker](http://www.linux-ha.org/wiki/Pacemaker)), enabling its use in cloud
environments where using floating IPs to denote the primary node is difficult
or impossible. In addition, `pgsql-cluster-manager` provides the ability to run
zero-downtime migrations of the Postgres primary with a simple API trigger.

See [Playground](#playground) for how to start a Dockerised three node Postgres
cluster with `pgsql-cluster-manager`.

- [Overview](#overview)
- [Playground](#playground)
- [Node Roles](#node-roles)
- [Postgres Nodes](#postgres-nodes)
- [App Nodes](#app-nodes)
- [Zero-Downtime Migrations](#zero-downtime-migrations)
- [Configuration](#configuration)
- [Pacemaker](#pacemaker)
- [PgBouncer](#pgbouncer)
- [Development](#development)
- [CircleCI](#circleci)
- [Releasing](#releasing)

## Overview

GoCardless runs a highly available Postgres cluster using
[Corosync](http://corosync.github.io/) and
[Pacemaker](http://www.linux-ha.org/wiki/Pacemaker). Corosync provides an
underlying quorum mechanism, while pacemaker provides the ability to register
plugins that can manage arbitrary services, detecting and recovering from node
and service-level failures.

The typical Postgres setup with Corosync & Pacemaker uses a floating IP attached
to the Postgres primary node. Clients connect to this IP, and during failover
the IP is moved to the new primary. Managing portable IPs in cloud providers
such as AWS and GCP is more difficult than in a classic data center, and so we
built `pgsql-cluster-manager` to adapt our cluster for these environments.

`pgsql-cluster-manager` makes use of [etcd](https://github.com/coreos/etcd) to
store cluster configuration, which can then be used by clients to connect to the
appropriate node. We can view `pgsql-cluster-manager` as three distinct services
which each conceptually 'manage' different components:

- `cluster` extracts cluster state from pacemaker and pushes it to etcd
- `proxy` ensures our Postgres proxy (PgBouncer) is reloaded with the current
  primary IP
- `migration` controls a zero-downtime migration flow
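
To make this concrete, here is a minimal Go sketch of how a client could read
the current primary's IP address out of etcd. The endpoint and the
`/postgres/master` key match the playground defaults, but both are assumptions
that depend on how you configure `pgsql-cluster-manager`:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	// Connect to the local etcd member (endpoint is an assumption).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	// The cluster service pushes the primary's IP address to this key.
	resp, err := cli.Get(ctx, "/postgres/master")
	if err != nil {
		log.Fatal(err)
	}
	for _, kv := range resp.Kvs {
		fmt.Printf("current Postgres primary: %s\n", kv.Value)
	}
}
```
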
### Playground

We have created a Dockerised sandbox environment that boots a three node
Postgres cluster with the `pgsql-cluster-manager` services installed. We
strongly recommend playing around in this environment to develop an
understanding of how this setup works and to simulate failure situations
(network partitions, node crashes, etc).

**It also helps to have this playground running while reading through the
README, in order to try out the commands you see along the way.**

First install [Docker](https://docker.io/) and Golang >=1.9, then run:

```
# Clone into your GOPATH
$ git clone https://github.com/gocardless/pgsql-cluster-manager
$ cd pgsql-cluster-manager
$ make build-linux

$ cd docker/postgres-member && ./start
Sending build context to Docker daemon 4.332 MB
Step 1/16 : FROM gocardless/pgsql-cluster-manager
...

root@pg01:/# crm_mon -Afr -1

Node Attributes:
* Node pg01:
    + Postgresql-data-status : STREAMING|SYNC
    + Postgresql-status : HS:sync
    + master-Postgresql : 100
* Node pg02:
    + Postgresql-data-status : STREAMING|POTENTIAL
    + Postgresql-status : HS:potential
    + master-Postgresql : -INFINITY
* Node pg03:
    + Postgresql-data-status : LATEST
    + Postgresql-master-baseline : 0000000002000090
    + Postgresql-status : PRI
    + master-Postgresql : 1000

root@pg01:/# ping pg03 -c1 | head -n1
PING pg03 (172.17.0.4) 56(84) bytes of data.

root@pg01:/# ETCDCTL_API=3 etcdctl get --prefix /
/postgres/master
172.17.0.4
```

The [start](docker/postgres-member/start) script will boot three Postgres nodes
with the appropriate configuration, and will start a full Postgres cluster. The
script (for convenience) will enter you into a docker shell in `pg01`.
Connecting to any of the other containers can be achieved with `docker exec -it
pg0X /bin/bash`.

### Node Roles

The `pgsql-cluster-manager` services are expected to run on two types of
machine: the nodes that are members of the Postgres cluster, and the machines
that will host the applications which connect to the cluster.

![Two node types, Postgres and App machines](res/node_roles.svg)

To explain how this setup works, we'll use an example of three machines (`pg01`,
`pg02`, `pg03`) to run the Postgres cluster and one machine (`app01`) to run our
client application. To match a typical production environment, let's imagine we
want to run a docker container on `app01` and have that container connect to our
Postgres cluster, while being resilient to Postgres failover.

It's worth noting that our playground configures only nodes of the Postgres
type, as this is sufficient to test out and play with the cluster. In production
you'd run app nodes so that applications can connect to the local PgBouncer,
which in turn knows how to route to the primary.

For playing around, it's totally fine to connect to one of the cluster nodes'
PgBouncers directly from your host machine.

#### Postgres Nodes

In this hypothetical world we've provisioned our Postgres boxes with corosync,
pacemaker and Postgres, and additionally the following services:

- [PgBouncer](https://pgbouncer.github.io/) for connection pooling and proxying
  to the current primary
- [etcd](https://github.com/coreos/etcd) as a queryable store of cluster state,
  connecting to provide a three node etcd cluster

We then run the `cluster` service as a daemon, which will continually query
pacemaker to pull the current Postgres primary IP address and push this value to
etcd. Once we're pushing this value to etcd, we can use the `proxy` service to
subscribe to changes and update the local PgBouncer with the new value. We do
this by provisioning a PgBouncer [configuration template
file](docker/postgres-member/pgbouncer/pgbouncer.ini.template) that looks like
the following:

```
# /etc/pgbouncer/pgbouncer.ini.template

[databases]
postgres = host={{.Host}} pool_size=10
```

Whenever the `cluster` service pushes a new IP address to etcd, the `proxy`
service will render this template and replace any `{{.Host}}` placeholder with
the latest Postgres primary IP address, finally reloading PgBouncer to direct
connections at the new primary.
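
The sketch below shows roughly what this watch-and-render loop looks like in Go.
It is illustrative only: the etcd endpoint, key and file paths are assumptions,
and the final step of reloading PgBouncer is left as a comment:

```go
package main

import (
	"context"
	"log"
	"os"
	"text/template"
	"time"

	"github.com/coreos/etcd/clientv3"
)

// render writes the PgBouncer config from the template, substituting the
// latest primary IP for the {{.Host}} placeholder.
func render(host string) error {
	tmpl, err := template.ParseFiles("/etc/pgbouncer/pgbouncer.ini.template")
	if err != nil {
		return err
	}
	out, err := os.Create("/etc/pgbouncer/pgbouncer.ini")
	if err != nil {
		return err
	}
	defer out.Close()
	return tmpl.Execute(out, struct{ Host string }{Host: host})
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Re-render the config whenever the primary IP key changes. A real
	// implementation would also tell PgBouncer to RELOAD afterwards.
	for resp := range cli.Watch(context.Background(), "/postgres/master") {
		for _, ev := range resp.Events {
			if err := render(string(ev.Kv.Value)); err != nil {
				log.Printf("failed to render PgBouncer config: %v", err)
			}
		}
	}
}
```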

We can verify that `cluster` is pushing the IP address by using `etcdctl` to
inspect the contents of our etcd cluster. We should find the current Postgres
primary IP address has been pushed to the key we have configured for
`pgsql-cluster-manager`:

```
root@pg01:/$ ETCDCTL_API=3 etcdctl get --prefix /
/postgres/master
172.17.0.2
```

#### App Nodes

We now have the Postgres nodes running PgBouncer proxies that live-update their
configuration to point connections to the latest Postgres primary. Our aim now
is to have app clients inside docker containers connect to our Postgres cluster
without having to introduce routing decisions into the client code.

To do this, we install PgBouncer onto `app01` and bind it to the host's private
interface. We then allow traffic from the docker network interface to the
private interface on the host, so that containers can communicate with the
PgBouncer on the host.

Finally we configure `app01`'s PgBouncer with a configuration template as we did
with the Postgres machines, and run the `proxy` service to continually update
PgBouncer to point at the latest primary. Containers then connect via the docker
host IP to PgBouncer, which will transparently direct connections to the correct
Postgres node.

```sh
root@app01:/$ cat <<EOF >/etc/pgbouncer/pgbouncer.ini.template
[databases]
postgres = host={{.Host}}
EOF

root@app01:/$ service pgsql-cluster-manager-proxy start
pgsql-cluster-manager-proxy start/running, process 6997

root@app01:/$ service pgbouncer start
* Starting PgBouncer pgbouncer
...done.

root@app01:/$ tail /var/log/pgsql-cluster-manager/proxy.log | grep HostChanger
{"handler":"*pgbouncer.HostChanger","key":"/master","level":"info","message":"Triggering handler with initial etcd key value","timestamp":"2017-12-03T17:49:03+0000","value":"172.17.0.2"}

root@app01:/$ tail /var/log/postgresql/pgbouncer.log | grep "RELOAD"
2017-12-03 17:49:03.167 16888 LOG RELOAD command issued

# Attempt to connect via the docker bridge IP
root@app01:/$ docker run -it --rm jbergknoff/postgresql-client postgresql://postgres@172.17.0.1:6432/postgres
Password:
psql (9.6.5, server 9.4.14)
Type "help" for help.

postgres=#
```
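
From inside a container, the application now treats PgBouncer on the docker host
as if it were Postgres itself. A minimal Go sketch of such a client using
[lib/pq](https://github.com/lib/pq) follows; the bridge IP, credentials and
`sslmode` setting are assumptions for the playground:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // Postgres driver
)

func main() {
	// Connect to PgBouncer on the docker bridge IP; PgBouncer forwards the
	// connection to whichever node is currently the Postgres primary.
	db, err := sql.Open("postgres",
		"postgres://postgres:password@172.17.0.1:6432/postgres?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var version string
	if err := db.QueryRow("SELECT version()").Scan(&version); err != nil {
		log.Fatal(err)
	}
	fmt.Println(version)
}
```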

### Zero-Downtime Migrations

It's inevitable over the lifetime of a database cluster that machines will need
upgrading and services will need restarting. It's not acceptable for such
routine tasks to require downtime, so `pgsql-cluster-manager` provides an API to
trigger migrations of the Postgres primary without disrupting database clients.

This API is served by the supervise `migration` service, which should be run on
all the Postgres nodes participating in the cluster. It's important to note that
this flow is only supported when all database clients are using PgBouncer
transaction pools in order to support pausing connections. Any clients that use
session pools will need to be turned off for the duration of the migration.

The migration flow is as follows:

1. Acquire lock in etcd (ensuring only one migration takes place at a time)
2. Pause all PgBouncer pools on Postgres nodes (see the sketch below)
3. Instruct Pacemaker to perform migration of the primary to the sync node
4. Once the sync node is serving traffic as a primary, resume PgBouncer pools
5. Release etcd lock
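
Steps 2 and 4 correspond to PgBouncer's `PAUSE` and `RESUME` admin commands,
issued against the special `pgbouncer` admin database on each node. The Go
sketch below shows the basic shape of that interaction; the socket directory,
port and user are assumptions, and the real `migration` service adds the health
checks, timeouts and multi-node coordination described elsewhere in this README:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // also used to speak to PgBouncer's admin console
)

func main() {
	// Connect to PgBouncer's admin console over the unix socket. This relies
	// on ignore_startup_parameters = extra_float_digits (see the PgBouncer
	// section below).
	admin, err := sql.Open("postgres",
		"host=/var/run/postgresql port=6432 dbname=pgbouncer user=postgres sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer admin.Close()

	// Step 2: wait for in-flight transactions to finish, then hold new ones.
	if _, err := admin.Exec("PAUSE;"); err != nil {
		log.Fatal(err)
	}

	// Step 3 happens here: pacemaker promotes the sync node, and the new
	// primary IP appears in etcd.

	// Step 4: let the queued connections proceed against the new primary.
	if _, err := admin.Exec("RESUME;"); err != nil {
		log.Fatal(err)
	}
}
```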

As the primary moves to a new machine, the supervise `cluster` service will push
the new IP address to etcd. The supervise `proxy` services running on the
Postgres and App nodes will detect this change and update PgBouncer to point at
the new primary IP, while the migration flow will detect this change in step (4)
and resume PgBouncer to allow queries to start once more.

```
root@pg01:/$ pgsql-cluster-manager --config-file /etc/pgsql-cluster-manager/config.toml migrate
INFO[0000] Loaded config configFile=/etc/pgsql-cluster-manager/config.toml
INFO[0000] Health checking clients
INFO[0000] Acquiring etcd migration lock
INFO[0000] Pausing all clients
INFO[0000] Running crm resource migrate
INFO[0000] Watching for etcd key to update with master IP address key=/master target=172.17.0.2
INFO[0006] Successfully migrated! master=pg01
INFO[0006] Running crm resource unmigrate
INFO[0007] Releasing etcd migration lock
```

This flow is subject to several timeouts that should be tuned to match your
pacemaker cluster settings. See `pgsql-cluster-manager migrate --help` for an
explanation of each timeout and how it affects the migration. This flow can be
run from anywhere that has access to etcd and the Postgres nodes' migration API.

The Postgres node that was originally the primary is now turned off, and won't
rejoin the cluster until the lockfile is removed. You can bring the node back
into the cluster by doing the following:

```
root@pg02:/$ rm /var/lib/postgresql/9.4/tmp/PGSQL.lock
root@pg02:/$ crm resource cleanup msPostgresql
```

## Configuration

We recommend configuring `pgsql-cluster-manager` using a TOML configuration
file. You can generate a sample configuration file with the default values for
each parameter by running the following:

```
$ pgsql-cluster-manager show-config >/etc/pgsql-cluster-manager/config.toml
```

### Pacemaker

The test environment is a good basis for configuring pacemaker with the pgsql
resource agent, and gives an example of cluster configuration that will
bootstrap a Postgres cluster.

We load pacemaker configuration in tests from the `configure_pacemaker` function
in [start-cluster.bash](docker/postgres-member/start-cluster.bash), though we
advise thinking carefully about what appropriate timeouts might be for your
setup.

The [pgsql](docker/postgres-member/resource_agents/pgsql) resource agent has
been modified to remove the concept of a primary floating IP. Anyone looking to
use this cluster without a floating IP will need to use the modified agent from
this repo, which renders the primary's actual IP directly into Postgres'
`recovery.conf` and reboots database replicas when the primary changes
(required, given Postgres cannot live reload `recovery.conf` changes).

### PgBouncer

We use [lib/pq](https://github.com/lib/pq) to connect to PgBouncer over the unix
socket. Unfortunately lib/pq has [issues](https://github.com/lib/pq/issues/475)
when first establishing a connection to PgBouncer, as it attempts to set the
configuration parameter `extra_float_digits`, which PgBouncer doesn't
recognise, and therefore will reject the connection.

To avoid this, make sure all configuration templates include the following:

```
...

# Connecting using the golang lib/pq wrapper requires that we ignore
# the 'extra_float_digits' startup parameter, otherwise PgBouncer will
# close the connection.
#
# https://github.com/lib/pq/issues/475
ignore_startup_parameters = extra_float_digits
```
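
For reference, this is the kind of connection that triggers the problem: a short
Go sketch that connects to PgBouncer over the unix socket with lib/pq (the
socket directory, port and user are assumptions). Without
`ignore_startup_parameters = extra_float_digits`, the `Ping` below is rejected
by PgBouncer:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	// lib/pq sends extra_float_digits as a startup parameter; PgBouncer must
	// be told to ignore it (see the config snippet above) or it will close
	// the connection during startup.
	db, err := sql.Open("postgres",
		"host=/var/run/postgresql port=6432 dbname=postgres user=postgres sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := db.Ping(); err != nil {
		log.Fatalf("could not connect via PgBouncer: %v", err)
	}
	log.Println("connected to Postgres through PgBouncer's unix socket")
}
```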

## Development

### CircleCI

We build a custom Docker image for CircleCI builds that is hosted at
gocardless/pgsql-cluster-manager-circleci on Docker Hub. The Dockerfile lives at ...

To publish a new version of the Docker image, run:

```
make publish-circleci-dockerfile
```

### Releasing

We use [goreleaser](https://github.com/goreleaser/goreleaser) to create releases
for `pgsql-cluster-manager`. This enables us to effortlessly create new releases
with all associated artifacts to various destinations, such as GitHub and
homebrew taps.

docker/postgres-member/start

Lines changed: 4 additions & 5 deletions

@@ -1,6 +1,8 @@
 #!/usr/bin/env bash
-# Starts a cluster, booting into a bash session inside pg01 that on exit will
-# kill and clean-up the containers.
+# Starts a cluster, booting into a bash session inside pg01. Logging into other
+# machines can be done using docker exec:
+#
+# $ docker exec -it pg0X /bin/bash

 # run-container => prints <container-id>
 function run-container() {
@@ -28,6 +30,3 @@ docker exec --detach pg02 /bin/start-cluster "$PG01" "$PG02" "$PG03"
 docker exec --detach pg03 /bin/start-cluster "$PG01" "$PG02" "$PG03"

 docker exec -it pg01 /bin/bash
-
-docker kill pg01 pg02 pg03
-docker rm pg01 pg02 pg03

res/node_roles.svg

Lines changed: 4 additions & 0 deletions
