Lately I have been working with PostgreSQL and it’s streaming replication functionality. This blog entry will provide a minimal example of how to set it up using PostgresSQL 9.3.
The basic work flow is to configure a PostgreSQL instance, called the master, to stream its change log over the network to a PostgreSQL standby instance. Next is to get a backup of the master data to the standby instance. Followed by setting up the standby instance to receive the replication stream and then start up the standby. Replication should commence once the standby node starts.
An assumption is made that the reader know where the Postgres config files are located relative to $PGHOME.
1. Configure the Master for Streaming Replication
To set up streaming replication on the master, append the following lines to the master’s postgresql.conf file.
archive_mode = on archive_command = 'rsync %p /data/efmmstr/walArchive/%f' max_wal_senders = 5 wal_keep_segments = 7 wal_level = hot_standby # ignored if master hot_standby = on
The archive command will need to be modified to a path that makes sense on your local server. Note that the archive_command itself can vary, the use of rsync is just a my preference.
1.2. Modify the master’s pg_hba.conf file to allow for replication connections.
Appending a line to the master’s pg_hba.conf file similar the one below is needed so that the standby node can connect to the master. Obviously adjust the network addresses and user name to reflect your set up. The example below allows a user named postgres to connect from any host on the 172.16.6 network indicated by the 24 bit subnet mask.
host replication postgres 172.16.6.0/24 md5
2. Copy the master database’s base directory to the standby
To give the standby node an initial database to start with, it is common to run pg_basebackup on the standby instance which will copy a point in time snapshot of the database from the master instance. Since pg_basebackup uses the replication protocol, when this runs successfully, you will know that connectivity described above is correct.
Also, make sure you have connectivity thorough any firewalls etc.
Here is the step; ( on the standby)
postgres@standby> pg_basebackup -U postgres -h efmmstr -D /data/efmstdby/test
One thing to note is that there is a ~/.pgpass file set up that provides the password for the above command. You can provide the password on the command line above if you wish.
3. Set up standby instance.
Once the master’s data has been copied over to the standby node, append the following lines to the postgresql.conf file found in the directory copied by the pg_basebackup step. In the above example the configuration file is in /data/efmstdby/test which was the target of the pg_basebackup command.
archive_mode = on archive_command = 'rsync %p /data/efmstdby/walArchive/%f' max_wal_senders = 5 wal_keep_segments = 7 wal_level = hot_standby # ignored if master hot_standby = on
These are the same lines as on the master except for the archive_command path and should already be present in the file after the pg_basebackup. Again, your setup will vary as to the command itself and the path which may be different then the master. Otherwise these values will work.
3.1 Create a Recovery File on the Standby
Continuing to set up the standby instance you will need to create a recovery.conf file and put it in the directory where the postgresql.conf is located. Typically where the PGDATA environment variable is set. In this example it would be in /data/efmstdby/test. Below is a working example.
standby_mode = on recovery_target_timeline = 'latest' primary_conninfo = 'host = efmmstr port = 5432' trigger_file = '/var/tmp/efmDemo1_trigger'
You will need to alter the primary_conninfo and the trigger_file to reflect your set up. The trigger file is not necessary for this example but is left over from previous work. The trigger file can be used to promote the standby node to a master.
Verification
Assuming no firewall issues etc., you should be able to start the standby and have the streaming replication protocol apply any changes made to the master be replicated to the standby. Note that the entire Postgres instance will be replicated. (All tables, all changes)
To verify the master is master:
psql@ # select pg_is_in_recovery(); pg_is_in_recovery ------------------- f
This returns false if the instance is not in recovery mode and is therefore the master.
On the standby the function returns true:
psql@efmstdby # select pg_is_in_recovery(); pg_is_in_recovery ------------------- t
Lastly:
On the master look at the xlog location:
psql@ # SELECT pg_current_xlog_location(); pg_current_xlog_location -------------------------- 0/D002B60
On the standby look at the last xlog received.
psql@efmstdby # SELECT pg_last_xlog_receive_location(); pg_last_xlog_receive_location ------------------------------- 0/D002B60
If the master and the standby are in sync these values should be the same.
Another way of looking at this on linux is to run ‘ps gx’ on the master and the slave. Look for the wal_sender on the master and the wal_receiver on the standby. You typically will see the xlog location reported there too.
Fine. -Enjoy.