Lately I have been working with PostgreSQL and it’s streaming replication functionality. This blog entry will provide a minimal example of how to set it up using PostgresSQL 9.3.

The basic work flow is to configure a PostgreSQL instance, called the master, to stream its change log over the network to a PostgreSQL standby instance. Next is to get a backup of the master data to the standby instance. Followed by setting up the standby instance to receive the replication stream and then start up the standby. Replication should commence once the standby node starts.

An assumption is made that the reader know where the Postgres config files are located relative to $PGHOME.

1. Configure the Master for Streaming Replication

To set up streaming replication on the master, append the following lines to the master’s postgresql.conf file.

archive_mode = on
archive_command = 'rsync %p /data/efmmstr/walArchive/%f'
max_wal_senders = 5
wal_keep_segments = 7
wal_level = hot_standby
# ignored if master
hot_standby = on

The archive command will need to be modified to a path that makes sense on your local server. Note that the archive_command itself can vary, the use of rsync is just a my preference.

1.2. Modify the master’s pg_hba.conf file to allow for replication connections.

Appending a line to the master’s pg_hba.conf file similar the one below is needed so that the standby node can connect to the master. Obviously adjust the network addresses and user name to reflect your set up. The example below allows a user named postgres to connect from any host on the 172.16.6 network indicated by the 24 bit subnet mask.

host replication postgres 172.16.6.0/24 md5

2. Copy the master database’s base directory to the standby

To give the standby node an initial database to start with, it is common to run pg_basebackup on the standby instance which will copy a point in time snapshot of the database from the master instance. Since pg_basebackup uses the replication protocol, when this runs successfully, you will know that connectivity described above is correct.

Also, make sure you have connectivity thorough any firewalls etc.

Here is the step; ( on the standby)

postgres@standby>  pg_basebackup -U postgres -h efmmstr -D /data/efmstdby/test

One thing to note is that there is a ~/.pgpass file set up that provides the password for the above command. You can provide the password on the command line above if you wish.

3. Set up standby instance.

Once the master’s data has been copied over to the standby node, append the following lines to the postgresql.conf file found in the directory copied by the pg_basebackup step. In the above example the configuration file is in /data/efmstdby/test which was the target of the pg_basebackup command.

archive_mode = on
archive_command = 'rsync %p /data/efmstdby/walArchive/%f'
max_wal_senders = 5
wal_keep_segments = 7
wal_level = hot_standby
# ignored if master
hot_standby = on

These are the same lines as on the master except for the archive_command path and should already be present in the file after the pg_basebackup. Again, your setup will vary as to the command itself and the path which may be different then the master. Otherwise these values will work.

3.1 Create a Recovery File on the Standby

Continuing to set up the standby instance you will need to create a recovery.conf file and put it in the directory where the postgresql.conf is located. Typically where the PGDATA environment variable is set. In this example it would be in /data/efmstdby/test. Below is a working example.

standby_mode = on
recovery_target_timeline = 'latest'
primary_conninfo = 'host = efmmstr port = 5432'
trigger_file = '/var/tmp/efmDemo1_trigger'

You will need to alter the primary_conninfo and the trigger_file to reflect your set up. The trigger file is not necessary for this example but is left over from previous work. The trigger file can be used to promote the standby node to a master.

Verification

Assuming no firewall issues etc., you should be able to start the standby and have the streaming replication protocol apply any changes made to the master be replicated to the standby. Note that the entire Postgres instance will be replicated. (All tables, all changes)

To verify the master is master:

psql@ # select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f

This returns false if the instance is not in recovery mode and is therefore the master.

On the standby the function returns true:

psql@efmstdby # select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t

Lastly:

On the master look at the xlog location:

psql@ # SELECT pg_current_xlog_location();
pg_current_xlog_location
--------------------------
0/D002B60

On the standby look at the last xlog received.

psql@efmstdby # SELECT pg_last_xlog_receive_location();
pg_last_xlog_receive_location
-------------------------------
0/D002B60

If the master and the standby are in sync these values should be the same.

Another way of looking at this on linux is to run ‘ps gx’ on the master and the slave. Look for the wal_sender on the master and the wal_receiver on the standby. You typically will see the xlog location reported there too.

Fine. -Enjoy.

After working on the Map-Reduce word counting example and tinkering with a couple of other map-reduce programs, I thought I would try something with Hive. I ended up doing the same word count exercise because it is simple to set up, and though trivial, gives a nice contrast to using map-reduce directly. Briefly, Hive presents a SQL like interface to Hadoop. The Hive query language is called QL and like SQL it is declarative. That means one would use QL to declare what results are wanted and not specify any detailed procedures for getting the result. Hive goes and figures out the best way to answer the request. This is in contrast to the map-reduce example I presented before which uses a procedural approach, one has to program all the steps in the form of mappers, and reducers in order to produce the desired result. QL hides the map-reduce procedures from the user just like a SQL based relational database abstracts away the procedures and structures used to return results. In that way QL is easier to use than programming map-reduce but it is less flexible and requires more structure to the data. To be fair using map reduce directly in Hadoop also hides much of the details of distributed computing by automatically partitioning input data, scheduling tasks across a cluster, scaling, inter-machine communication and handling failures. This allows programmers to focus their work of how to express a problem in terms of key/value pairs and the needed aggregation on like keys etc. Once that is done what remains is writing rather straight forward mapper and reducer methods. This is further aided by using and extending the rich set of Hadoop and Java API’s that are part of the implementation and framework. Hive sits on top of all this hiding even more of the complexity of haddoop. I do not want to create the false impression that Hive and map-reduce have nothing to do with each other. Hive takes the QL statements and “compiles” them into map-reduce jobs and submits them to the Hadoop cluster for execution. The results emitted by the reduce phase are displayed by Hive to the end user. As you would expect, the query language needs to have some structure in order for it to read,write and process the data. As such, in order to use Hive you must create a schema and load data into it. In order to facilitate that, Hive has a data definition language (DDL). used to create tables with attributes and data types and other structures. Hive makes Hadoop look a lot like a SQL database. Again, in the interest in guarding against false impressions, there are many other differences between Hive and relational technology. I suggest reading this for a more comprehensive description of Hive. Note that Hive use cases are data warehouse use cases involving long scans rather then short reads or updates as in a transactional application. I do not believe that Hive supports atomic transactions, commits and roll-backs, etc. Things are evolving fast however and a lot of work is being done by many to merge and blur the distinction between a SQL relational database and the Hadoop map-reduce HDFS framework. Specifically look at the work Hortonworks, Cloudera and Hadapt are doing on this topic. More then enough said, lets get on with the simple word count exercise using Hive. One more thing, the title of this blog is taken from an old song by the Nitty Gritty Dirt band entitled. “The House at Pooh Corner”, which was based on some famous children’s books by A. A Milne in which the main character is a stuffed bear named Winnie-the-pooh. Very famous indeed. Now on this this simple example. Step 1. Create the schema and prep the data. This example uses one table with one column in it. I called the table pooh and give it a column called words. Here is the interaction with Hive for this step.

hive > CREATE DATABASE mytest;
hive > USE mytest;
hive > CREATE TABLE pooh
(words string);
hive > DESC pooh;
OK
words	string
OK

So far looks pretty familiar to anyone who has used a database such as MySQL One thing to point out is we use a data type called string instead of varchar which is commonly used in a database. Step 2. Load the data into the table I wrote a short Java program to take a text file and output every word on one line. It converts all the text to lower cases and strips out punctuation except apostrophes and hyphens. The file with the single word lines is then loaded into the single column table created in step 1.

hive > LOAD DATA LOCAL INPATH
/home/tgendron/dataFiles/poohWords.out
INTO TABLE mytest.pooh;
Copying data from file:
/home/tgendron/dataFiles/poohWords.out
[...snip...]
Loading data to table mytest.pooh
chgrp: changing ownership of
'/user/hive/warehouse/mytest.db/pooh/poohWords.out':
 User does not belong to hive
[...snip...]
Time taken: 0.797 seconds

Once again, we see very familiar syntax to MySQL. Hive was pretty chatty about its actions and the error about the user not being in Hive does not seem to matter to this excersice. The data was loaded and queries on the table did what was expected. Step 3 Using Hive to count the words in a Winnie -the-Pooh book. .

hive > SELECT words, count(*) FROM pooh GROUP BY words;

Total MapReduce jobs = 1
 Launching Job 1 out of 1
Number of reduce tasks not specified.
Estimated from input data size: 1
[...snip]
Hadoop job information for Stage-1:
number of mappers: 1; number of reducers: 1
 [...snip]
Time taken: 36.297 seconds
[...snip]

And the results:

OK
 a         32
about      3
after      1
all       15
always     1
and       69
another    2
[snip]
terrifying     1
than     14
thank     8
thank-you     2
thanks     1
that     259
that's     37
thaws     1
the     717
the-pooh     2
their     11
them     52
themselves     4
then     139
there     111
[....snip...]
Time taken: 34.18 seconds

There you have it. Counting words with Hive. Something interesting to note is the time taken. Just over 34 seconds. I have scalability but not much speed. Lets try this with Vertica which is also a scalable solution.

=> \timing
Timing is on.
=> select words, count(*) from pooh group by words;
[...snip...]
Time: First fetch (1000 rows): 37.122 ms.
All rows formatted: 1821.101 ms

Wow big difference. Others have shown this sort of thing too. Look at this blog on counting triangles. There is also these results at UC Berkeley. Scalability, Availability, Speed. Pick two or three. I am a little partial to Vertica since I have a little experience with it. It would be good to see the how the performance of vertica compares to Spark and Impala. Just looking at the ratios on the Berkeley benchmarks of these againts Hive, I think Vertica has a significant performance advantage. And you don’t give up scalability or availability. Also you can use the same commodity hardware. Not to mention excellent compression. Just saying…. Use the best tool for the job what ever that may be for your particular task.

Tom Gendron

Technica for data management, Linux and other things

Setting up Postgres Streaming Replication

2. Copy the master database’s base directory to the standby

3. Set up standby instance.

3.1 Create a Recovery File on the Standby

Verification

Count all the Bees in the “Hive”