PostgreSQL Clustering: HA-JDBC

A little while ago I posted about PostgreSQL clustering and I said that I was going to evaluate HA-JDBC as an option.

The reason I wanted to use HA-JDBC is that I was looking for a no-coding required solution for seamless fail over. (Similar to Oracle RAC) for PostgreSQL.

I’ll be using a Shared Disk / Heartbeat cluster on the server side, however when the node fails the application will register an error, which is undesirable.

After doing more research it’s been determined that HA-JDBC won’t work.

It seems that HA-JDBC is, at best, a SQL replicator, where you have 2 active nodes and HA-JDBC will perform inserts and updates across both databases to keep them in sync. This is fine if you’re not using the “serial” data type in PostgreSQL, Triggers, Functions, time based default values, etc. Using any sort of trigger, the code fires independently on insert and you end up getting out of sync databases.

Another reason why HA-JDBC won’t work is that if a node is un-reachable HA-JDBC removes it from consideration. So your 2nd “standby” when it becomes active won’t be considered by HA-JDBC without some manual intervention.

And finally, I’d advise steering away from HA-JDBC at this point even if the above works for you. I posted a few clarifying questions regarding the above to both the HA-JDBC forums and their mailing list and received no response. If your business is looking into true high availability for their servers you need to choose all of your components with care. A non-responsive community either means the project is dead, or un-caring, both of which are unacceptable when you’re looking into HA solutions.

Finally got a response from Paul Ferraro, Nice guy, very helpful posted below:

Sorry for the slow response...

HA-JDBC is not the right tool for this job.  HA-JDBC is an
*alternative* to shared disk failover - it was not designed
to be used in concert with it.  Instead, you want something
like JBoss HA DataSource or Weblogic multi-pools.  These are
DataSource proxies whose getConnection() method returns a
raw connection from the first available data source.
DataSource-level.  Connections returned by HA-JDBC's DataSource,
on the other hand, are proxies to connections to each active
database in your cluster.

I can go into more detail if you'd like, such as the advantages/
disadvantages of HA-JDBC over shared-disk failover, if you're
interested.

Paul

Parsing large files with pgfouine in linux

pgfouine is a nice logfile analyzer for PostgreSQL written in php.

I’m doing a trace on a very long running ETL process and the logfile generated is ~11GB.

I’m running up against a 2GB barrier in php for fopen(). If you’ve got a 64bit machine and can recompile php with -D_FILE_OFFSET_BITS=64 then you’re good to go. But in my case, I can’t do either.

The error i’d get is:

PHP Fatal error:  File  is not readable. in /var/lib/pgsql/pgfouine-1.1/include/GenericLogReader.class.php on line 85

So for Plan B I had to remember back to the days when 64 bit wasn’t even an option (back in my day, we had 8 bits and we liked ‘em!)

I used a named pipe since pgfouine expects a file and doesn’t seem to be able to read from stdin.

mknod /tmp/pg2 p
cat /var/log/postgres > /tmp/pg2 | ./pgfouine.php -file /tmp/pg2 > bla.html

Once that kicked off I stopped getting that error and pgfouine was able to process the file.

PostgreSQL Clustering: JDBC

Now that I’ve got my basic active/passive cluster setup using the shared disk Linux heartbeat method mentioned here. One thing is left, and that’s allowing my java app to fail-over to the new database without re-coding the app.

Without updating the JDBC driver you would have to catch the failure at the java container level or in the app itself and manage the switch from the down node to the active node.
I don’t think that’s “industry standard” and it’s certainly not easy by any means.
The normal way is to let the JDBC driver manage it.

Unfortunately the PostgreSQL JDBC driver doesn’t handle this event out of the box so we need to invoke a 3rd party.

There aren’t a lot of options in this area here are two:

I found a good discussion around HA-JDBC here

I’m using Hibernate + Geronimo so i need to do testing to see if that’s going to work with HA-JDBC but from the sounds of it, it should work just fine.

I’ll need to evaluate both of these to determine which is the best.

PostgreSQL HA Clustering Options

I’ve been evaluating PostgreSQL clustering options for my current project.

The reason I’m looking at clustering is that the DB server will be handling a large number of users and any downtime is catastrophic. So reliability comes before any performance or administrative concerns in a clustering solution.

My platform is PostgreSQL 8.3 and SLES Linux.

I looked at 4 Solutions:
Option 1: Shared Disk (Heartbeat) Cluster (Heartbeat: SLES)
Option 2: Filesystem Replication Based (DR:BD / GNDB)
Option 3: DB Replication Based (Slony I)
Option 4: DB Replication Based (PGCluster)

I weighed the pro’s and con’s of each of them and eventually chose Option 1 as the best for my needs.

I like the heartbeat solution because:

  • It’s simple
  • There’s no data loss in a shared disk cluster
  • There’s no replication overhead so no performance impact

Unfortunately, there is very little public documentation regarding heartbeat clusters used with PostgreSQL. I hope to rectify that over the next weeks and months, so stay tuned.

HowTo: Baan Tracing DBSLOG (Part III)

Now let’s get into something slightly more complicated.

DBSLOG to help find out what’s going on when a session “hangs”

I’ve placed a lock on a record in the item master. The bottom of the dbs.log looks like this:

------ QPS Input Row -------
Bind :1   : string  : [0x8185d20] '`               '
----- DBMS Where Input ----
Bind nr 1 : item       : string  : '`               '
SQL> ora_multi_execute( 0x81cb0d8 ) do 0, prefetch 1

There is a lot of information in these few lines.
First: Bind nr 1 : item : string : ‘` ‘
is the binding of “item” to the string “` ” which is a dummy item in my database.

Second: SQL> ora_multi_execute( 0×81cb0d8 ) do 0, prefetch 1
Tells us the hex value for the cursor further up in the dbs.log file that we’re executing.

Now to find out the query that’s locked we just use that hex value 0×81cb0d8 and search backwards in the file:

SQL> ora_parse( "SELECT /*+  index(a ttiitm001200$idx1) */ t$item,t$dsca,t$dscb,t$dscc,t$dscd,t$wght,t$seak,
t$seab,t$kitm,t$citg,t$ctyp,t$csel,t$csig,t$cvat,t$txta,t$uset,t$cuni,t$stgu,t$cwar,t$kltc,t$obpr,t$kpsl,t$npsl,t$pics,
t$abcc,t$lcod,t$uscu,t$usab,t$slmp,t$serv,t$sfst,t$maxs,t$csps,t$cspd,t$cfmd,t$scst,t$stoc,t$blck,t$ordr,t$allo,
t$hall,t$quot,t$ltdt,t$opol,t$osys,t$omth,t$oqmf,t$mioq,t$maoq,t$umer,t$fioq,t$ecoq,t$reop,t$oint,t$ddfq,t$oltm,
t$sftm,t$fodt,t$ocst,t$auso,t$cpha,t$oqdr,t$repi,t$scdl,t$orip,t$pcrp,t$stpm,t$mrpc,t$plmm,t$mrpo,t$eitm,t$bfcp,
t$bfep,t$bfhr,t$ndrp,t$nnts,t$qpnt,t$unom,t$runi,t$scpf,t$crmp,t$tmfc,t$roun,t$ncst,t$llcd,t$llci,t$mrpi,t$stmr,
t$cuqp,t$cupp,t$cpgp,t$csgp,t$pcgp,t$ccur,t$ltpp,t$prip,t$avpr,t$ltpr,t$suno,t$qual,t$purc,t$txtp,t$cuqs,t$cups,
t$cpgs,t$csgs,t$cmgp,t$rbgp,t$pris,t$ltsp,t$prir,t$umsp,t$lmsp,t$ccde,t$ctyo,t$txts,t$cpcp,t$copr,t$matc,t$oprc,
t$cuid,t$actf,t$ltcp,t$stva,t$buyr,t$cplb,t$cppp,t$ccit,t$ccfu,t$ccco,t$prre,t$copt,t$cprp,t$itmt,t$proi,t$cont,t$cntr,
t$czed,t$reli,t$assi,t$potc,t$ffsi,t$qbsi,t$osyc,t$ufra,t$nobd,t$blcm,t$dcnt,t$exkb,t$itm2,t$Refcntd,t$Refcntu
FROM baan.ttiitm001200 WHERE t$item=:1 FOR UPDATE WAIT 9000 " )
SQL> ora_parse( 0x81cb0d8 )

You see the ora_parse (0×81cb0d8) which matches up with the ora_multi_execute

So you know that in this particular case, the session is “hanging” because it cannot acquire a lock on ttiitm001200.

You use the same process to identify slow queries, wrong queries, etc.