PostgreSQL Clustering: JDBC

Now that I've got my basic active/passive cluster set up using the shared disk Linux heartbeat method mentioned here, one thing is left: allowing my Java app to fail over to the new database without re-coding the app.

Without changing the JDBC driver, you would have to catch the failure at the Java container level or in the app itself and manage the switch from the down node to the active node.
I don't think that's "industry standard", and it's certainly not easy.
The normal way is to let the JDBC driver manage it.
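
To make the application-level alternative concrete, here is a rough, language-neutral sketch written in Python rather than the Java the app would actually use. Everything in it is hypothetical: `connect_with_failover`, the `fake_connect` stand-in for a real driver call, and the node names `db1` and `db2` are illustrations only, not part of any driver.

```python
from typing import Callable, Sequence, TypeVar

Conn = TypeVar("Conn")


def connect_with_failover(nodes: Sequence[str],
                          connect: Callable[[str], Conn]) -> Conn:
    """Try each node in order and return the first connection that opens."""
    errors = []
    for node in nodes:
        try:
            return connect(node)
        except ConnectionError as exc:
            # Assumption: a down node surfaces as ConnectionError.
            errors.append(f"{node}: {exc}")
    raise ConnectionError("no database node reachable: " + "; ".join(errors))


def fake_connect(node: str) -> str:
    """Hypothetical stand-in for a driver call; 'db1' plays the failed node."""
    if node == "db1":
        raise ConnectionError("connection refused")
    return f"connected to {node}"


result = connect_with_failover(["db1", "db2"], fake_connect)
print(result)
```

This only covers opening a new connection. Work that was in flight on the failed node still has to be retried by the app, which is part of why pushing the logic down into the driver is attractive.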

Unfortunately, the PostgreSQL JDBC driver doesn't handle this event out of the box, so we need to bring in a third-party tool.

There aren't a lot of options in this area; here are two:

I found a good discussion around HA-JDBC here

I'm using Hibernate + Geronimo, so I need to do some testing to see whether that's going to work with HA-JDBC, but from the sounds of it, it should work just fine.

I'll need to evaluate both of these to determine which is the best.

PostgreSQL HA Clustering Options

I've been evaluating PostgreSQL clustering options for my current project.

The reason I'm looking at clustering is that the DB server will be handling a large number of users and any downtime is catastrophic. So reliability comes before any performance or administrative concerns in a clustering solution.

My platform is PostgreSQL 8.3 and SLES Linux.

I looked at 4 solutions:
Option 1: Shared Disk (Heartbeat) Cluster (Heartbeat: SLES)
Option 2: Filesystem Replication Based (DRBD / GNBD)
Option 3: DB Replication Based (Slony-I)
Option 4: DB Replication Based (PGCluster)

I weighed the pros and cons of each of them and eventually chose Option 1 as the best for my needs.

I like the heartbeat solution because:

  • It's simple
  • There's no data loss in a shared disk cluster
  • There's no replication overhead so no performance impact

Unfortunately, there is very little public documentation regarding heartbeat clusters used with PostgreSQL. I hope to rectify that over the next weeks and months, so stay tuned.

HowTo: Baan Tracing DBSLOG (Part III)

Now let's get into something slightly more complicated.

DBSLOG to help find out what's going on when a session "hangs"

I've placed a lock on a record in the item master. The bottom of the dbs.log looks like this:

------ QPS Input Row -------
Bind :1   : string  : [0x8185d20] '`               '
----- DBMS Where Input ----
Bind nr 1 : item       : string  : '`               '
SQL> ora_multi_execute( 0x81cb0d8 ) do 0, prefetch 1

There is a lot of information in these few lines.
First: Bind nr 1 : item : string : '` '
is the binding of "item" to the string "` " which is a dummy item in my database.

Second: SQL> ora_multi_execute( 0x81cb0d8 ) do 0, prefetch 1
tells us the hex handle (0x81cb0d8) of the cursor being executed, which was parsed further up in the dbs.log file.

Now, to find the query that's blocked, we just take that hex value 0x81cb0d8 and search backwards in the file:

SQL> ora_parse( "SELECT /*+  index(a ttiitm001200$idx1) */ t$item,t$dsca,t$dscb,t$dscc,t$dscd,t$wght,t$seak,
t$seab,t$kitm,t$citg,t$ctyp,t$csel,t$csig,t$cvat,t$txta,t$uset,t$cuni,t$stgu,t$cwar,t$kltc,t$obpr,t$kpsl,t$npsl,t$pics,
t$abcc,t$lcod,t$uscu,t$usab,t$slmp,t$serv,t$sfst,t$maxs,t$csps,t$cspd,t$cfmd,t$scst,t$stoc,t$blck,t$ordr,t$allo,
t$hall,t$quot,t$ltdt,t$opol,t$osys,t$omth,t$oqmf,t$mioq,t$maoq,t$umer,t$fioq,t$ecoq,t$reop,t$oint,t$ddfq,t$oltm,
t$sftm,t$fodt,t$ocst,t$auso,t$cpha,t$oqdr,t$repi,t$scdl,t$orip,t$pcrp,t$stpm,t$mrpc,t$plmm,t$mrpo,t$eitm,t$bfcp,
t$bfep,t$bfhr,t$ndrp,t$nnts,t$qpnt,t$unom,t$runi,t$scpf,t$crmp,t$tmfc,t$roun,t$ncst,t$llcd,t$llci,t$mrpi,t$stmr,
t$cuqp,t$cupp,t$cpgp,t$csgp,t$pcgp,t$ccur,t$ltpp,t$prip,t$avpr,t$ltpr,t$suno,t$qual,t$purc,t$txtp,t$cuqs,t$cups,
t$cpgs,t$csgs,t$cmgp,t$rbgp,t$pris,t$ltsp,t$prir,t$umsp,t$lmsp,t$ccde,t$ctyo,t$txts,t$cpcp,t$copr,t$matc,t$oprc,
t$cuid,t$actf,t$ltcp,t$stva,t$buyr,t$cplb,t$cppp,t$ccit,t$ccfu,t$ccco,t$prre,t$copt,t$cprp,t$itmt,t$proi,t$cont,t$cntr,
t$czed,t$reli,t$assi,t$potc,t$ffsi,t$qbsi,t$osyc,t$ufra,t$nobd,t$blcm,t$dcnt,t$exkb,t$itm2,t$Refcntd,t$Refcntu 
FROM baan.ttiitm001200 WHERE t$item=:1 FOR UPDATE WAIT 9000 " ) 
SQL> ora_parse( 0x81cb0d8 )

You see the ora_parse( 0x81cb0d8 ), which matches up with the ora_multi_execute.

So you know that in this particular case, the session is "hanging" because it cannot acquire a lock on ttiitm001200.
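
The backwards search can also be scripted. The following is a minimal sketch, assuming the layout shown above: the quoted statement in `ora_parse( "..." )` is immediately followed by `ora_parse( <handle> )` for the same cursor. The function names are my own, and the sample log is an abbreviated version of the trace above.

```python
import re
from typing import Optional


def last_executed_handle(log_text: str) -> Optional[str]:
    """Return the cursor handle of the last ora_multi_execute in the log."""
    handles = re.findall(r"ora_multi_execute\( (0x[0-9a-fA-F]+) \)", log_text)
    return handles[-1] if handles else None


def find_statement(log_text: str, handle: str) -> Optional[str]:
    """Search backwards for the SQL text parsed for the given handle."""
    end = log_text.rfind(f"ora_parse( {handle} )")
    if end == -1:
        return None
    opener = 'ora_parse( "'
    # Assumption: the nearest quoted ora_parse before the handle line is its SQL.
    start = log_text.rfind(opener, 0, end)
    if start == -1:
        return None
    body = log_text[start + len(opener):end]
    close = body.rfind('" )')
    if close == -1:
        return None
    # Collapse the wrapped lines into one readable statement.
    return " ".join(body[:close].split())


sample = '''SQL> ora_parse( "SELECT t$item,t$dsca
FROM baan.ttiitm001200 WHERE t$item=:1 FOR UPDATE WAIT 9000 " )
SQL> ora_parse( 0x81cb0d8 )
SQL> ora_multi_execute( 0x81cb0d8 ) do 0, prefetch 1
'''

handle = last_executed_handle(sample)
statement = find_statement(sample, handle)
print(handle)
print(statement)
```

Running it against the tail of a real dbs.log should point you at the table named in the FROM clause, which is the one the session is waiting on.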

You use the same process to identify slow queries, wrong queries, etc.

HowTo: Baan Tracing DBSLOG (Part II)

Now that you've generated your DBSLOG=1570 output, let's look at what you've got:
DBSLOG is nice enough to tell you in your logfile exactly what flags were used to generate it.

<6692> 2009-11-09[20:17:44]: Logging started mode 01570
        ---- LOG ROW   INFO [0000010] ----
        ---- LOG TABLE INFO [0000020] ----
        ---- LOG DB    INFO [0000040] ----
        ---- LOG DBMS  INFO [0000100] ----
        ---- LOG SQL   INFO [0000400] ----
        ---- LOG DEBUG INFO [0001000] ----

This next part is more valuable than you might think. These are the parameters that your bshell/driver is picking up. This is a great place to look if you've set a parameter in your db_resource or tabledef and it doesn't appear to be getting used.

It's also a great way to find hidden tuning parameters ;)

Portingset mode 6.1c

oracle_attach_server 
Timeout values : [9000] [900] [900] [900] [900]
allocate_sql_buffer_area size 32000.
oracle client version = 10.2.0.4.0
oracle_attach_server done.
Resources:      (E = env, P = putenv, D = default )
    lock_retry                E     "0"
    locale                      D   ""
    bdb_max_session_schedule  E D   -1
    bdb_max_sessions          E D   -1
    rds_full                  E     5
    tt_sql_trace              E D   037777777777 (oct)
    dbslog                    E     01570 (oct)
    dbslog_name               E D   "dbs.log"
    dbsinit                   E     021 (oct)
    baan_single_shot_queries  E D   0
    max_tables_joined         E D   -1
    cs_owner                    D   0 (oct)
    enable_refmsg             E D   0
    use_binary_columns        E D   0
    hint_idx_weight_equal     E D   1
    hint_idx_weight_range     E D   1
    hint_idx_weight_factor    E D    1.00000000>
    tbase_refresh               D   -1
    concat_expr               E D   ""
    aud_init                  E D   0
    audit_mask                E D   0 (oct)
    baan_sql_trace            E D   0 (oct)
    baan_sql_cacherows        E D   -1
    nls_lang                  EP    "american_america.we8iso8859p1"
    nls_sort                  EPD   ""
    oracle_home               EP    "/apps/oracle/10gR2"
    oracle_service_name       E D   ""
    oracle_sid                EP    "BAANTEST"
    two_task                  E D   ""
    oracle_client_home        E D   "/apps/oracle/10gR2"
    oracle_server_home        E D   "/apps/oracle/10gR2"
    oracle_local_template     E D   ""
    ora_sqlnet_compression    E D   1
    epc_disabled              EPD   "TRUE"
    ora_max_array_fetch       E     5
    ora_max_array_insert      E     5
    baan_oracle_prefetch      E D   1
    ansi_outer_join           E D   0
    use_oci7_interface        E D   0
    max_free_cursors          E D   32
    retained_cursors          E D   50
    max_sql_buffer            E D   32000
    sql_trace                 E D   ""
    ora_timeout               E     "{9000,900,900,900,900}"
    optimizer_goal            E D   ""
    ora_alter_session         E D   ""
    ora_default_tablespace    E     "baandata"
    ora_temporary_tablespace  E     "temp"
    ora_init                  E     0101000 (oct)
    ora_level1                E D   0
    max_open_cursors          E D   245
    ora_date                  E D   0
    oraprof                   E D    -1.00000000>
    oraprof_name              E D   "oraprof"
    orastat                   E D    -1.00000000>
    orastat_name              E D   "orastat"
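
If you check this listing often, a small script can pull out the parameters that were explicitly set. This is only a sketch: it assumes each resource line has the form name, source flags, value, and it reads a missing D flag as "not left at the default", which is my interpretation of the legend rather than something the log states. The function names are my own; the sample lines are copied from the listing above.

```python
import re

# name, then source flags (E, P, D, optionally space-separated), then the value
RESOURCE_LINE = re.compile(r"^\s*(\w+)\s+([EPD](?: ?[EPD])*)\s+(\S.*)$")


def parse_resources(log_text):
    """Map resource name -> (source flags, value) for each listing line."""
    resources = {}
    for line in log_text.splitlines():
        match = RESOURCE_LINE.match(line)
        if match:
            name, flags, value = match.groups()
            resources[name] = (flags.replace(" ", ""), value.strip())
    return resources


def explicitly_set(resources):
    """Names whose flags lack D (assumed to mean 'overridden')."""
    return sorted(n for n, (flags, _) in resources.items() if "D" not in flags)


sample = '''Resources:      (E = env, P = putenv, D = default )
    lock_retry                E     "0"
    locale                      D   ""
    rds_full                  E     5
    dbslog                    E     01570 (oct)
    max_open_cursors          E D   245
    oracle_sid                EP    "BAANTEST"
'''

resources = parse_resources(sample)
overridden = explicitly_set(resources)
print(overridden)
```

If a parameter you set in db_resource is missing from the result, that is a hint the driver never picked it up.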

Next it starts the connection to the database process. I'll simulate an error 510 (database not on) so you can see where it would go wrong in the connection.
Here you see the error 510 and the resulting Oracle errors:

Error ORA-1034 occurred during logon.
ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory

and the info you need to find the problem:

ORACLE_SID=bla

Naturally, my database SID isn't "bla".

Here's the full trace:

Msg_type 13 received.
Msg_type 2 received.
oracle_open_session sid 1.
Open session : logon 
ora_logon : name = 'bsp', pwd = .. ora_session 0x816f588 
using resource oracle_local_template
oci_link_server ( dbase = '(DESCRIPTION=(SDU=8192)(TDU=8192)(ADDRESS=(PROTOCOL=beq)(PROGRAM=/apps/oracle/10gR2/bin/oracle)(ARGV0=oraclebla)(ARGS='(DESCRIPTION=(LOCAL=YES)(SDU=8192)(TDU=8192)(ADDRESS=(PROTOCOL=beq)))')(ENVS='ORACLE_HOME=/apps/oracle/10gR2,ORACLE_SID=bla'))(CONNECT_DATA=(SID=bla)))' ) : first time linked : attach connection : # logon = 1 : OK
ora_logon : OCISvcCtx  0x81b2a78 Allocated
oracle server version = 10.2.0.4.0

ora_error 1034 dbs_errno 510.
oci_unlink_server ( dbase = '(DESCRIPTION=(SDU=8192)(TDU=8192)(ADDRESS=(PROTOCOL=beq)(PROGRAM=/apps/oracle/10gR2/bin/oracle)(ARGV0=oraclebla)(ARGS='(DESCRIPTION=(LOCAL=YES)(SDU=8192)(TDU=8192)(ADDRESS=(PROTOCOL=beq)))')(ENVS='ORACLE_HOME=/apps/oracle/10gR2,ORACLE_SID=bla'))(CONNECT_DATA=(SID=bla)))', # logon = 1 ) : detach connection : OK
ora_logon error: ora_session 0x816f588 err 510
FATAL /view/port.6.1c.07.08/vobs/tt/servers/ORACLE_2/ora_native.c:#1769 :

Error ORA-1034 occurred during logon.
ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory

Error BDB-510 returned.
Check the Oracle settings:
NLS_LANG = 'american_america.we8iso8859p1'
ORACLE_HOME (client) = '/apps/oracle/10gR2' (resource oracle_client_home)
ORACLE_HOME (server) = '/apps/oracle/10gR2' (resource oracle_server_home)
Oracle Service Name = '' (resource oracle_service_name --> TWO_TASK)
ORACLE_SID = 'bla'
Connection used oracle_local_template ( ? = oracle_server_home, @ = oracle_sid ):
(DESCRIPTION=(SDU=8192)(TDU=8192)(ADDRESS=(PROTOCOL=beq)(PROGRAM=/apps/oracle/10gR2/bin/oracle)(ARGV0=oraclebla)(ARGS='(DESCRIPTION=(LOCAL=YES)(SDU=8192)(TDU=8192)(ADDRESS=(PROTOCOL=beq)))')(ENVS='ORACLE_HOME=/apps/oracle/10gR2,ORACLE_SID=bla'))(CONNECT_DATA=(SID=bla)))

Logon failed; errno 510
Msg_type 1 received.
In detach server 
oracle_detach_server 
oracle_detach_server done.
In detach server 

HowTo: Baan Tracing DBSLOG (Part I)

DBSLOG is my favorite toy. It logs all communication between the driver and the database.

I'm mostly familiar with the Oracle driver, so that's what I'll show here. Other drivers look different in DBSLOG.

Here are the options that can be sent to DBSLOG:

00000 - Turn DBSLOG tracing off
00001 - Display data dictionary information on queried tables
00002 - Display query information for level one database queries
00004 - Display query execution plan during a level two query
00010 - Row level action (read, modify, delete) information
00020 - Table level action information
00040 - Translation information
00100 - DBMS input/output data on a level two query
00200 - Administrative file information for SQL drivers
00400 - DBMS SQL query statements
01000 - General debug statements
02000 - Query processing information
04000 - Data buffering information passed thru communication channels

Basically, you add the values together to get the level of debugging you want.

You might think that setting DBSLOG=99999 would give you everything you need, but that's not always true. Some flags are mutually exclusive, and some flags add so much information that they make the file unreadable.

That's why I always use DBSLOG=01570 which gives you:

General debug statements (1000) + 
DBMS SQL query statements (400) + 
DBMS input/output data on a level two query (100) + 
Translation information (40) + 
Table level action information (20) + 
Row level action (read, modify, delete) information (10)
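
The addition can be checked mechanically. Below is a minimal sketch that treats the option values as octal bit flags, an assumption based on the driver itself reporting the setting as `01570 (oct)` in its resource listing. The names `DBSLOG_FLAGS`, `decode_dbslog` and `encode_dbslog` are my own.

```python
# Option values from the list above, written as octal literals.
DBSLOG_FLAGS = {
    0o0001: "Data dictionary information on queried tables",
    0o0002: "Query information for level one database queries",
    0o0004: "Query execution plan during a level two query",
    0o0010: "Row level action (read, modify, delete) information",
    0o0020: "Table level action information",
    0o0040: "Translation information",
    0o0100: "DBMS input/output data on a level two query",
    0o0200: "Administrative file information for SQL drivers",
    0o0400: "DBMS SQL query statements",
    0o1000: "General debug statements",
    0o2000: "Query processing information",
    0o4000: "Data buffering information passed thru communication channels",
}


def decode_dbslog(value):
    """List the options switched on by a DBSLOG value such as '01570'."""
    mask = int(value, 8)  # assumption: the value is read as octal
    return [name for bit, name in sorted(DBSLOG_FLAGS.items(), reverse=True)
            if mask & bit]


def encode_dbslog(bits):
    """Combine individual flag values into a five-digit DBSLOG setting."""
    mask = 0
    for bit in bits:
        mask |= bit
    return format(mask, "05o")


enabled = decode_dbslog("01570")
setting = encode_dbslog([0o1000, 0o400, 0o100, 0o40, 0o20, 0o10])
for name in enabled:
    print(name)
print(setting)
```

Because each option is a separate bit, adding the values and OR-ing them give the same result as long as no option is counted twice.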

Unlike the bshell debugging commands, DBSLOG is set via the environment, so you would use:

Unix:
export DBSLOG=01570
Windows:
set DBSLOG=1570
Baan:
-- -set DBSLOG=1570

The logfile will be called dbs.log and will be located in your home directory on Unix or in $BSE/tmp on Windows.

