Sunday, May 13, 2012

A script to see row counts and table size in PostgreSQL

Here is a little perl script that I use to give me a list of tables in a schema, a rowcount and the pg_relation_size of a table. The usage is just:
pg_rowcount <database> <schema>
ex: ./pg_rowcount.pl david.kerr pg_catalog
The output looks like this:
Table                                Rows                 Size
--------------------------------------------------------------
pg_statistic                         1853              1672 kB
pg_type                               910               168 kB
pg_attribute                         9569              1488 kB
pg_class                             1371               416 kB
pg_authid                              52                16 kB
pg_index                              766               112 kB
pg_operator                           705               104 kB
pg_database                            44                16 kB
[...]
You can download the script here

How To Fix Stalled CUPS Printing

Every once in a while CUPS printing just stalls, no reason that I can tell. (I blame the network, naturally) but what ends up happening is that the print jobs get backed up and you need to clear out the queue. The fastest way to do this is to do:
rm /var/spool/cups/*
/etc/init.d/cups restart

How To Reset Sequences in PostgreSQL

I use a lot of "serial" data type fields, sometimes the nextval of my sequence doesn't reflect the max() of my table. Since I do a lot of moving data around these can get messed up. The basic command that you want to use to reset a sequence is:
select setval('sequence_name', value);
However, the cool thing about Postgres is that can be a query too so I use:
select setval('sequence_name', select max(column_name) from tablename);
So I wrote a little perl script to look through all of the columns that have a default like "nextval" (which if you look in the database is how a serial column actually looks) and sets the sequence to the max of the value in that table. You can get it here
reset_requences.pl -d  [-s ] [-u ] [-p ]
If you specify a database it will reset all of the columns in the database If you specify a schema it will reset all of the columns within a specific schema. If you specify a user it will reset all of the columns for a specific user.

Using generate_series() in PostgreSQL

If you want to dummy up a bunch of data in PostgreSQL you can use this neat little trick
create table test (a int);
insert into test (select generate_series(0,999,1));
INSERT 0 1000;

Using insert .. returning with perl and PostgreSQL

PostgreSQL provides the ability to return a value on insert. so for example:

create table foo (id serial, name text);
insert into foo('dave') returning id;
Will return the auto-generated value for "id" assigned to 'dave'
You can leverage this from within Perl by doing something like this:
my $foo_insert = "insert into foo(?) returning id"
my $foo_cur = $dbh->prepare($foo_insert);
$foo_cur->execute('Dave');
my $foo_rec = $foo_cur->fetchrow_hashref();
print $foo_rec{"id"};

Parsing Large files with Pgfouine

ppgfouine is a nice logfile analyzer for PostgreSQL written in php.

When doing a trace on a very long running ETL process the logfile generated was ~11GB.

This left me running up against a 2GB barrier in php for fopen().

If you've got a 64bit machine and can recompile php with -D_FILE_OFFSET_BITS=64 then you're good to go. But in my case, I wasn't able to do either.

 The error i'd get is: PHP Fatal error: File is not readable. in /var/lib/pgsql/pgfouine-1.1/include/GenericLogReader.class.php on line 85

The simple solution was to use a named pipe since pgfouine expects a file and doesn't seem to be able to read from stdin.

mknod /tmp/pg2 p
cat /var/log/postgres > /tmp/pg2 | ./pgfouine.php -file /tmp/pg2 > bla.html
Another variation on this, for those of us using UTC and find that our postgres log is rotated mid-day is to do something like:
(gzcat /var/log/postgres.xyz.gz && cat /var/log/postgres) > /tmp/pg2 | ./pgfouine.php -file /tmp/pgt > bla.html

How To Add Quotes to results in dynamic SQL


I frequently find the need to have single quoted output when i generate dynamic SQL.

It's always a pain to remember the exact number of ticks needed to get the quoted output.

Here is a reminder on how to do it:
select ''''||schemaname||'.'||tablename||'''' from pg_tables
i.e., four quotes when there's no text, or three quotes on the outside and one on the inside when there is text.
select '''public.'||tablename||''';' from pg_tablesselect 'select pg_size_pretty(pg_relation_size('''||tablename||'''));' from pg_tables where schemaname = 'bla';

Moving to blogger.com

Until now, I've more or less preferred to run my own hosted blogging platform. However, my server is getting old and unreliable and with the plethora of free services available, I can't justify the cost of an upgrade or really continued hosting.

So I'll give this a try and see if I can live with it. So far I like the customization options.

I've migrated my most popular posts, however I haven't kept the permalinks in tact so you may need to search or check the tags. (there wasn't much there anyway)
Web Statistics