Database consulting and application development...

...because your data is your business

Dec. 6, 2018, 12:37 p.m.

BDR talk by Mark Wong of 2nd Quadrant

In the Bay Area? This Wednesday, 2018-12-12, Mark Wong from 2nd Quadrant will be talking about BDR (Bi-Directional Replication, a form of multi-master) for PostgreSQL. This is a great chance to get inside, real-time info on BDR.

Multi-master is one of those features where when you need it, you really, really need it, and if you're in that category, this talk is for you. It's also of interest to anyone trying to figure out the best solution for scaling and redundancy beyond one machine and one data center.

To attend, you must RSVP at Meetup with your full name (for building security's guest list).


Posted by Quinn Weaver
Oct. 30, 2018, 10:56 p.m.

Remember your history

PostgreSQL keeps track of which WAL files go with which timelines in small history files. Each time a new timeline begins (for instance, when you promote a replica, or finish a point-in-time recovery), a history file is born. The file is written once and never updated. It's a simple system, and it works well and silently.

In fact, sometimes it works a little too silently.

At PostgreSQL Experts we've run into a problem where a client's history files disappeared: they were stored in S3, and a lifecycle configuration was in place that said to move everything over a certain age to Glacier. That's a good policy for WAL files!

Unfortunately, it's not a good policy for history files: without the latest history file you can't restore the latest backup, and without past history files, you are unable to do PITR to certain points in time.

The solution we used was to move the whole WAL archive to S3 Standard-Infrequent Access storage, dissolving the problem with lifecycle configurations while controlling costs. But you could also fix this by editing the lifecycle configuration.

The important thing is this: hold on to all history files. They're tiny text files, and when you need them, you really need them. This is also a good reason to test restores, not just of the latest backup, but of database states at arbitrary points in time.
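
A cheap safeguard is to check, before you need them, that your archive still contains its history files. Here's a sketch (the bucket and prefix are hypothetical):

# Timeline history files have names like 00000002.history.
aws s3 ls s3://my-backups/wal_archive/ | grep '\.history'

If you've ever promoted a replica or done a PITR, this should return at least one file.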

*    *    *

Addendum: another very common problem we see is WAL archives that become corrupted because a client accidentally pointed two primaries at the same WAL archive (for instance, they might have copied a postgresql.conf file by hand, or via a DevOps tool like Puppet). In this case, the whole archive is corrupted, and you're best off starting with a fresh S3 bucket or an empty directory and doing a new base backup immediately.

One of the many nice features of pgBackRest is that it will notice this and prevent you from doing it. Fewer footguns → better backups.


Posted by Quinn Weaver
Oct. 29, 2018, 11:23 p.m.

It's just an expression

PostgreSQL has lots of great features for text search.

In particular, there are a bunch of ways to do case-insensitive searches for text:

  • There's standard SQL ILIKE… but that can be expensive — especially if you put %'s at both start and end — and it's overkill if you want an exact string match.
  • The same goes for case-insensitive regexp matching: overkill for simple case-insensitive matches. It does work with indexes, though!
  • Then there's the citext extension, which is pretty much the perfect answer. It lets you use indexes and still get case-insensitive matching, which is really cool. It Just Works.
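
For completeness, here's what the citext route looks like (a minimal sketch, using an invented table so it doesn't collide with the example below):

sandbox# create extension if not exists citext;
sandbox# create table email_addresses_ci (addy citext);
sandbox# insert into email_addresses_ci values ('Quinn@Example.com');
sandbox# select addy from email_addresses_ci where addy = 'quinn@example.com';

The last query matches despite the case difference, and a plain index on addy would be used automatically.
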
OK, but what if you didn't have the foresight to use citext? And what if you don't want to go through the pain of changing the data type of that column? In a case like this, an expression index can be really handy.

Without an index, a case-insensitive match like this…

sandbox# explain analyze select addy from email_addresses where lower(addy) = 'quinn@example.com';

… is forced to use a sequential scan, lowercasing the addy column of each row before comparing it to the desired address:

                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Seq Scan on email_addresses  (cost=0.00..1.04 rows=1 width=32) (actual time=0.031..0.032 rows=1 loops=1)
   Filter: (lower(addy) = 'quinn@example.com'::text)
   Rows Removed by Filter: 3
 Planning time: 0.087 ms
 Execution time: 0.051 ms
(5 rows)


For my toy example, a four-row table, that's not too expensive, but for a real table with thousands or millions of rows, it can become quite a pain point. And a regular index on email_addresses(addy) won't help; the lower() operation forces a sequential scan, regardless of whether an index is present.

But an expression index will do the trick. An astute commenter noted that, in my toy example, walking the index is actually slower than a sequential scan. But for a table with many rows (most of which are not the one email address I'm looking for!) an index scan will be dramatically faster.

sandbox# create index email_addresses__lower__addy on email_addresses (lower(addy));
CREATE INDEX
sandbox# explain analyze select addy from email_addresses where lower(addy) = 'quinn@example.com';
                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using email_addresses__lower__addy on email_addresses  (cost=0.13..8.15 rows=1 width=32) (actual time=0.093..0.095 rows=1 loops=1)
   Index Cond: (lower(addy) = 'quinn@example.com'::text)
 Planning time: 0.105 ms
 Execution time: 0.115 ms
(4 rows)

In short, expression indexes are one of those neat tricks that can save you a lot of pain.


Posted by Quinn Weaver
Oct. 23, 2018, 11:09 p.m.

How to set up scram-sha-256 authentication in PostgreSQL

md5: everyone uses it. But it's insecure — unsuitable for authentication or password storage. PostgreSQL 10 adds support for SCRAM SHA-256 authentication, which is far better¹… but how do you use it? Documentation is scarce.

Below I give the complete steps, for two different cases:

  • The case where you migrate every user.
  • The case where you migrate only a subset of users. Step 1 explains why you might need to do this. In this case, you can save time and complexity by creating a containing role, adding all the users you want to migrate to that role, and then configuring that role to work with scram-sha-256. That way you don't have to do it once for each user.
Enough preamble; let's begin!
  1. First, decide which users to migrate to scram-sha-256. The ideal is to migrate everyone, all at once. In practice you may be unable to do that, for a few reasons:
    • Older client tools don't support scram-sha-256. You need at least psql 10, or pgAdmin 4.
    • Your app's client libraries may not support scram-sha-256. At a minimum, you will probably have to upgrade your client libraries².
    • And it's not enough just to upgrade your libraries; you also have to change the password for each user who's migrating, because PostgreSQL needs to hash this password in scram-sha-256 format. There's no automatic rehash (there can't be, since PostgreSQL doesn't store the cleartext of the old password).
    • Finally, besides your own code, you may have third-party apps connecting to your database, whose library versioning policy and password reset policy you don't control.
  2. Having chosen the users to migrate, upgrade your relevant client libraries, psql/postgresql-client packages, and/or pgAdmin versions.
  3. If you're migrating all users, skip to the next step. If, instead, you're migrating a subset of users, then create a new role, and put all those users in that role. This step saves a bit of repetitive config work.

    $ sudo su - postgres
    psql
    postgres=# create role scram_sha_256_users nologin ROLE alice, bob, eve;

    … where alice, bob, and eve are the users you're moving to scram-sha-256. We'll set their passwords in a minute.
  4. Edit pg_hba.conf (make a backup copy first, and don't do a reload yet! We'll do that in a future step). If you're migrating all users, you just change 'md5' to 'scram-sha-256' wherever it appears in the file. In that case, also edit your postgresql.conf to set

    password_encryption = 'scram-sha-256'

    If, instead, you're migrating a subset of users, the pg_hba.conf edits depend on which users you're adding. As appropriate for your setup, alter or remove any lines that tell PostgreSQL to use md5 authentication for your affected users; then add equivalent lines for scram-sha-256 authentication.

    For instance, if those users were formerly allowed to log in to all DBs with an md5 password, either locally or from any IPv4 host, then you would do the following. Note that this is only an example, and the particulars depend on your existing setup!

    # Local (Unix domain socket) connections:
    local all +scram_sha_256_users scram-sha-256
    # IPv4 connections, from any IP address:
    host all +scram_sha_256_users 0.0.0.0/0 scram-sha-256
  5. Coordinate with stakeholders to schedule a downtime. During the downtime, do the following:
  6. Stop your app, cron jobs, and any other code that will touch the database (or just be prepared for these to fail when they try to connect).
  7. Do a pg_ctl reload to load the new pg_hba.conf. Note that this will break logins for all the users you're migrating!
  8. Now it's time to change those passwords. Notably, you must pass the cleartext password to your alter user command, so that PostgreSQL itself can salt and (iteratively) hash it; unlike with md5, you can't do the hashing yourself. This is required by the SCRAM protocol.
  9. So first alter the postgres user's ~/.psqlrc, to make sure it won't save cleartext of password-related commands in its history file. Add this line at the bottom:

    \set HISTFILE /dev/null
  10. Likewise, make sure the cleartext won't go to the log. Do psql -U postgres, and set these settings (just for this one session):

    set log_min_duration_statement = -1;
    set log_statement = 'none';
  11. In the same psql session, change passwords for the relevant users. If your users are going to connect using psql, keep in mind that it won't accept passwords longer than 99 characters: longer passwords won't crash psql, but they will cause logins to fail.

    set password_encryption = 'scram-sha-256';
    alter role scram_sha_256_users set password_encryption = 'scram-sha-256';
    alter role bob with password 'bobsecret';
    alter role alice with password 'alicesecret';
  12. Securely communicate the new passwords to human users.
  13. Update ~/.pgpass entries with the new passwords, for both human users and non-human runners of cron jobs.
  14. Change other application password settings (config files, Vault, et cetera).
  15. Test, test, test! Make sure your apps, human users, and cron jobs can all connect.
  16. End the downtime.
  17. Remove this line from your ~/.psqlrc:

    \set HISTFILE /dev/null
With all that done, you are now on much more secure footing. Enjoy!
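
As a quick sanity check (assumes superuser access; alice and bob are the example users from above), you can confirm that the stored verifiers really are SCRAM rather than md5:

select rolname, left(rolpassword, 14) from pg_authid where rolname in ('alice', 'bob');

Migrated users will show a SCRAM-SHA-256$ prefix; any user still showing an md5 prefix hasn't been re-hashed yet.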

1. Because SCRAM passwords are stored as a salted, iterated hash, even if they're leaked, they're harder to crack. Also, the SCRAM authentication protocol is immune to MITM attacks using certs from compromised CAs.
2. Support varies among libraries, though most wrap libpq, and hence should Just Work if compiled with libpq 10 (JDBC is a notable exception; it doesn't use libpq, so you need to get the right JDBC version).


Posted by Quinn Weaver
Oct. 11, 2018, 7:57 p.m.

HyperLogLog at SFPUG

HyperLogLog is one of those simple, useful, and ingenious algorithms that everyone should know. In SQL terms, it's a way to do COUNT(DISTINCT …) estimates with predictable accuracy bounds, using only a tiny, fixed amount of memory.

At this month's San Francisco PostgreSQL Users' Group, Sai Srirampur will explain HLL's workings, the postgresql-hll extension, and its applications in distributed PostgreSQL à la Citus.
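
If you haven't used it, here's a flavor of the extension (a minimal sketch; the page_views table is invented, and the hll functions come from postgresql-hll):

create extension hll;

-- One row per day, each holding an HLL sketch of that day's visitors:
create table daily_uniques (visit_date date, visitors hll);

insert into daily_uniques
select visit_date, hll_add_agg(hll_hash_integer(user_id))
from page_views
group by visit_date;

-- Estimated count(distinct user_id) per day, from a few KB of state per row:
select visit_date, hll_cardinality(visitors) from daily_uniques;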

Check it out!


Posted by Quinn Weaver
Sept. 5, 2018, 5:05 p.m.

Locks talk this Friday, at PostgresOpen! (2018-09-07)

UPDATE: my slides are available; the video has yet to be published.

Attending PostgresOpen?

Come join me Friday for a gentle introduction to locks in PostgreSQL. My example-driven talk covers basic lock theory, tools for lock debugging, and common pitfalls and solutions. I hope to see you there!

Time and place info is on the PostgresOpen SV website.


Posted by Quinn Weaver
July 9, 2017, 5:19 p.m.

A word of praise for TextBelt

Quite often I need to kick off a long-running process for a client, then resume work immediately once it's done. pg_restore, pg_basebackup, pg_upgrade, and vacuumdb --analyze-only all come to mind as examples. The ideal thing is to get a text message upon completion. When I'm working on my own boxes, I can just mail(1) my mobile carrier's text gateway, but I can't count on clients' servers having sendmail or the like set up (they usually don't).

Enter TextBelt. This service is a dead-simple HTTP-to-SMS gateway. Adapted from their docs:


curl -X POST https://textbelt.com/text \
--data-urlencode phone="$MY_MOBILE_NUMBER" \
--data-urlencode message='The process completed.' \
-d key="$MY_TEXTBELT_API_KEY"

Cleartext HTTP is also supported, in case the client box has broken SSL libraries. My texts are always nondescript, so I don't mind sending them in the clear.
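
One way to use it is to chain the text onto the end of the long-running command, so it fires whether the job succeeds or fails (a sketch; assumes the two environment variables from above are already set):

vacuumdb --analyze-only --all   # or pg_restore, pg_basebackup, pg_upgrade…
curl -X POST https://textbelt.com/text \
--data-urlencode phone="$MY_MOBILE_NUMBER" \
--data-urlencode message="vacuumdb finished (exit status $?)" \
-d key="$MY_TEXTBELT_API_KEY"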

The whole thing is open-source, so you can set up your own TextBelt server. Or you can be lazy and throw a few dollars their way for a certain number of texts. I rather like this business model, actually, as a way to support open-source work.


Posted by Quinn Weaver
July 7, 2017, 3:47 p.m.

The PATH of cron

Short version

Postgres Plus keeps psql out of /usr/bin, so you need to set PATH in your cron jobs (including for WAL-E).

Longer version

Like all good people, I set up a cron job to run nightly WAL-E base backups. With one client, this failed the first time:

wal_e.main ERROR MSG: could not run one or more external programs WAL-E depends upon
DETAIL: Could not run the following programs, are they installed? psql
STRUCTURED: time=2017-07-07T03:00:01.957961-00 pid=25833

Turns out they were using Postgres Plus, and it puts psql in /opt/PostgresPlus/9.3AS/bin. That directory is in the enterprisedb user's PATH, of course, but not in the minimal PATH that cron jobs get by default. So I had to log in as enterprisedb, echo $PATH, and then paste PATH = [what echo said] at the top of my cron job. Moral of the story: take care when setting up cron jobs for PostgreSQL forks, or for non-standard community PostgreSQL installations.
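
For what it's worth, here's roughly the shape of the fixed crontab (a sketch; the schedule, data directory, and envdir setup are illustrative, not the client's actual values):

# cron permits variable assignments at the top of a crontab.
PATH=/opt/PostgresPlus/9.3AS/bin:/usr/local/bin:/usr/bin:/bin

# Nightly WAL-E base backup.
0 3 * * * envdir /etc/wal-e.d/env wal-e backup-push /opt/PostgresPlus/9.3AS/data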


Posted by Quinn Weaver
March 29, 2017, 7:37 p.m.

Change autovacuum_freeze_max_age without a restart (sort of…)

This blog post is kind of involved, so I'm giving a short version at the top, with some background for beginners at the bottom. The middle section explains the motivation for using this hack in the first place.

Short version

I came up with a useful and/or terrible hack the other day: setting autovacuum_freeze_max_age as a storage parameter. I definitely don't recommend doing this routinely, but it unblocked us during a critical maintenance window.

    ALTER TABLE my_table SET (autovacuum_freeze_max_age = 300000000);

Don't forget to set it back when you're done! Otherwise you will incur an even longer autovacuum freeze, probably when you least expect it.

Medium-length version

My colleague Kacey Holston was in the midst of upgrading a client from PostgreSQL 9.4 to 9.6, using Slony for minimal downtime. As planned, the client took a few minutes of downtime so Kacey could finish up: she was ready to reverse the direction of replication (so the 9.6 server was replicating to the 9.4 server, in case our client needed to fall back to it). But there was an autovacuum freeze (a.k.a. "autovacuum (to prevent wraparound)") that was keeping Slony from getting the brief ExclusiveLock it needed.

She knew from experience that this table takes three hours to freeze. But the client had only minutes of downtime scheduled – that was the whole point of using Slony!

If only it were possible to change autovacuum_freeze_max_age on the fly; then we could bump it up to stop that autovacuum. Unfortunately, you have to restart the database in order to change it. Except…

You can set it on a per-table basis, as follows. This took effect immediately:

    ALTER TABLE my_table SET (autovacuum_freeze_max_age = 300000000);

If you do this, don't forget to set it back to the normal value (by default, 200000000) once you're done! Otherwise autovacuum freezes on this table will come around less often and take even longer.

Background for beginners:

When the oldest transaction ID on any row in a table is more than autovacuum_freeze_max_age old (200 million transactions old, by default), then an "autovacuum (to prevent wraparound)" process runs on the table to reclaim old transaction IDs. For large tables, this can be a problem, because it can generate a lot of CPU and I/O activity during busy hours. Also, as we saw here, it locks the table (in SHARE UPDATE EXCLUSIVE mode); this blocks DDL changes (a.k.a. migrations).
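
A quick way to see which tables are approaching that threshold:

    SELECT relname, age(relfrozenxid) AS xid_age
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY age(relfrozenxid) DESC
    LIMIT 10;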

For more-technical background, see the official PostgreSQL docs on how transaction IDs work, and for a friendlier intro, see this series of blog posts.


Posted by Quinn Weaver
Jan. 29, 2016, 2:46 p.m.

Language thoughts

All the languages I've used heavily have one primitive data structure you fall into using as a golden hammer:

  • C: arrays
  • Lisp: lists
  • Perl: hashes
  • JS: "objects"
Et cetera. Python is borderline; it's tempting to abuse dicts, but there's a bit less friction to using proper objects than with Perl.


Posted by Quinn Weaver
Jan. 16, 2016, 11:06 p.m.

Blocked by rdsadmin

One of our clients on RDS had a VACUUM FREEZE that was hanging for a long time. "I bet it's waiting on a lock," I thought. Yup, but the curious part is what it was waiting on: a mysterious process run by the rdsadmin role (see RECORD 2, below):


SELECT pg_stat_activity.*, pg_locks.mode
FROM pg_stat_activity
JOIN pg_locks USING (pid)
JOIN pg_class ON pg_locks.relation = pg_class.oid
WHERE pg_class.relname = 'users'
AND pg_locks.mode IN ('ShareUpdateExclusiveLock', 'ShareLock', 'ShareRowExclusiveLock', 'ExclusiveLock', 'AccessExclusiveLock');

-[ RECORD 1 ]----+------------------------------
datid            | 1234
datname          | my_database
pid              | 14641
usesysid         | 16396
usename          | postgres
application_name | psql
client_addr      | 198.162.0.0
client_hostname  |
client_port      | 5430
backend_start    | 2016-01-15 22:05:06.161987+00
xact_start       | 2016-01-15 22:14:39.301425+00
query_start      | 2016-01-15 22:14:39.301425+00
state_change     | 2016-01-15 22:14:39.301429+00
waiting          | t
state            | active
query            | VACUUM FREEZE verbose users;
mode             | ShareUpdateExclusiveLock
-[ RECORD 2 ]----+------------------------------
datid            | 1234
datname          | my_database
pid              | 22328
usesysid         | 10
usename          | rdsadmin
application_name |
client_addr      |
client_hostname  |
client_port      |
backend_start    |
xact_start       |
query_start      |
state_change     |
waiting          |
state            |
query            |
mode             | ShareUpdateExclusiveLock

Further examination showed that this process was locking only the users table (and its indexes):

SELECT locktype, relation::regclass AS tablename
FROM pg_locks
JOIN pg_stat_activity USING (pid)
WHERE pid = 22328;

locktype | tablename
------------+---------------------------------------------
relation | user_index_a
relation | user_index_b
relation | user_index_c
relation | users_pkey
virtualxid |
relation | users
(13 rows)

Has anyone else seen such a process? I'm curious as to what it is. My current best guess is a vacuum. We opened a ticket with Amazon to ask them, so I'll update this post when Amazon replies.

EDIT 2016-01-17: the above is a braino for "an autovacuum." An Amazon rep wrote back the next day to say that it was, indeed, an autovacuum process, and included some vacuum-tuning tips. So good on them, although it's unfortunate that RDS's privilege system doesn't allow you to see the pg_stat_activity.query field when the rdsadmin role is working on a table you own.


Posted by Quinn Weaver
Oct. 6, 2015, 8:19 p.m.

Reflections on (Humans) Trusting (Humans') Trust

One thing I've learned from consulting is that you should trust a competent person's feels about a technology, even if they can't immediately come up with arguments to support them. Seemingly vague statements like "Oh, I hate the Foo webserver, it's so flaky" or "Bar is a really annoying language to work in" or "the Baz API just doesn't feel very user-friendly" are actually the distilled product of many hard-won experiences that the speaker may not be able to call to mind individually.

I observed this last night at Larry Wall's Perl 6 talk, specifically the Q&A: most of the time he came up with great examples, but sometimes he just had to talk about his own language in broad, esthetic terms. And I'm no Larry Wall, but today I found myself in a similar position when advising a client about which of two virtualization platforms to use.

Of course, this applies to people speaking from experience and not from prejudice; you have to know how good someone's judgment is in order to transitively-trust them to have good opinions. That's one soft skill that gives you a major edge in technology. But, contrary to stereotype, most technical people are good at making these kinds of meta-judgments.


Posted by Quinn Weaver
June 9, 2015, 1:08 p.m.

Baidu Bot Blues

A client complained of mysterious connection spikes. "We think we're getting hit by bots, but we're not sure," they said; "Can you look into it?"

So I edited postgresql.conf, turned on log_connections and log_disconnections, did a 'pg_ctl -D MY_DATA_DIR reload', and waited (in my case I also set log_min_duration_statement = 0, but I don't recommend that unless you can watch to make sure the logs aren't filling the log partition — and make sure you're not going to hammer a slow log device).

A while later, I did

ls -t data/pg_log/*.csv | head -5 | xargs grep -c connection

(That says "List log files in order from most recent to least recent, take the five most recent, and count the lines in each file containing 'connection'.")

I got lucky. A spike was obviously in progress; the current log file showed a count two orders of magnitude higher than the other log files'. Since the problem was in progress, no post mortem was required; I just mailed the client and asked "Do your web logs show bot traffic right now?"

The client mailed right back saying, "Yes, we see a bunch of connections from Baidu's web crawler; it's going nuts hitting our site."

Baidu is a Chinese search engine. It turns out its crawling bot is well-known for being aggressive. As that link explains, the bot doesn't respect Crawl-delay in robots.txt (that's annoying but unsurprising, since Crawl-delay is a non-standard extension).

So if this happens to you, you have these options:

  1. Register at Baidu's site (Chinese-only) to request less-aggressive crawling.
  2. Try to tune PostgreSQL connection pooling to handle the onslaught (see the sketch after this list).
  3. Block Baidu entirely. This is an extreme option — not advised if you do business in China, or if you have Chinese-speaking customers elsewhere who are reaching you through Baidu. The latter requires analytics to figure out.
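
For option 2, a pgbouncer config gives the idea (a sketch; all names and numbers here are illustrative, not tuned values). The pooler sits between the bots and PostgreSQL, so a traffic spike queues inside pgbouncer instead of exhausting max_connections:

[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20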


Posted by Quinn Weaver
May 8, 2015, 2:37 p.m.

syslog Considered Dangerous

Edit 2015-10-01: there's a similar problem on RDS, where logs are written to the same partition as data, which can cause problems if you don't have enough provisioned IOPS (or disk space).

A client of ours was suffering mysterious, intermittent lock pileups on a large table. To diagnose the problem, I ran our lock-logging scripts.

Querying the resulting log_transaction_locks table, I saw lots of extend locks and RowExclusiveLocks piling up behind normal UPDATEs (and INSERTs) — there were even extend locks waiting on other extend locks. This is pretty unusual; you'd expect that when one UPDATE completed, the extend lock would succeed, PostgreSQL would quickly extend the relation (in this case, the big table's TOAST relation), and queries would keep running smoothly. These were all single-row writes; they should not have taken that long. However, a complicating factor was that a couple of columns in that big table were giant TEXT columns storing JSON strings. Was that the issue? The problem was unclear, and we needed more info.

At this point my colleague Christophe Pettus made the excellent suggestion to do 'strace -p' on one of the locked processes. The dump revealed lots of sendto() calls with giant strings, which he immediately recognized as syslog messages based on their formatting.

Now, this was a problem. syslog writes to /var, and /var was hanging off the root partition, which was an EBS volume (without provisioned IOPS). So basically the entire SQL of each query was being written across the network to that volume, including those giant JSON strings, and the query itself wouldn't terminate till the write was fsync()'ed. In the meantime, extend locks and other UPDATEs piled up behind it. The PostgreSQL data directory was on a big, fast software RAID across SSD instance storage, but performance was being bottlenecked by that single EBS volume.

The client elected to turn off log_lock_waits, and the problem went away. In addition, Christophe suggested prepending a minus sign to the filename in /etc/rsyslog.d/postgres.conf, like so, which tells syslog not to flush between each log statement:

local0.* -/var/log/postgres.log

I thought that was neat: a potential fix to a huge performance problem, in one character(!)

Lessons learned:

1. If your log_destination contains syslog, be careful about filesystem layout. (If you stick to csvlog, this normally isn't a problem, because csvlogs, by default, go in pg_log/ under the PostgreSQL data dir; a minimal csvlog setup is sketched after this list.)

2. Logging info about a performance problem can sometimes make it worse. In this case, we decided to diagnose the pre-existing lock pileups by setting log_lock_waits = on and deadlock_timeout = 200. But, because of the syslog issue, those very settings exacerbated the lock problems. This was necessary to catch the problem, but it's something to keep in mind if you decide to use our lock-logging scripts.
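
For lesson 1, here's a minimal csvlog setup in postgresql.conf (a sketch; adjust rotation and naming to taste):

log_destination = 'csvlog'
logging_collector = on
log_directory = 'pg_log'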


Posted by Quinn Weaver
Sept. 17, 2014, 1:18 p.m.

RDS for Postgres: List of Supported Extensions

Today I learned that Amazon doesn't publish a list of the extensions supported in RDS PostgreSQL. Instead, their documentation tells you to start a psql session and run 'SHOW rds.extensions'. But that creates a chicken-and-egg situation if you have an app that needs extensions, and you're trying to decide whether to migrate.

Update 2018-10-18: here's the list for RDS PostgreSQL 10.4 (in ASCIIbetical order). I'm pleased to see they added hll!

address_standardizer
address_standardizer_data_us
amcheck
bloom
btree_gin
btree_gist
chkpass
citext
cube
dblink
dict_int
dict_xsyn
earthdistance
fuzzystrmatch
hll
hstore
hstore_plperl
intagg
intarray
ip4r
isn
log_fdw
ltree
orafce
pageinspect
pg_buffercache
pg_freespacemap
pg_hint_plan
pg_prewarm
pg_repack
pg_similarity
pg_stat_statements
pg_trgm
pg_visibility
pgaudit
pgcrypto
pglogical
pgrouting
pgrowlocks
pgstattuple
plcoffee
plls
plperl
plpgsql
pltcl
plv8
postgis
postgis_tiger_geocoder
postgis_topology
postgres_fdw
prefix
sslinfo
tablefunc
test_parser
tsearch2
tsm_system_rows
tsm_system_time
unaccent
uuid-ossp

So here's a list of extensions supported as of today, 2014-09-17 (RDS PostgreSQL 9.3.3). I'll try to keep this current.

btree_gin
btree_gist
chkpass
citext
cube
dblink
dict_int
dict_xsyn
earthdistance
fuzzystrmatch
hstore
intagg
intarray
isn
ltree
pgcrypto
pgrowlocks
pg_trgm
plperl
plpgsql
pltcl
postgis
postgis_tiger_geocoder
postgis_topology
sslinfo
tablefunc
tsearch2
unaccent
uuid-ossp


Posted by Quinn Weaver
April 2, 2014, 4:02 p.m.

Putting stats_temp_directory on a ramdisk

This hack is an old chestnut among PostgreSQL performance tuners, but it doesn't seem to be widely known elsewhere. That's a shame, because it's pure win, and it's ridiculously easy to set up. You don't even need to restart PostgreSQL.

Here's the situation: PostgreSQL writes certain temporary statistics. These go in the dir given by the stats_temp_directory setting. By default, that's pg_stat_tmp in the data dir. Temp files get written a lot, but there's no need for them to persist.

That makes them perfect candidates for a ramdisk (a.k.a. RAM drive). A ramdisk is a chunk of memory treated as a block device by the OS. Because it's RAM, it's super-fast. As far as the app is concerned, the ramdisk just holds a filesystem that it can read and write like any other. Moreover, PostgreSQL generally only needs a few hundred kilobytes for stats_temp_directory; any modern server can fit that in RAM.

In Linux, you set up a ramdisk like this:

As root:

'mkdir /var/lib/pgsql_stats_tmp' [1]

'chmod 777 /var/lib/pgsql_stats_tmp'

'chmod +t /var/lib/pgsql_stats_tmp'

Add this line to /etc/fstab. That 2G is an upper limit; the system will use only as much as it needs.

tmpfs /var/lib/pgsql_stats_tmp tmpfs size=2G,uid=postgres,gid=postgres 0 0

'mount /var/lib/pgsql_stats_tmp'

Then, as postgres:

Change the stats_temp_directory setting in postgresql.conf:

stats_temp_directory = '/var/lib/pgsql_stats_tmp'

Tell PostgreSQL to re-read its configuration:

'pg_ctl -D YOUR_DATA_DIR reload'
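
To confirm the change took effect:

'psql -c "show stats_temp_directory;"'

'df -h /var/lib/pgsql_stats_tmp'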

And that's it!

Other operating systems have different ways to set up ramdisks. Perhaps I'll cover them in a later post.

[1] The directory /var/lib/pgsql_stats_tmp is an arbitrary choice, but it works well for Debian's filesystem layout.


Posted by Quinn Weaver
Aug. 22, 2013, 3:57 p.m.

mysql_auto_reconnect

I prefer PostgreSQL, which is the core of our business at pgExperts. But I still have some legacy clients on MySQL. For one of them, I'm writing new features in Perl/MySQL. The mysql_auto_reconnect option just saved my day. I'd been getting errors like "DBD::mysql::st execute failed: MySQL server has gone away." To use it in DBI, do


use DBI;

my $dbh = DBI->connect(
"DBI:mysql:database=${MY_DATABASE};host=${MY_HOST};port=${MY_PORT}",
$MY_USER,
$MY_PASSWORD,
{ mysql_auto_reconnect => 1 },
);

To use it in Rose::DB::Object, do

package MyApp::DB;
use Rose::DB;
our @ISA = qw( Rose::DB );

__PACKAGE__->default_connect_options({
…,
mysql_auto_reconnect => 1,
});


Posted by Quinn Weaver
Feb. 18, 2013, 4:30 p.m.

Python Type Gripe

I've been digging into Python lately. On Friday, I ran into an unintuitive behavior where two sets containing identical objects (of a custom class) were comparing unequal. I bet you can guess the reason: I hadn't defined __hash__ and __eq__ for my class. Take a look at this documentation for __repr__:

For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval(), otherwise the representation is a string enclosed in angle brackets that contains the name of the type of the object together with additional information often including the name and address of the object. A class can control what this function returns for its instances by defining a __repr__() method.
"For many?" "makes an attempt?!" "often including?!!" To my way of thinking, this is a little crazy. There should be a single root class in the inheritance hierarchy, and that root should define __repr__, __hash__, and __eq__ methods that operate on/check for equality of each attribute in the class (by value, not by memory address!) Then they would behave consistently for all classes. Then two sets, each containing three objects of the same class, each containing the same attributes with the same values, would compare equal when you used == on them, following the principle of least surprise.

Of course, I can't rewrite Python to make this happen. I'm tempted just to make my own root class that implements these behaviors via metaprogramming, and make every class I ever define a subclass of it. Is there a better way?
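
Here's one sketch of that root-class idea (assumes attribute values are themselves hashable):

# A root class giving value-based equality, hashing, and repr,
# derived from instance attributes rather than memory addresses.
class ValueObject:
    def __eq__(self, other):
        return type(other) is type(self) and other.__dict__ == self.__dict__

    def __hash__(self):
        return hash(tuple(sorted(self.__dict__.items())))

    def __repr__(self):
        attrs = ', '.join('%s=%r' % kv for kv in sorted(self.__dict__.items()))
        return '%s(%s)' % (type(self).__name__, attrs)

class Point(ValueObject):
    def __init__(self, x, y):
        self.x, self.y = x, y

assert Point(1, 2) == Point(1, 2)
assert {Point(1, 2)} == {Point(1, 2)}  # sets now compare equal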

PS: A colleague pointed out that this behavior reflects a deliberate decision in Python design philosophy: "Python […] favors informal protocols over inheritance trees […] You can think of __eq__ and __hash__ as kind of being parts of an informal 'collectable' protocol." That sort of makes sense. Of course, the thing about informal protocols is that there's nothing to enforce consistency.


Posted by Quinn Weaver
Feb. 1, 2013, 3:12 p.m.

Why PostgreSQL Doesn't Allow Unsafe Implicit Type Coercions

I got bitten by a MySQL bug today, and it reminded me why we don't allow¹ unsafe type coercions in PostgreSQL. It's easy to talk about them in the abstract, but harder to explain why they're a problem in practice. Here's a really clear example.

I do most of my consulting through pgExperts these days, but I still have a legacy client on MySQL. Today I was writing a new feature for them. Using test-driven development, I was debugging a particular invocation of a five-way join. I found it was returning many rows from a certain right-hand table, when it should have been returning none.

I took a closer look at my SQL and found this code²:



WHERE order.status_id NOT IN ('completed', 'cancelled')

But that's not right! status_id isn't a varchar. It's an int FK to a lookup table:


CREATE TABLE statuses (
id int(10) unsigned NOT NULL AUTO_INCREMENT,
name varchar(16) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY name (name)
) ENGINE=InnoDB;

MySQL was silently coercing my integer order.status_id to a varchar, comparing it to 'completed' and 'cancelled' (which of course it could never match), and returning a bunch of right-hand table rows as a result. The filter was broken.

PostgreSQL would never have allowed this. It would have complained about the type mismatch between status_id and 'completed' or 'cancelled'. Rather than a confusing resultset, I'd get a clear error message at SQL compile time:


ERROR: invalid input syntax for integer: "cancelled"
LINE 8: select * from frobs where order.status_id NOT IN ('cancelled', 'c...
^

Fortunately, I caught and fixed this quickly (hurrah for test-driven development!). The fix was simple:



JOIN statuses stat on stat.id = order.status_id
WHERE stat.name NOT IN ('completed', 'cancelled')

But under different circumstances this could have seriously burned me.

And that's why we disallow certain implicit type coercions. This post is not intended as a rant against MySQL. Rather, it's an illustration of how a seemingly abstract issue can seriously hose your code. Caveat hacker!


1. Since PostgreSQL 8.3 (2008), that is. Before that, we did allow unsafe coercions.

2. I've changed the code and identifiers to protect client confidentiality.


Posted by Quinn Weaver