Update 2008-05-01: Fixed a typo for sysctl: -a parameter simply shows all settings; -w parameter is needed to write the setting. Duh.
Once you have decided on and acquired your test hardware for Evergreen, you need to think about tuning your PostgreSQL database server. Once you start loading bibliographic records, you might notice that after 100,000 records or so that your search response times aren't too snappy. Don't snarl at Evergreen. By default, PostgreSQL ships with very conservative settings (something like machines with 256 MB of RAM!) so if you don't tune those settings you're getting a false representation of your system's capabilities.
The "right" settings for PostgreSQL depend significantly on your hardware and deployment context, but in almost any circumstance you will want to bump up the settings from the delivered defaults. To give you an idea of what you need to consider, I thought I would share the settings that we're currently using on our Evergreen test server at Laurentian University. You might be able to use these as a starting point and adjust them accordingly once you've run some representative load tests against your configuration. And it's useful documentation for me to fall back on in a few months, when all of this has escaped my grasp
The defaults (as shipped in Debian Etch)
The defaults in Debian Etch are quite conservative. Consider that our test server has 12GB of RAM. The default only allocates 1MB of RAM to work memory (which is critical for sorting performance) and only 8MB of RAM to shared buffers. Following are the defaults set in /etc/postgresql/8.1/main/postgresql.conf:
# - Memory -#shared_buffers = 1000 # min 16 or max_connections*2, 8KB each#temp_buffers = 1000 # min 100, 8KB each#max_prepared_transactions = 5 # can be 0 or more# note: increasing max_prepared_transactions costs ~600 bytes of shared memory# per transaction slot, plus lock space (see max_locks_per_transaction).#work_mem = 1024 # min 64, size in KB#maintenance_work_mem = 16384 # min 1024, size in KB#max_stack_depth = 2048 # min 100, size in KB# - Free Space Map -#max_fsm_pages = 20000 # min max_fsm_relations*16, 6 bytes each#max_fsm_relations = 1000 # min 100, ~70 bytes each
Our test server settings
Our test server has 12 GB of RAM. Assuming that the PostgreSQL defaults were set for a system with 1 GB of RAM, we should be able to multiply the memory-based settings by at least a factor of 12. We're a little bit more aggressive than that in our settings. Note, however, that this is a single-server install of Evergreen, so we're also running memcached, ejabberd, Apache, and all of the Evergreen services as well as the database - oh, and a test instance of an institutional repository, among other apps - so we're not nearly as aggressive as we would be in a dedicated PostgreSQL server configuration. Please note that I'm making no claims that this is the optimal set of configuration values for PostgreSQL even on our own hardware!
# shared_buffers: much of our performance depends on sorting, so we'll set it 100X the default# some tuning guides suggest cranking this up to as much 30% of your available RAMshared_buffers = 100000 # 8K * 100000 = ~ 0.8 GB# work_mem: how much RAM each concurrent process is allowed to claim before swapping to disk# your workload will probably have a large number of concurrent processeswork_mem=524288 # 512 MB# max_fsm_pages: increased because PostgreSQL demanded itmax_fsm_pages = 200000
After you change these settings, you will need to restart PostgreSQL to make the settings take effect.
In addition to PostgreSQL complaining about max_fsm_pages not being high enough, your operating system kernel defaults for SysV shared memory might not be high enough to support the amount of RAM PostgreSQL demands as a result of your modifications. In one of our test configurations, we had cranked up work_mem to 8GB; Debian complained about an insufficient SHMMAX setting, so we were able to adjust that by running the following command as root to set the kernel SHMMAX to 8GB (8*1024^2):
sysctl -w kernel.shmmax=8589934592
To make this setting sticky through reboots, you can simply modify /etc/sysctl.conf to include the following line:
# Set SHMMAX to 8GB for PostgreSQL#kernel.shmmax=8589934592
Debian Etch comes with PostgreSQL 8.1. The first version of PostgreSQL 8.1 was released in November 2005. That's a long time in computer years. Version 8.2, which was released less than a year later, "adds many functionality and performance improvements" (according to the release notes). If you're not getting the performance you expect from your hardware with Debian Etch, perhaps a backport of PostgreSQL 8.2 would help out.
This is just a shallow dip into PostgreSQL tuning for Evergreen - hopefully enough to alert you to some of the factors you need to consider if you're putting Evergreen into a serious testing environment or production environment. Here are a few places to dig deeper into the art of PostgreSQL tuning:
- PostgreSQL manual, resource consumption section of server configuration: version 8.1 and version 8.2
- An annotated version of the 8.0 parameters with more explicit advice is available at ` <http://www.powerpostgresql.com/Downloads/annotated_conf_80.html>`__
- Some good advice is buried about halfway down Christopher Browne's page under the heading "Tuning PostgreSQL", along with links to further resources
- The "Performance Whack-A-Mole" presentation at PowerPostgreSQL is a great tutorial for holistic system tuning