Test server strategies
Posted on Thu 10 April 2008 in misc
Occasionally on the #OpenILS-Evergreen IRC channel, a question comes up about what kind of hardware a site should buy when it gets serious about trying out Evergreen. I had exactly the same chat with Mike Rylander back in December, so I thought it might be useful to share the strategy we developed, in case other organizations are interested in piggy-backing on our research. We came up with three different scenarios, depending on the funding available to the organization and how serious the organization is about testing, developing, and deploying Evergreen.
You can also look at the scenarios as stages, as the scenarios enable
progressively more realistic testing. An organization can always
start with a single server and add more servers over time; if you can
swing a significant discount for buying in bulk, however, it might
make sense to bite the bullet early.
Some pertinent facts about our requirements: we will eventually be loading around 5 million bibliographic records onto the system. We're an academic organization, so concurrent searching and circulation loads will be low relative to public libraries.
Scenario 1: A single bargain-basement testing server
In this scenario, the organization purchases a single server for the short
term, and configures it to run the entire Evergreen + OpenSRF stack:
- database
- Web server
- Jabber messaging
- memcached
- OpenSRF applications
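To make the single-server layout concrete, here is a rough sketch of bringing the whole stack up on one Debian-style box. The init-script paths, service names, and the OpenSRF control command are illustrative and vary by distribution and OpenSRF version, so treat this as a shape rather than a recipe:

```shell
# Start every layer of the Evergreen + OpenSRF stack on a single server.
# Paths and service names are illustrative; adjust for your distro/version.
sudo /etc/init.d/postgresql start   # database
sudo /etc/init.d/ejabberd start     # Jabber messaging
sudo /etc/init.d/apache2 start      # Web server
memcached -d -m 512 -u nobody       # cache daemon with 512MB of RAM
osrf_ctl.sh -l -a start_all         # OpenSRF applications, localhost mode
```

The later scenarios are essentially this same list of services spread across more machines, with the database on its own disk-heavy server.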
This server needs to have powerful CPUs, large amounts of RAM, and many fast (10K RPM or higher) hard drives in a
striped RAID configuration (the latter because database performance
typically gets knee-capped by disk access). A "higher education" quote online from a reputable big-name vendor for a rack-mounted 2U database server with 2x4-core
CPU, 16GB RAM, 6x73GB RAID 5 drives comes in at approximately $7000.
This scenario is fine for development and testing with a limited
number of users, but if you intend to do any sort of stress testing
with this server or throw it open to the public, performance will
likely grind to a halt. Note: This is close to the system that we're currently running at http://biblio-dev.laurentian.ca - 12 GB of RAM, 2 dual-core CPUs - with 800K bibliographic records and pretty snappy search performance. It's certainly nothing to sneeze at.
Scenario 2: one database server, one network server
In this scenario, you purchase a database server and a network server.
We'll use the same specs from scenario 1 for the database server, and
a CPU + RAM-oriented server for the network server (disk access isn't
a factor for the network apps, so you just buy two small mirrored
drives). The stock higher education quote for a rack-mounted 1U
network server with 2x4-core CPU, 16GB RAM, 2x73GB RAID 1 drives is
approximately $5250.
This scenario will support development and testing, as well as enable
you to perform relatively representative stress-testing runs with a
significant number of simultaneous users.
Scenario 3: two database servers, two or three network servers
In this scenario, you purchase two database servers, so that you can test
database replication and split database loads between search and reporting,
and two or three network servers, so that you can test different
distributions of the caching and network apps across the servers and
determine the configuration that best meets your expected demands. The cost of the five servers adds up to less than $30,000 - less than a single traditional proprietary UNIX server - and would be lower still if you can negotiate a bulk discount.
The third scenario supports development and testing, and will give you
practical experience with a configuration that would approximate your
production deployment of servers. When you go live, you could move one of the database servers
and all but one of the network servers over to the production cluster, and revert back to scenario one for your ongoing test and development environment.
The Conifer approach
We opted to go with the third scenario to build a serious test cluster for our consortium. However, the "scenarios as stages" approach ended up being our strategy as our original choice of Dell servers came with RAID controllers that do not work well under Debian. After returning the servers to Dell, we were forced to press one of our backup servers into service as a scenario-one style server while waiting for our new order from HP to arrive.