<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    
    <title>Coffee|Code : Dan Scott - PostgreSQL</title>
    <link>http://coffeecode.net/</link>
    <description>Caffeinated Librarian Geek</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.6.2 - http://www.s9y.org/</generator>
    
    

<item>
    <title>Introducing SQL to Evergreen administrators, round two</title>
    <link>http://coffeecode.net/archives/263-Introducing-SQL-to-Evergreen-administrators,-round-two.html</link>
            <category>Evergreen</category>
            <category>PostgreSQL</category>
    
    <comments>http://coffeecode.net/archives/263-Introducing-SQL-to-Evergreen-administrators,-round-two.html#comments</comments>
    <wfw:comment>http://coffeecode.net/wfwcomment.php?cid=263</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://coffeecode.net/rss.php?version=2.0&amp;type=comments&amp;cid=263</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p&gt;
&lt;a href=&quot;http://coffeecode.net/archives/212-Introduction-to-SQL-for-Evergreen-administrators.html&quot;&gt;Three years ago&lt;/a&gt; I was asked to create and deliver a two-day course introducing
SQL to Evergreen users. Things went well and I was able to share the resulting
materials with the Evergreen and PostgreSQL community. Perhaps one of my
happiest moments at the Evergreen conference last year was when one of the
participants in that course told me that many of his fellow participants were
still successfully writing SQL queries and getting work done.  Huzzah!
&lt;/p&gt;
&lt;p&gt;
Time goes by and another group, &lt;a href=&quot;http://www.ohionet.org&quot;&gt;OHIONET&lt;/a&gt;, was running into difficulties getting
started with PostgreSQL and Evergreen. They asked me if I would be
willing to give the same sort of training I had given a few years back. &quot;Sure&quot;,
I said, thinking it would be a great opportunity to polish the materials and
add some updates to cover new features in PostgreSQL and Evergreen. We also
opted to skip the travel and do an entirely virtual training session via
Google Hangouts, which worked out rather nicely (but that&#039;s a different story).
&lt;/p&gt;
&lt;p&gt;
As it turned out, I probably ended up putting about four days&#039; worth of effort
(crammed into lots of late nights, weekends, and vacation days) into
overhauling the instruction materials. But the results were worth it, in my
opinion; I&#039;m rather proud of the content, and while I believe it stands up on
its own, the guidance that I was able to provide during the live instruction
sessions was well-received by the participants.
&lt;/p&gt;
&lt;p&gt;
Thus, I am pleased to be able to offer to the broader community the latest
version of the Introduction to SQL for Evergreen Administrators, under a
Creative Commons Attribution-ShareAlike 3.0 (Unported) license.
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Reference documentation--30 pages introducing SQL with examples drawn from the Evergreen schema:
(&lt;a href=&quot;http://bzr.coffeecode.net/intro_to_sql/v2/introduction_to_sql.html&quot;&gt;HTML&lt;/a&gt;)
(&lt;a href=&quot;http://bzr.coffeecode.net/intro_to_sql/v2/introduction_to_sql.pdf&quot;&gt;PDF&lt;/a&gt;)
(&lt;a href=&quot;http://bzr.coffeecode.net/intro_to_sql/v2/introduction_to_sql.epub&quot;&gt;ePub&lt;/a&gt;)
(&lt;a href=&quot;http://bzr.coffeecode.net/intro_to_sql/introduction_to_sql.epub&quot;&gt;AsciiDoc&lt;/a&gt;)
&lt;/li&gt;
&lt;li&gt;Presentation:
(&lt;a href=&quot;http://bzr.coffeecode.net/intro_to_sql/SQL_instruction.odp&quot;&gt;LibreOffice Impress&lt;/a&gt;)
(&lt;a href=&quot;http://bzr.coffeecode.net/intro_to_sql/v2/SQL_instruction.pdf&quot;&gt;PDF&lt;/a&gt;)
&lt;/li&gt;
&lt;li&gt;Solutions to exercises:
(&lt;a href=&quot;http://bzr.coffeecode.net/intro_to_sql/solutions_day_1.txt&quot;&gt;Day 1&lt;/a&gt;)
(&lt;a href=&quot;http://bzr.coffeecode.net/intro_to_sql/solutions_day_2.txt&quot;&gt;Day 2&lt;/a&gt;)
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
So, a huge thanks to OHIONET for giving me the impetus to overhaul this
material, and for giving me a chance to introduce them to the wonders of
SQL with PostgreSQL, and to the inner workings of the Evergreen schema. It
was a blast! And thanks for agreeing to let me share these materials with the
broader community.
&lt;/p&gt; 
    </content:encoded>

    <pubDate>Fri, 15 Feb 2013 21:32:08 -0500</pubDate>
    <guid isPermaLink="false">http://coffeecode.net/archives/263-guid.html</guid>
    <category>evergreen</category>
<category>postgresql</category>

</item>
<item>
    <title>Triumph of the tiny brain: Dan vs. Drupal / Panels</title>
    <link>http://coffeecode.net/archives/261-Triumph-of-the-tiny-brain-Dan-vs.-Drupal-Panels.html</link>
            <category>Coding</category>
            <category>Perl</category>
            <category>PostgreSQL</category>
    
    <comments>http://coffeecode.net/archives/261-Triumph-of-the-tiny-brain-Dan-vs.-Drupal-Panels.html#comments</comments>
    <wfw:comment>http://coffeecode.net/wfwcomment.php?cid=261</wfw:comment>

    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://coffeecode.net/rss.php?version=2.0&amp;type=comments&amp;cid=261</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p&gt;A while ago I inherited responsibility for a Drupal 6 instance and a rather out-of-date server. (You know it&#039;s not good when your production operating system is so old that it is no longer getting security updates).&lt;/p&gt;
&lt;p&gt;I&#039;m not a Drupal person. I dabbled with Drupal years and years ago when I was heavily into PHP, but it never stuck with me. Every time I poked around at the database schema, with serialized objects stuck inside columns, I found something else that I wanted to work on instead. Thus, inheriting a Drupal instance wasn&#039;t something I had been looking forward to. As this production server was running a number of different services that were in use by our library, I went through a number of trial runs to ensure that the base packages wouldn&#039;t introduce regressions or outages. Fast-forward past a reasonably successful early-morning upgrade from Debian Lenny to Squeeze and I was able to start looking at addressing the Drupal instance that was also approximately 18 months out of date.&lt;/p&gt;
&lt;p&gt;Initially, after I worked out the how-to of Drupal upgrades (in short: upgrade just Drupal core, then upgrade the modules), I thought all was well. I even got over the hump of realizing that our instance had had all of the modules dumped into Drupal&#039;s core directory, rather than &lt;tt&gt;sites/all/modules&lt;/tt&gt;, and (even more impressively) got over the problem that the core &lt;em&gt;bluemarine&lt;/em&gt; theme had been hacked directly rather than having been separated out into a new custom theme. After working through those learning pains, I realized that somewhere in all of the Drupal and module upgrades, something got &quot;more secure&quot; and started truncating IMG links to files with spaces in them at the first space. So &quot;foo%20bar.jpg&quot; was becoming &quot;foo.jpg&quot; and we were getting 404s everywhere.&lt;/p&gt;
&lt;p&gt;Did I mention that I didn&#039;t notice this until I upgraded our production instance? Oh yes, I went through iteration after iteration of upgrades on the test server, and dutifully fixed up the problems that I found in the subset of content that I was testing against. I discovered and fixed problems like the production server content linking directly to the test server (slight copy-and-paste errors on the part of the content creators, I suppose). But I didn&#039;t notice all of the 404s, because who uploads images with spaces in their filename?&lt;/p&gt;
&lt;p&gt;Turns out, everyone else in my library does that. Of course! And from what I was able to piece together via Google and browsing drupal.org, there was supposed to be some sanitization of the incoming filenames so that spaces would be normalized, etc. But either that wasn&#039;t introduced until well after our content had been created, or my predecessor had lightly hacked one of the modules, or Drupal itself, and hadn&#039;t bothered to use a source code repository to track those customizations. So, realizing that I needed to make some bulk changes, I went at it with a two-step plan:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create symbolic links for both the truncated filename and the spaces-normalized-to-underscores filenames. Creating symlinks for the truncated filenames would fix the 404s immediately, at the cost of some clash in the intended targets; there were plenty of &lt;tt&gt;Foo illustration.JPG&lt;/tt&gt; and &lt;tt&gt;Foo info.JPG&lt;/tt&gt; pairs of files, but like the Highlander, there can be only one &lt;tt&gt;Foo.JPG&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;Munge the database entries so that all of those now apparently insecure %20-containing filenames would become underscores.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If you&#039;re a Drupal user or a Drupal with Panels module user, you might know that the database schema suffers from some fairly horrible tricks being played on it. In this case, the Panels module creates a &lt;tt&gt;panels_pane&lt;/tt&gt; table with a &lt;tt&gt;configuration&lt;/tt&gt; TEXT column. Based on the name alone, it might seem odd that this column is used to store the HTML content of the corresponding panel. Even odder to me is that this is not just a TEXT column, it&#039;s a column that expects a very particular structure - something like:&lt;/p&gt;
&lt;pre&gt;a:5:{s:11:&quot;admin_title&quot;;s:5:&quot;RACER&quot;;s:5:&quot;title&quot;;s:0:&quot;&quot;;s:4:&quot;body&quot;;s:639:&quot;&amp;lt;p&gt;&amp;lt;img width=&quot;225&quot; height [...]}&lt;/pre&gt;
&lt;p&gt;Ah, nothing like storing an object within a single database column. Of particular interest was the result that I had when I tested updating the column value with a basic &quot;replace(configuration, &#039;%20&#039;, &#039;_&#039;)&quot; - the panel showed only &lt;b&gt;n/a&lt;/b&gt;, presumably because the size (defined by the &lt;tt&gt;s&lt;/tt&gt; properties in the object) for the &quot;body&quot; text property was no longer a match. That would be an instance of http://drupal.org/node/926448 - so okay, clearly I had to change tactics and update the entire object.&lt;/p&gt;
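&lt;p&gt;To make the length problem concrete, here is a minimal sketch in Python, separate from the fix I actually applied and not part of any Drupal API. The function name &lt;tt&gt;replace_in_serialized&lt;/tt&gt; and the sample value are made up for illustration. The sketch assumes the data contains only ordinary serialized strings; it walks them using their own length prefixes and writes a corrected prefix after each replacement:&lt;/p&gt;
&lt;pre&gt;
import re

# Start of a PHP-serialized string token: s:LENGTH:"
STRING_HEADER = re.compile(rb's:(\d+):"')

def replace_in_serialized(serialized, old, new):
    """Replace old with new inside each serialized string value and
    rewrite the s:LENGTH prefix to match (PHP counts bytes, not characters)."""
    raw = serialized.encode("utf-8")
    old_bytes, new_bytes = old.encode("utf-8"), new.encode("utf-8")
    out = bytearray()
    pos = 0
    while True:
        header = STRING_HEADER.search(raw, pos)
        if header is None:
            out += raw[pos:]              # trailing structure, such as ";}
            break
        out += raw[pos:header.start()]    # structure before this string
        length = int(header.group(1))
        start = header.end()
        value = raw[start:start + length].replace(old_bytes, new_bytes)
        out += b's:%d:"' % len(value) + value
        pos = start + length              # skip the value, even if it contains quotes
    return out.decode("utf-8")

# Hypothetical sample: a 24-byte body value containing one %20
original = 'a:1:{s:4:"body";s:24:"/pictures/Foo%20info.JPG";}'
print(replace_in_serialized(original, "%20", "_"))
# a:1:{s:4:"body";s:22:"/pictures/Foo_info.JPG";}
&lt;/pre&gt;
&lt;p&gt;A plain string replace on the same input would leave &lt;tt&gt;s:24&lt;/tt&gt; in front of a value that is now 22 bytes long, which is presumably the kind of mismatch that left the panel showing n/a.&lt;/p&gt;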
&lt;p&gt;I tried quickly finding the Drupal way to do this: clearly there&#039;s an API and there must be some simple way to retrieve an object, change its values, and update it so that the serialized object gets stored in the database and Drupal is happy. However, I couldn&#039;t find a simple tutorial, and trying #drupal on Freenode was unfortunately fruitless as well (some people did suggest running REPLACE() at the database level, which was nice of them, but they didn&#039;t recognize that doing so would actually damage things significantly).&lt;/p&gt;
&lt;p&gt;So... out came the Perl, and here&#039;s what I hacked together:&lt;/p&gt;
&lt;pre&gt;
#!/usr/bin/perl
use strict;
use warnings;

foreach (&amp;lt;DATA&gt;) {
    chomp();
    my $i = 0;
    my $body = 0;
    my @fixed;
    my @row = split /\t/;
    my $pid = $row[1];
    my $configuration = $row[0];
    my @chunks = split /&quot;;s:/, $configuration;
    foreach my $chunk (@chunks) {
        if (!$i++) {
            push @fixed, $chunk;
            next;
        }
        if ($chunk =~ m/&quot;body/) {
            $body = 1;
            push @fixed, $chunk;
            next;
        }
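        if ($body) {
            # This chunk holds the body value: split off the old length prefix
            my ($length, $content) = $chunk =~ m/^(\d+):&quot;(.+)$/;
            # Each pass fixes only the first remaining %20 in each /pictures/
            # filename, so repeat for filenames that contain several spaces
            for (my $j = 0; $j &lt; 50; $j++) {
                $content =~ s{(/pictures/[^\./]+?)%20}{$1_}g;
            }
            # Any %20 left outside of those filenames becomes a plus sign
            $content =~ s{%20}{+}g;
            # Recompute the serialized string length to match the new content
            $length = length($content);
            $chunk = &quot;$length:\&quot;$content&quot;;
            $body = 0;
        }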
        push @fixed, $chunk;
    }
    print &#039;UPDATE panels_pane SET configuration = $ashes$&#039; . 
        join(&#039;&quot;;s:&#039;, @fixed) . &#039;$ashes$&#039; . &quot; WHERE pid = $pid;\n&quot;;
}
__DATA__
&lt;/pre&gt;
&lt;p&gt;Against the trusty database (I &amp;#9829; PostgreSQL!), I ran &lt;tt&gt;COPY (SELECT configuration, pid FROM panels_pane WHERE configuration ~ &#039;%20&#039;) TO &#039;conf_pids.out&#039;;&lt;/tt&gt;, then slapped the Perl code on top and generated a load of UPDATE statements. It&#039;s far from my best Perl code, but it worked, and once I gave up on doing things the Drupal way I was able to put it together in a handful of minutes. I now have a functional Drupal 6 instance again, updated such that there are no known security vulnerabilities with either core or the modules we&#039;re using, and there are no broken image links.&lt;/p&gt;
&lt;p&gt;And now I need to begin working towards either grokking Drupal, or finding a content management system that my tiny brain can comprehend, because I don&#039;t want to have to go through these kinds of contortions again with future upgrades... Suggestions welcome!&lt;/p&gt; 
    </content:encoded>

    <pubDate>Thu, 18 Oct 2012 17:48:31 -0400</pubDate>
    <guid isPermaLink="false">http://coffeecode.net/archives/261-guid.html</guid>
    <category>coding</category>
<category>perl</category>
<category>postgresql</category>

</item>
<item>
    <title>Seek and ye shall find: full-text search in PostgreSQL</title>
    <link>http://coffeecode.net/archives/260-Seek-and-ye-shall-find-full-text-search-in-PostgreSQL.html</link>
            <category>PostgreSQL</category>
    
    <comments>http://coffeecode.net/archives/260-Seek-and-ye-shall-find-full-text-search-in-PostgreSQL.html#comments</comments>
    <wfw:comment>http://coffeecode.net/wfwcomment.php?cid=260</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://coffeecode.net/rss.php?version=2.0&amp;type=comments&amp;cid=260</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p&gt;I&#039;m at &lt;a href=&quot;http://postgresopen.org/2012&quot;&gt;PostgresOpen&lt;/a&gt; in Chicago, and just gave my talk on &lt;a href=&quot;http://stuff.coffeecode.net/2012/pgopen_fulltext/pgsql-fulltext-intro.html&quot;&gt;Implementing full-text search in PostgreSQL&lt;/a&gt;. The goal was to give novice users the understanding and examples they needed to build a workable search solution using PostgreSQL&#039;s full-text search. And it went (in my opinion) well - an almost full room, lots of audience interaction (thanks Bruce Momjian, Jonathan Scott, Jonathan Katz, et al.), a lot of nodding heads, and nobody running out of the room screaming. So... yay!&lt;/p&gt;
&lt;p&gt;A few takeaways from prepping for the presentation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I suspect that some effort on making the full-text search parser extensible would go a long way towards resolving problems that you currently have to work around by manipulating the text before it hits the parser. For example, if you pass in a string like &lt;tt&gt;file/path&lt;/tt&gt;, PostgreSQL classifies that as a &lt;tt&gt;file&lt;/tt&gt; token and stores it as-is - but you might want to be able to search against either &quot;file&quot; or &quot;path&quot; as well as the concatenated form. Right now you have to preparse that string to break it up yourself (via regexp_replace() or the like), but it would be much nicer if you could teach the parser new tricks (without having to modify the source and recompile it).&lt;/li&gt;
&lt;li&gt;&lt;tt&gt;ts_headline()&lt;/tt&gt; might be a bottleneck for large documents - and (a) solution might be to just bust the document up. *Note to self*: dig into the underlying code to see if there&#039;s any chance of using indexes to enable improvement.&lt;/li&gt;
&lt;li&gt;Ran into a bug with &lt;tt&gt;ts_rewrite()&lt;/tt&gt; while building the tutorial, and have not yet worked out whether that was due to my local configuration or an actual bug... &lt;b&gt;TO DO!&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
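&lt;p&gt;As a rough illustration of the preparsing workaround from the first point, here is a small Python sketch; inside the database you would do the equivalent with regexp_replace() before the text reaches the parser. The function name &lt;tt&gt;preparse&lt;/tt&gt; and the choice to keep the original token alongside its parts are my assumptions for the example, not anything PostgreSQL provides:&lt;/p&gt;
&lt;pre&gt;
import re

# A run of word characters joined by one or more slashes, e.g. file/path
SLASH_TOKEN = re.compile(r"\w+(?:/\w+)+")

def preparse(text):
    """Keep each slash-joined token and append its parts as separate words,
    so a search can match "file", "path", or "file/path"."""
    def expand(match):
        token = match.group(0)
        return token + " " + " ".join(token.split("/"))
    return SLASH_TOKEN.sub(expand, text)

print(preparse("open file/path now"))  # open file/path file path now
&lt;/pre&gt;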
&lt;p&gt;Also - PostgresOpen has had a great vibe so far; a relatively small but very high-quality conference with lots of knowledgeable, friendly participants. Selena (one of the organizers) had a goal of creating an environment similar to PgCon, and I would say from my limited experience attending one PgCon and one PostgresOpen that she and the rest of the conference team have done a great job!&lt;/p&gt; 
    </content:encoded>

    <pubDate>Tue, 18 Sep 2012 18:19:03 -0400</pubDate>
    <guid isPermaLink="false">http://coffeecode.net/archives/260-guid.html</guid>
    <category>postgresql</category>

</item>
<item>
    <title>Running libraries on PostgreSQL: PGCon 2012 talk</title>
    <link>http://coffeecode.net/archives/255-Running-libraries-on-PostgreSQL-PGCon-2012-talk.html</link>
            <category>Evergreen</category>
            <category>PostgreSQL</category>
    
    <comments>http://coffeecode.net/archives/255-Running-libraries-on-PostgreSQL-PGCon-2012-talk.html#comments</comments>
    <wfw:comment>http://coffeecode.net/wfwcomment.php?cid=255</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://coffeecode.net/rss.php?version=2.0&amp;type=comments&amp;cid=255</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;div id=&quot;content&quot;&gt;
&lt;div id=&quot;preamble&quot;&gt;
&lt;div class=&quot;sectionbody&quot;&gt;
&lt;div class=&quot;paragraph&quot;&gt;&lt;p&gt;On Friday, May 18th I gave &lt;a
href=&quot;http://www.pgcon.org/2012/schedule/events/465.en.html&quot;&gt;a talk at the
PGCon 2012 conference&lt;/a&gt; on the use of PostgreSQL by the Evergreen project. My
talk fell in the &lt;em&gt;case study&lt;/em&gt; track, which meant that I had been asked to
describe to PostgreSQL developers what Evergreen was, why it was a project they
might want to care about, enumerate the advantages that Evergreen gets from
using PostgreSQL, and where our project has some difficulties with PostgreSQL.&lt;/p&gt;&lt;/div&gt;
&lt;div class=&quot;paragraph&quot;&gt;&lt;p&gt;I have given a lot of talks before, but I&amp;#8217;m used to being on the developer
side of the discussion. In this case, the tables were turned; with noted
PostgreSQL contributors like Josh Berkus, Chris Brown, Simon Riggs, and Robert
Treat in the audience, I was a user talking to the developers of something that
I was very much dependent on and which I understood at a much more basic level
than they did. This was both liberating &lt;em&gt;and&lt;/em&gt; humbling; it definitely adds some
perspective to my experiences as a developer in the Evergreen project.&lt;/p&gt;&lt;/div&gt;
&lt;div class=&quot;paragraph&quot;&gt;&lt;p&gt;Along with my slides, the whole talk has been professionally recorded - both
video and audio - thanks to Heroku&amp;#8217;s sponsorship, so you will be able to relive
each and every word if you really want to. I&amp;#8217;ll summarize the main points that
I wanted to convey to the PostgreSQL developers:&lt;/p&gt;&lt;/div&gt;
&lt;div class=&quot;ulist&quot;&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;
I was quite candid that most libraries can&amp;#8217;t afford dedicated database
  administrators, and that therefore the more that PostgreSQL can provide
  reasonable out-of-the-box configuration settings, the better. For example,
  results from &lt;a href=&quot;http://evergreen-ils.org/~denials/postgresql_survey.html&quot;&gt;the survey that I sent out at the last minute&lt;/a&gt; (THANK YOU to the nine
  sites that responded!) showed many sites running with a default statistics
  target of 50, whereas the default was raised from 10 to 100 in
  PostgreSQL 8.4 and much higher settings are often recommended to help the
  planner make its decisions. That said, my survey didn&amp;#8217;t ask for column-level
  statistics settings (did you &lt;strong&gt;know&lt;/strong&gt; that you could change the statistics target for
  particular columns of a table?), so perhaps some sites are using higher statistics levels
  for particular columns and a lower default threshold.
&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;
It was probably hokey, but I noted that as libraries are often called the
  heart of their community, that PostgreSQL was effectively the heart of
  Evergreen&amp;#8201;&amp;#8212;&amp;#8201;and I invited the PostgreSQL community to help our heart beat
  faster. With the Evergreen Oversight Board contemplating a strategic
  investment fund for initiatives that will have a long-term benefit to
  Evergreen, this might be an avenue for getting PostgreSQL experts to assist
  us on areas that represent particular bottlenecks (beyond helping us out
  of the goodness of their own hearts). As well, I invited the PostgreSQL
  community to join in advocacy efforts to get their local libraries to
  consider adopting Evergreen.
&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;
I described, at a high level, many of the PostgreSQL features that Evergreen
  relies on (full-text search, stored procedures, hstore, inheritance) and
  tried to convey why our schema takes up 355 tables (and counting) to deal
  with what, from outside a library perspective, must seem like a relatively
  simple problem to deal with. And of course I gave most of the credit for
  Evergreen&amp;#8217;s PostgreSQL-savviness on multiple levels to Mike Rylander.
&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class=&quot;paragraph&quot;&gt;&lt;p&gt;The talk was well-received, based on a number of people who approached me
afterward to continue the discussion. Josh called it one of the first times he
had seen a presentation designed to solicit assistance directly from the
developers in attendance (I probably overplayed the &quot;help us poor harried
library system administrators&quot; hand) and thought that it hit the mark for a
case study; similarly, Simon was quite interested in Evergreen&amp;#8217;s adoption
patterns with (I suspect) an eye towards offering possible consulting in
administration and optimization efforts.&lt;/p&gt;&lt;/div&gt;
&lt;div class=&quot;paragraph&quot;&gt;&lt;p&gt;On the &quot;immediate takeaways&quot; from that talk:&lt;/p&gt;&lt;/div&gt;
&lt;div class=&quot;ulist&quot;&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;
For straightforward connection pooling, pgbouncer is the current
  recommendation over the more flexible but more complicated pgpool-II.
&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;
Recent versions of Slony have lifted limitations that bit us in the past, like the inability to
  replicate a TRUNCATE command.
&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;
Solr, as a potential alternative to PostgreSQL&amp;#8217;s full-text search, is seen as
  fast but brittle to manage, and adds in overhead to maintain consistency with
  the contents of the database. (I&amp;#8217;m not so sure about the brittleness, given
  Hathitrust&amp;#8217;s ability to run a massive Solr index, but it is worth following
  up on&amp;#8230;)
&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;
Streaming replication in 9.1 has improved significantly over 9.0, although
  you&amp;#8217;ll still want to have WAL archiving in case of disaster.
&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class=&quot;paragraph&quot;&gt;&lt;p&gt;I have a lot more to say about the intersection of the PostgreSQL and Evergreen
communities in general, but on the whole I think that a closer relationship has
been long overdue. I was delighted that Ben Shum and Robin Isard were both able
to attend the conference, and I firmly believe that building more PostgreSQL
development and administration expertise within the Evergreen community is
critical to our long-term success. While I have long been an advocate of
pointing community members to the documentation of the underlying
infrastructure components for specific administration recommendations, I
believe that effective PostgreSQL tuning and administration is so critical to
the successful implementation of a production Evergreen site that we should
add a section to the Evergreen documentation containing a small set of
considerations and/or processes for going into production&amp;#8212;and I hope to start
that relatively soon.&lt;/p&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt; 
    </content:encoded>

    <pubDate>Sun, 20 May 2012 13:57:45 -0400</pubDate>
    <guid isPermaLink="false">http://coffeecode.net/archives/255-guid.html</guid>
    <category>evergreen</category>
<category>postgresql</category>

</item>
<item>
    <title>Tuning PostgreSQL for Evergreen on a test server</title>
    <link>http://coffeecode.net/archives/156-Tuning-PostgreSQL-for-Evergreen-on-a-test-server.html</link>
            <category>Evergreen</category>
            <category>PostgreSQL</category>
    
    <comments>http://coffeecode.net/archives/156-Tuning-PostgreSQL-for-Evergreen-on-a-test-server.html#comments</comments>
    <wfw:comment>http://coffeecode.net/wfwcomment.php?cid=156</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://coffeecode.net/rss.php?version=2.0&amp;type=comments&amp;cid=156</wfw:commentRss>
    

    <author>dan@coffeecode.net (Dan Scott)</author>
    <content:encoded>
    &lt;p&gt;&lt;strong&gt;Update 2008-05-01&lt;/strong&gt;: Fixed a typo for sysctl: -a parameter simply shows all settings; -w parameter is needed to write the setting. Duh.&lt;/p&gt;
&lt;p&gt;
Once you have decided on and acquired your &lt;a href=&quot;http://www.coffeecode.net/archives/155-Test-server-strategies.html&quot;&gt;test hardware for Evergreen&lt;/a&gt;, you need to think about tuning your PostgreSQL database server. Once you start loading bibliographic records, you might notice that after 100,000 records or so your search response times aren&#039;t too snappy. Don&#039;t snarl at Evergreen. By default, PostgreSQL ships with very conservative settings (sized for something like machines with 256 MB of RAM!), so if you don&#039;t tune those settings you&#039;re getting a false representation of your system&#039;s capabilities.
&lt;/p&gt;
&lt;p&gt;
The &quot;right&quot; settings for PostgreSQL depend significantly on your hardware and deployment context, but in almost any circumstance you will want to bump up the settings from the delivered defaults. To give you an idea of what you need to consider, I thought I would share the settings that we&#039;re currently using on our Evergreen test server at Laurentian University. You might be able to use these as a starting point and adjust them accordingly once you&#039;ve run some representative load tests against your configuration. And it&#039;s useful documentation for me to fall back on in a few months, when all of this has escaped my grasp &lt;img src=&quot;http://coffeecode.net/templates/default/img/emoticons/smile.png&quot; alt=&quot;:-)&quot; style=&quot;display: inline; vertical-align: bottom;&quot; class=&quot;emoticon&quot; /&gt;
&lt;/p&gt;
&lt;h4&gt;The defaults (as shipped in Debian Etch)&lt;/h4&gt;
&lt;p&gt;The defaults in Debian Etch are quite conservative. Consider that our test server has 12GB of RAM. The default only allocates 1MB of RAM to work memory (which is critical for sorting performance) and only 8MB of RAM to shared buffers. Following are the defaults set in /etc/postgresql/8.1/main/postgresql.conf:&lt;/p&gt;
&lt;pre&gt;
# - Memory -

#shared_buffers = 1000                  # min 16 or max_connections*2, 8KB each
#temp_buffers = 1000                    # min 100, 8KB each
#max_prepared_transactions = 5          # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
#work_mem = 1024                        # min 64, size in KB
#maintenance_work_mem = 16384           # min 1024, size in KB
#max_stack_depth = 2048                 # min 100, size in KB

# - Free Space Map -

#max_fsm_pages = 20000                  # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000               # min 100, ~70 bytes each
&lt;/pre&gt;
&lt;h4&gt;Our test server settings&lt;/h4&gt;
&lt;p&gt;Our test server has 12 GB of RAM. Assuming that the PostgreSQL defaults were set for a system with 1 GB of RAM, we should be able to multiply the memory-based settings by at least a factor of 12. We&#039;re a little bit more aggressive than that in our settings. Note, however, that this is a single-server install of Evergreen, so we&#039;re also running memcached, ejabberd, Apache, and all of the Evergreen services as well as the database - oh, and a test instance of an institutional repository, among other apps - so we&#039;re not nearly as aggressive as we would be in a dedicated PostgreSQL server configuration. Please note that I&#039;m making no claims that this is the optimal set of configuration values for PostgreSQL even on our own hardware!&lt;/p&gt;
&lt;pre&gt;
# shared_buffers: much of our performance depends on sorting, so we&#039;ll set it 100X the default
# some tuning guides suggest cranking this up to as much as 30% of your available RAM
shared_buffers = 100000 # 8K * 100000 = ~ 0.8 GB

# work_mem: how much RAM each concurrent process is allowed to claim before swapping to disk
# your workload will probably have a large number of concurrent processes
work_mem=524288 # 512 MB

# max_fsm_pages: increased because PostgreSQL demanded it
max_fsm_pages = 200000
&lt;/pre&gt;
&lt;p&gt;After you change these settings, you will need to restart PostgreSQL to make the settings take effect.&lt;/p&gt;
&lt;h4&gt;Kernel tuning&lt;/h4&gt;
&lt;p&gt;In addition to PostgreSQL complaining about max_fsm_pages not being high enough, your operating system kernel defaults for SysV shared memory might not be high enough to support the amount of RAM PostgreSQL demands as a result of your modifications. In one of our test configurations, we had cranked up work_mem to 8GB; Debian complained about an insufficient SHMMAX setting, so we were able to adjust that by running the following command as root to set the kernel SHMMAX to 8GB (8*1024^3 bytes):&lt;/p&gt;
&lt;pre&gt;
sysctl -w kernel.shmmax=8589934592
&lt;/pre&gt;
&lt;p&gt;To make this setting sticky through reboots, you can simply modify /etc/sysctl.conf to include the following line:&lt;/p&gt;
&lt;pre&gt;
# Set SHMMAX to 8GB for PostgreSQL
kernel.shmmax=8589934592
&lt;/pre&gt;
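&lt;p&gt;The units in these settings are easy to mix up (8 KB pages, KB, and bytes), so here is a small Python sketch of the arithmetic. The helper names and the figure of 24 concurrent sorts are hypothetical, chosen only to show how quickly a 512 MB work_mem could add up on a 12 GB machine:&lt;/p&gt;
&lt;pre&gt;
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def shared_buffers_bytes(pages, page_size=8 * KB):
    """shared_buffers in the postgresql.conf above is a count of 8 KB pages."""
    return pages * page_size

def work_mem_bytes(kilobytes):
    """work_mem in the postgresql.conf above is given in KB."""
    return kilobytes * KB

# shared_buffers = 100000 pages: a bit under 0.8 GB
print(round(shared_buffers_bytes(100000) / GB, 2))   # 0.76

# work_mem = 524288 KB
print(work_mem_bytes(524288) // MB)                  # 512

# Hypothetical worst case: 24 sorts each claiming the full work_mem
print(24 * work_mem_bytes(524288) // GB)             # 12

# kernel.shmmax is given in bytes: 8 GB
print(8 * GB)                                        # 8589934592
&lt;/pre&gt;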
&lt;h4&gt;Other measures&lt;/h4&gt;
&lt;p&gt;
Debian Etch comes with PostgreSQL 8.1. The first version of PostgreSQL 8.1 was released in November 2005. That&#039;s a long time in computer years. Version 8.2, which was released less than a year later, &quot;adds many functionality and performance improvements&quot; (according to the &lt;a href=&quot;http://www.postgresql.org/docs/8.2/static/release-8-2.html&quot;&gt;release notes&lt;/a&gt;). If you&#039;re not getting the performance you expect from your hardware with Debian Etch, perhaps a &lt;a href=&quot;http://packages.debian.org/etch-backports/postgresql-8.2&quot;&gt;backport of PostgreSQL 8.2&lt;/a&gt; would help out.
&lt;/p&gt;
&lt;h4&gt;Further resources&lt;/h4&gt;
&lt;p&gt;This is just a shallow dip into PostgreSQL tuning for Evergreen - hopefully enough to alert you to some of the factors you need to consider if you&#039;re putting Evergreen into a serious testing environment or production environment. Here are a few places to dig deeper into the art of PostgreSQL tuning:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;PostgreSQL manual, resource consumption section of server configuration: &lt;a href=&quot;http://www.postgresql.org/docs/8.1/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY&quot;&gt;version 8.1&lt;/a&gt; and &lt;a href=&quot;http://www.postgresql.org/docs/8.2/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY&quot;&gt;version 8.2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;An &lt;a href=&quot;http://www.powerpostgresql.com/Downloads/annotated_conf_80.html&quot;&gt;annotated version of the 8.0 parameters&lt;/a&gt; with more explicit advice is also available&lt;/li&gt;
&lt;li&gt;Some good advice is buried about halfway down &lt;a href=&quot;http://cbbrowne.com/info/postgresql.html&quot;&gt;Christopher Browne&#039;s page&lt;/a&gt; under the heading &quot;Tuning PostgreSQL&quot;, along with links to further resources&lt;/li&gt;
&lt;li&gt;The &quot;Performance Whack-A-Mole&quot; presentation at  &lt;a href=&quot;http://www.powerpostgresql.com/Docs&quot;&gt;PowerPostgreSQL&lt;/a&gt; is a great tutorial for holistic system tuning&lt;/li&gt;
&lt;/ul&gt; 
    </content:encoded>

    <pubDate>Mon, 14 Apr 2008 14:48:19 -0400</pubDate>
    <guid isPermaLink="false">http://coffeecode.net/archives/156-guid.html</guid>
    
</item>

</channel>
</rss>