Thursday, October 18. 2012

Triumph of the tiny brain: Dan vs. Drupal / Panels

A while ago I inherited responsibility for a Drupal 6 instance and a rather out-of-date server. (You know it's not good when your production operating system is so old that it is no longer getting security updates.) I'm not a Drupal person. I dabbled with Drupal years and years ago when I was heavily into PHP, but it never stuck with me. Every time I poked around at the database schema, with serialized objects stuck inside columns, I found something else that I wanted to work on instead. Thus, inheriting a Drupal instance wasn't something I had been looking forward to.

As this production server was running a number of different services that were in use by our library, I went through a number of trial runs to ensure that upgrading the base packages wouldn't introduce regressions or outages. Fast-forward past a reasonably successful early-morning upgrade from Debian Lenny to Squeeze and I was able to start looking at the Drupal instance, which was also approximately 18 months out of date.

Initially, after I worked out the how-to of Drupal upgrades (in short: upgrade just Drupal core, then upgrade the modules), I thought all was well. I even got over the hump of realizing that our instance had had all of the modules dumped into Drupal's core directory, rather than sites/all/modules, and (even more impressively) got over the problem that the core bluemarine theme had been hacked directly rather than having been separated out into a new custom theme.

After working through those learning pains, I realized that somewhere in all of the Drupal and module upgrades, something got "more secure" and started truncating IMG links to files with spaces in their names at the first space. So "foo%20bar.jpg" was becoming "foo.jpg" and we were getting 404s everywhere. Did I mention that I didn't notice this until I upgraded our production instance?
Oh yes, I went through iteration after iteration of upgrades on the test server, and dutifully fixed up the problems that I found in the subset of content that I was testing against. I discovered and fixed problems like the production server content linking directly to the test server (slight copy-and-paste errors on the part of the content creators, I suppose). But I didn't notice all of the 404s, because who uploads images with spaces in their filename? Turns out, everyone else in my library does that. Of course! And from what I was able to piece together via Google and browsing drupal.org, there was supposed to be some sanitization of the incoming filenames so that spaces would be normalized, etc. But either that wasn't introduced until well after our content had been created, or my predecessor had lightly hacked one of the modules, or Drupal itself, and hadn't bothered to use a source code repository to track those customizations. So, realizing that I needed to make some bulk changes, I went at it with a two-step plan:
If you're a Drupal user or a Drupal with Panels module user, you might know that the database schema suffers from some fairly horrible tricks being played on it. In this case, the Panels module creates a panels_pane table with a configuration TEXT column. Based on the name alone, it might seem odd that the column is used to store the HTML content of the corresponding panel. Even odder to me is that this is not just a TEXT column, it's a column that expects a very particular structure - something like:

a:5:{s:11:"admin_title";s:5:"RACER";s:5:"title";s:0:"";s:4:"body";s:639:"<p><img width="225" height [...]}
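To make that structure concrete, here is a minimal Python sketch of how PHP's serialize() lays out strings and flat arrays: every string carries its byte length as a prefix. This is only an illustration; the function names are made up, and nothing here comes from Drupal or Panels.

```python
def php_serialize_string(value):
    """Encode one string as s:<byte length>:"<value>"; (PHP serialize() layout)."""
    return 's:%d:"%s";' % (len(value.encode("utf-8")), value)


def php_serialize_flat_array(pairs):
    """Encode a dict of string keys and string values as a:<count>:{...}."""
    body = "".join(
        php_serialize_string(key) + php_serialize_string(val)
        for key, val in pairs.items()
    )
    return "a:%d:{%s}" % (len(pairs), body)


# Hypothetical two-key example mirroring the start of the row shown above.
example = php_serialize_flat_array({"admin_title": "RACER", "title": ""})
print(example)
```

The point to notice is that the number after each s: has to match the string that follows it exactly, so any edit that changes a string's length also has to change its prefix.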
Ah, nothing like storing an object within a single database column. Of particular interest was the result that I had when I tested updating the column value with a basic "replace(configuration, '%20', '_')" - the panel showed only n/a, presumably because the size (defined by the s properties in the object) for the "body" text property no longer matched the actual length of the text. That would be an instance of http://drupal.org/node/926448 - so okay, clearly I had to change tactics and update the entire object.

I tried quickly finding the Drupal way to do this: clearly there's an API and there must be some simple way to retrieve an object, change its values, and update it so that the serialized object gets stored in the database and Drupal is happy. However, I couldn't find a simple tutorial, and trying #drupal on Freenode was unfortunately fruitless as well (some people kindly suggested running REPLACE() at the database level, without recognizing that it would actually damage things significantly). So... out came the Perl, and here's what I hacked together:
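#!/usr/bin/perl
use strict;
use warnings;

# Each input line (pasted after __DATA__) is "configuration<TAB>pid".
foreach (<DATA>) {
    chomp();
    my $i = 0;
    my $body = 0;
    my @fixed;
    my @row = split /\t/;
    my $pid = $row[1];
    my $configuration = $row[0];

    # Break the serialized object apart at each string boundary.
    my @chunks = split /";s:/, $configuration;
    foreach my $chunk (@chunks) {
        # The first chunk (array header plus first key) passes through untouched.
        if (!$i++) {
            push @fixed, $chunk;
            next;
        }
        # The chunk that follows the "body" key holds the HTML to fix.
        if ($chunk =~ m/"body/) {
            $body = 1;
            push @fixed, $chunk;
            next;
        }
        if ($body) {
            my ($length, $content) = $chunk =~ m/^(\d+):"(.+)$/;
            # Each pass fixes at most one %20 per /pictures/ path, so repeat.
            for (my $j = 0; $j < 50; $j++) {
                $content =~ s{(/pictures/[^\./]+?)%20}{$1_}g;
            }
            $content =~ s{%20}{+}g;
            # Recompute the length prefix to match the edited content.
            $length = length($content);
            $chunk = "$length:\"$content";
            $body = 0;
        }
        push @fixed, $chunk;
    }
    print 'UPDATE panels_pane SET configuration = $ashes$' .
        join('";s:', @fixed) . '$ashes$' . " WHERE pid = $pid;\n";
}
__DATA__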
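For comparison, the same length-repair idea can be sketched more generically in Python: walk the serialized value, read each string by its declared byte length, transform it, and write it back with a recomputed prefix. This is an alternative sketch, not a port of the Perl above. The rewrite_strings name and the sample data are hypothetical, it swaps every %20 for an underscore rather than reproducing the /pictures/ rule, and it assumes the input contains only well-formed s:N:"..." tokens.

```python
import re

# Matches the start of a serialized PHP string token: s:<length>:"
STRING_TOKEN = re.compile(rb's:(\d+):"')


def rewrite_strings(serialized, transform):
    """Apply transform() to every string value and fix its length prefix."""
    out = bytearray()
    pos = 0
    while True:
        match = STRING_TOKEN.search(serialized, pos)
        if match is None:
            # No more strings: copy the structural tail and finish.
            out += serialized[pos:]
            return bytes(out)
        # Copy the structural bytes that precede this string token.
        out += serialized[pos:match.start()]
        length = int(match.group(1))
        start = match.end()
        new_value = transform(serialized[start:start + length])
        out += b's:%d:"%s"' % (len(new_value), new_value)
        # Skip past the original value and its closing quote.
        pos = start + length + 1


# Hypothetical sample row with one image link containing an encoded space.
original = b'a:1:{s:4:"body";s:28:"<img src="/x/foo%20bar.jpg">";}'
fixed = rewrite_strings(original, lambda value: value.replace(b"%20", b"_"))
print(fixed.decode("utf-8"))
```

Working on bytes rather than text keeps the recomputed prefixes consistent with PHP, which counts string lengths in bytes.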
Against the trusty database (I ♥ PostgreSQL!), I ran COPY (SELECT configuration, pid FROM panels_pane WHERE configuration ~ '%20') TO 'conf_pids.out';, then slapped the Perl code on top and generated a load of UPDATE statements. It's far from my best Perl code, but it worked, and once I gave up on doing things the Drupal way I was able to put it together in a handful of minutes.

I now have a functional Drupal 6 instance again, updated such that there are no known security vulnerabilities with either core or the modules we're using, and there are no broken image links. And now I need to begin working towards either grokking Drupal, or finding a content management system that my tiny brain can comprehend, because I don't want to have to go through these kinds of contortions again with future upgrades... Suggestions welcome!

Saturday, August 14. 2010

File_MARC 0.6.0 - now offering two tasty flavours of MARC-as-JSON output

I've just released the PHP PEAR library File_MARC 0.6.0. This release brings two JSON serialization output methods for MARC to the table:
The JSON formats should be useful for developers who don't want to have to deal with the overhead and sluggishness of a MARC parsing library (yes, File_MARC, I'm looking at you) just to deal with MARC data. Both formats are round-trippable and compact, which is why I chose to support them. The use of the json_encode() function bumps the minimum PHP version requirement for File_MARC up to 5.2.x from 5.1.x, which kind of sucks, but given that PHP 5.2.0 was released in 2006, I think it's worth it. You can install File_MARC using the 'pear' command on most environments as follows:

pear install File_MARC-beta

Friday, May 28. 2010

Moving from Figaro's Password Manager (FPM) to KeePassX

I'm one of those people who actually keeps different passwords for every site and service I use. So far I'm up to over 400 passwords, so I'm dependent on a password manager. For a long, long time I have used Figaro's Password Manager (FPM) (and KedPM and most recently FPM2 as continuations of FPM), but now that I have an Android smartphone on which I can browse without wanting to die, I've been itching to get access to my passwords on that. I noticed that KeePassDroid was available, and that KeePassX would work on my desktop. I just had to get from FPM's password export format to one of KeePass's import formats. It turns out that nobody had made that particular leap before (or hadn't shared their conversion script). Thus... I bring you the FPM to KeePass converter. It's a straightforward Python script, licensed under the GPL v3, that does a passable job of converting an FPM XML export to a KeePass 1.x or 2.x XML import file. It worked for me, and that's all that I needed; but maybe it will work for you, too.

Monday, February 9. 2009

Seven things

I was tagged by Lukas for the "7 things" meme, and meant to do something about it, but I've been kind of preoccupied with the new baby and the sprinting toddler and work.
Anyway, it seems like a heck of a lot more reasonable than the evil Facebook's "25 things" meme, so I'm going to take a few minutes to try to play along.
Link your original tagger(s), and list these rules on your blog.
Wow, that was fun. Lemme see, I'm going to break the rules and just tag two people: Helmut, because he's one of the only other people who worked on the ibm_db2 PHP driver out of passion rather than as a job assignment. And Gabriel because I like his style.
About Me

I'm Dan Scott: barista, library geek, and free-as-in-freedom software developer. I hack on projects such as the Evergreen open-source ILS project and PEAR's File_MARC package. By day I'm the Systems Librarian for Laurentian University. You can reach me by email at dan@coffeecode.net.

