Random News Updates

Halloween

We didn’t have many trick-or-treaters stopping by our house to stock up on candy last Halloween, so we’re not planning on much activity this year. And we don’t have any Halloween costume parties in our plans either. You won’t even find a pumpkin with a candle inside at our house this year. But even if Halloween isn’t a busy holiday in Dallas this year, it looks like others are going all out. Take a look at these Halloween decorations (and if you think that’s cool, check out his Christmas lights).

Kittens

Our unexpected collection of kittens has reached the age where they’re ready for adoption and we’ve already found homes for a couple of them. Mother cat has now been upgraded to prevent anomalies in the kitten population. We briefly gained a baby squirrel in addition to all the baby cats. It just appeared in the garage with the kittens last Saturday. The Humane Society provided us with the number of a local squirrel and opossum rehabilitator who took the little creature in. They said thunderstorms frequently blow the babies out of their nests and cats sometimes mistake them for kittens and bring them home, which appears to be what happened in our case.

mod_virgule

Time for another mod_virgule release. One of the downsides to the growth of robots.net has been the inability of mod_virgule to handle the rapidly expanding user list. Mod_virgule had very inefficient user list code which had to parse the entire user list, sort it in memory, and then do additional lookups for each user. And it could only display it as a single page. With over 6,000 users, trying to display it brought the whole site to a crawl for 15 to 30 seconds. I’ve now completely recoded the user list functions to be many times faster and to provide the results in a nicely sorted, multi-page format. Care to see the results? Compare Advogato’s user list (which still uses the old mod_virgule) to the new Robots.net user list. As a side-benefit, the new user list pages are completely configurable through an XML file rather than hardcoded in mod_virgule itself.
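For anyone curious about the general idea, here is a minimal sketch in Python (not the actual mod_virgule code, which is C). It assumes the user names are sorted once up front and each page is then served by slicing that sorted list. The `UserIndex` class, its method names, and the page size are all hypothetical.

```python
import math


class UserIndex:
    """Hypothetical pre-sorted user list that is served one page at a time."""

    def __init__(self, usernames, per_page=100):
        if per_page < 1:
            raise ValueError("per_page must be at least 1")
        # Sort once, case-insensitively, instead of on every request.
        self._ordered = sorted(usernames, key=str.lower)
        self._per_page = per_page

    @property
    def total_pages(self):
        # An empty list still has one (empty) page.
        return max(1, math.ceil(len(self._ordered) / self._per_page))

    def page(self, number):
        """Return the names on a 1-based page number."""
        if not 1 <= number <= self.total_pages:
            raise ValueError("no such page: %r" % (number,))
        start = (number - 1) * self._per_page
        return self._ordered[start:start + self._per_page]


if __name__ == "__main__":
    index = UserIndex(["carol", "Alice", "bob", "dave", "Eve"], per_page=2)
    for n in range(1, index.total_pages + 1):
        print(n, index.page(n))
```

Each request then costs one slice rather than a full parse and sort, which is roughly where I would expect the speedup in a design like this to come from.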

Free Software

I’ve also been catching up on a couple of free software projects the last couple of weeks. I’ve posted a new version of my fork of the mod_virgule code that includes the latest patches to the official version. It also includes several new features from my ToDo list including a configurable sitemap and article index page. Both of these elements are hard-coded in the official source making it difficult to use mod_virgule without editing the source code and recompiling. I also added a simple include function to the XML markup (suggested on the mod_virgule development list). For a full list of all the goodies in my version, like UTF-8 support, password reminders, libxml2 support, etc., see the web page. Keep the patches and suggestions coming…

I’ve also posted a new version of dumpcheck, the program I’ve been using to help debug the weekly ODP data dump generation problems. In this case, I got a nice pile of useful patches from Andreas Steinmetz including a few fixes for compiler warnings and a couple of new features. Thanks!

Movies, Ultracaps, and mod_virgule

Advogato has been down for something like three weeks now. I’m beginning to wonder if it will ever come back! I asked around on the Gimp IRC channel and learned that the server lost a power supply.

Meanwhile, the nigritude ultramarine page is hovering around 12-15 in the Google results. There just aren’t enough inbound links from high PR pages to compete with all those link farms and other black hat tricks. Just over two weeks to go.

I mentioned how bad Van Helsing was in my last entry. Since then we saw The Day After Tomorrow, which was pretty good. If you liked Independence Day, and can overlook the compression of a multi-hundred year sequence of weather events into one week, you’ll probably like this one. The Dick Cheney vice president look-alike is good for a few laughs. The movie reminded me of the old Fritz Leiber story, A Pail of Air, though it got a bit colder in his story.

We also recently saw the latest Harry Potter movie, which was not as good as the previous ones, and The Chronicles of Riddick, which was also okay. These two were see-once-at-the-matinee-price movies but still enjoyable. Next up, the release of the original, uncut, subtitled Godzilla; maybe this weekend.

I received the four 350 F capacitors (yes, I said 350 Farad) from Maxwell and we’ve been playing with them down at the DPRG Lab. It takes a few minutes to charge one from the bench supply at several amps. Once charged, we were able to run a small motor (no load) on one for over two hours. And they’re designed to be exactly the same size as a standard D-Cell battery. Just the thing for a BEAM über-photovore, I think.
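As a rough sanity check on those charge times, the ideal-capacitor formulas t = C·V/I and E = ½·C·V² can be run through a few lines of Python. The 2.5 V rating and 5 A charge current below are assumptions for illustration only; neither figure is stated above, and a real bench supply and real capacitor will behave less ideally.

```python
def charge_time_seconds(capacitance_f, voltage_v, current_a):
    """Time to charge an ideal capacitor from 0 V at a constant current."""
    return capacitance_f * voltage_v / current_a


def stored_energy_joules(capacitance_f, voltage_v):
    """Energy held by an ideal capacitor charged to voltage_v."""
    return 0.5 * capacitance_f * voltage_v ** 2


if __name__ == "__main__":
    C = 350.0  # farads, from the post
    V = 2.5    # volts, ASSUMED rating (not given in the post)
    I = 5.0    # amps, ASSUMED reading of "several amps"
    print("charge time (s):", charge_time_seconds(C, V, I))
    print("stored energy (J):", stored_energy_joules(C, V))
```

Under those assumptions the charge takes about three minutes and stores roughly 1.1 kJ. Spread over two hours, that is at most about 0.15 W on average, which seems plausible for a small unloaded motor.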

Mod_Virgule Update

I found some time to package up and release my latest mod_virgule code. This release brings my fork back up to date with the official code including the recentlog, project, and article permalinks and the recentlog edit patches (which oddly don’t appear on Advogato – is it not running the latest version anymore?). I’ve also incorporated a somewhat mutated form of the “forgotten password” patch Steve Kemp submitted back in Feb 2003. This release also fixes the corrupt profile.xml segfault issue and a couple of segfaults that resulted from incorrect handling of names containing UTF-8 data. The default form encoding type has been changed from ISO-8859-1 to UTF-8 throughout. It’s still not 100% UTF-8 safe but it’s getting there. At least robots.net has been very stable since the UTF-8 patches went in.

Mod_Virgule Update

I’ve been spending a little more time working on mod_virgule again. I’ve applied a modified version of the “forgot my password” patch and several other patches to bring my fork back up to date with the official version. I should have the new source packaged up shortly for anyone who wants to play with it. I’m afraid my version is beginning to diverge more and more from Raph’s due to the lack of activity on the official version.

mod_virgule and UTF-8 weirdness

I’m seeing more and more UTF-8 related issues pop up in code lately for some reason. Much of the debugging work I’ve done with the ODP XML dumps has been tracking down illegal XML characters and invalid UTF-8 byte sequences.

Now I’ve run across a related bug in mod_virgule. The trust metrics on robots.net stopped working a few days ago and today I took some time to track down the reason. It turned out to be an interesting little issue with the way mod_virgule handles the storage of data in the XML database. I’ve implemented a temporary work-around that has things working safely again but I think a longer term fix is needed.

I posted to the virgule_dev mailing list about the problem but it’s been pretty much dead for the past few months. Basically what happened is a foreign user posted some data to their user profile using a funky non-UTF-8 compatible character set. The result was a corrupt profile.xml file for that user account. That, in turn, led to Apache segfaulting during each subsequent attempt by mod_virgule to process the trust metric. Because of the segfault there was no error reporting to alert anyone of the problem and it took several days before anyone noticed that something was wrong.

The root of the problem seems to be that mod_virgule simply takes whatever raw data a user puts in a form and passes it directly to xmlSetProp(). This works great as long as you only give it valid UTF-8 data but it’s not designed to work on anything else. It seems to me that four things need to be done to fix this:

  • Pages need to explicitly specify UTF-8 as the character encoding
  • All form data needs to be validated before passing to libxml
  • Invalid data needs to be converted or rejected
  • The trust metric code needs some additional error handling
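To make the validate-then-convert-or-reject steps in the list above concrete, here is a minimal sketch in Python. It is not mod_virgule code (which is C and uses libxml2). The `clean_form_value` function and its `fallback_encoding` parameter are hypothetical names, and falling back to a single assumed legacy encoding is just one possible conversion policy.

```python
import re
from typing import Optional

# Matches any character outside the XML 1.0 "Char" production:
# tab, LF, CR, and the three ranges listed here are the only legal ones.
_ILLEGAL_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def clean_form_value(raw: bytes, fallback_encoding: Optional[str] = None) -> str:
    """Return text that is safe to store in XML, or raise ValueError.

    raw is the undecoded form field. If it is not valid UTF-8 it is
    converted from fallback_encoding when one is given (an assumption
    about what the user's browser sent), otherwise it is rejected.
    """
    try:
        text = raw.decode("utf-8")  # strict: fails on invalid byte sequences
    except UnicodeDecodeError:
        if fallback_encoding is None:
            raise ValueError("form data is not valid UTF-8")
        try:
            text = raw.decode(fallback_encoding)
        except UnicodeDecodeError:
            raise ValueError("form data is not valid " + fallback_encoding)
    # Valid UTF-8 can still carry characters XML forbids (NUL, most controls).
    if _ILLEGAL_XML_CHAR.search(text):
        raise ValueError("form data contains characters not allowed in XML")
    return text


if __name__ == "__main__":
    print(clean_form_value("caf\u00e9".encode("utf-8")))
    print(clean_form_value(b"caf\xe9", fallback_encoding="iso-8859-1"))
```

The point of doing this before anything reaches the XML layer is that bad input becomes a visible error on the form rather than a corrupt profile.xml discovered days later. In C, libxml2’s xmlCheckUTF8() looks like a reasonable starting point for the validation step, though it only checks the encoding and not the XML character ranges.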

If anyone has any thoughts on this or has had a similar experience with mod_virgule, I’d be curious to hear about it.