ODP/dmoz Update

(Sinus update: It’s been about one week since the surgery. I’m off all but a few of the drugs, I’m back to my usual routine at work, and I feel great; better than I’ve felt in a year. I can breathe, taste, and smell. I feel a few years younger.)

The latest RDF dump error report shows no XML character errors for the third week running, and invalid UTF-8 sequences are down from hundreds to just two this week. It’s definitely the best dump ever, and I’m keeping my fingers crossed that this week’s dump will be 100% error-free at the character encoding level.

In anticipation of that, I’ve started compiling an ODP RDF ToDo list of other bugs and optimizations that need work. I’ve also made some progress on one of the most requested features for the dump: breaking the full 1GB dump into smaller, category-specific dumps. While I’m testing things out, I’m hosting the smaller dumps locally. If they start seeing a lot of use, hopefully they’ll get moved to an ODP server with enough bandwidth to handle them.
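For anyone curious what the two character-level checks mentioned above involve, here is a minimal sketch in Python. It is an illustration only and not necessarily how the actual error report is generated; the function names `check_line` and `scan` are made up for this example. It flags byte sequences that fail a strict UTF-8 decode, and characters that fall outside the XML 1.0 `Char` production.

```python
import re

# Characters disallowed by the XML 1.0 Char production: C0 controls other
# than tab, LF and CR, plus U+FFFE and U+FFFF. Surrogates are also
# disallowed, but they cannot survive a UTF-8 decode, so they are omitted.
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def check_line(raw: bytes):
    """Return a list of (kind, offset) problems found in one raw line.

    For "invalid-utf8" the offset is a byte offset into the raw line (only
    the first bad sequence is reported); for "illegal-xml-char" it is a
    character offset into the decoded text.
    """
    problems = []
    try:
        text = raw.decode("utf-8")  # strict mode raises on invalid bytes
    except UnicodeDecodeError as exc:
        problems.append(("invalid-utf8", exc.start))
        # Decode leniently so the XML character check can still run.
        text = raw.decode("utf-8", errors="replace")
    for match in _XML_ILLEGAL.finditer(text):
        problems.append(("illegal-xml-char", match.start()))
    return problems


def scan(path):
    """Yield (line_number, kind, offset) for every problem in the file."""
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            for kind, offset in check_line(raw):
                yield number, kind, offset
```

Calling `scan()` on a local copy of a dump file would list each problem with its line number. Reading the file in binary mode and decoding line by line keeps memory use flat, which matters for a file around 1GB.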
