Rebooting my blog

I’m bringing in the new year at home, sleeping off a bad cold. Really, it’s a 2009 cold and with it will go the last remnants of that year and the last decade. It’s 2010 and time for some major changes around here. I’ve been compiling a lengthy list of New Year’s resolutions, life goals, and To Do lists. I won’t bore you with them but, if you’re reading this, one resolution is well on the way to being met.

My blog was neglected for the last half of 2009. I haven’t been totally offline. I’ve continued posting regularly to my photo blog and twitter (which feeds my Facebook, Myspace, and LinkedIn accounts) as well as making daily posts to robots.net. But my personal website has fallen into disrepair. It’s time to reboot things. First off, you may notice I’ve moved my blog to its own domain, steevithak.com, from its old home on my business website.

Over the last few years, I’ve consolidated my online presence from lots of different user names to just one: steevithak. It’s hard to spell, nobody knows how to pronounce it, but it’s uniquely me and gives me a user name that’s always available. Don’t worry, only machines refer to me as steevithak. If you’re human, keep on calling me Steve in person.

Back to my blog: I started blogging in 1999, before it was commonly called blogging. I wrote my own set of Perl scripts to manage the process. So in rebooting my blog, I was faced with a 10-year blog archive in a one-of-a-kind format. The earliest blogs lacked titles and none of them were tagged with keywords, so I decided to manually convert them one at a time, adding the missing elements. Over a period of time, I reconstructed my entire blog archive using Pivot.

As the end of 2009 neared, Pivot 2.x was released, so I converted everything to that format. In December of 2009, I made a last-minute decision to switch again to WordPress, which offered several features Pivot lacked. Pivot 2.x also proved to be mind-bogglingly slow, perhaps because it couldn’t deal with a 10-year archive stored in a flat-file database! The conversion from Pivot to WordPress initially looked difficult but I found a script that was able to move the entries and titles. I modified it to also preserve the keywords I’d spent so much time adding.

So the new website integrates my blog, my photostream, and my twitter feed in one location. The blog will continue to be syndicated to my robots.net and Advogato.org profiles, manually for the moment but I think a WordPress plugin supporting the mod_virgule XML-RPC protocol may be forthcoming.

Now all I have to do is make life in 2010 interesting enough to blog about! I’m not worried. Something tells me we’re in for a good year.

Advogato and Syndicated Blogs

Over on Advogato, cdfrey asked whether syndicated blogs were good or bad for the recentlog. He asked whether the authors who allow their blogs to be syndicated into Advogato’s recentlog stream actually stop by to read the recentlog anymore. I’m sure some of them don’t but I’m equally sure some of them do. In any case, I can verify at least one person who syndicates to Advogato reads the recent log – me! :)

ta0kira followed up with some further comments on the topic including the question of whether other sites interleave syndicated posts with content that originates locally. There are a few such as Facebook (see below) but Advogato has always done things that were a little, ummm, experimental in nature. He has a good point that it would be nice to be able to select whether or not to see the syndicated posts. It has also been suggested in the past that an ideal solution is to give each user the ability to create their own personalized recentlog view.

One aspect to consider is that there are several ways of syndicating your blog to Advogato that may go unnoticed. Posts syndicated by RSS or Atom are explicitly marked as such in the recentlog but syndication by XML-RPC or the older HTTP POST method goes unnoted. My blog posts, for example, originate on my personal blog and are then syndicated to Advogato and robots.net using the HTTP POST method; and to Facebook by RSS. Facebook’s blog syndication services are horrendously bad incidentally, turning each blog post into a nondescript thing called a “note” with no clear indication of what it is or why it exists.

Another recentlog issue that’s been mentioned several times lately is that some people are piping their twitter feeds into the recentlog via RSS. I agree this can be annoying but rather than block twitter feeds, I’d like to see them rerouted into a user status field, much like Facebook or Myspace. For example, I use twitter and my tweets update my user status field on Facebook. Maybe it’s time to add a user status field to mod_virgule sites like Advogato?

Random software and robot news

I’ve been doing a little more C programming lately. On the embedded level, I’m porting some odometry and waypoint navigation code written by David P. Anderson for use on my own robot. This is part of a larger project to put together a GPL’d library of mobile robot code. Don’t expect to see it anytime soon but we are making progress.

I’m also trying to squeeze in time to keep up the work on mod_virgule. I’ve made a lot of progress over the last few months, benefiting both robots.net and Advogato. The ToDo list seems endless but next up is some code refactoring and work on the data schemas used for the XML database and HTML entry forms. This work will hopefully allow me to fix a long-standing bug in the HTML forms and make the field layouts a little more flexible.

Advogato blog topics

Zaitcev brings up the issue of blog posts that aren’t directly related to free software development in Advogato’s recentlog. I seem to remember this issue coming up sometime in the past. Zaitcev occasionally posts about topics other than free software such as Anime. This annoys ekashp, who would prefer that free software developers limit their interests (or at least their blogs) to posts about free software. For my own part, I don’t find it strange at all that free software developers have varied interests beyond software itself and I enjoy reading about them.

Perhaps I’m biased, because I too write about whatever random things I find interesting. Sometimes I write about software but just as often it’s music, art, books, or robotics. My case is interesting because my blog is syndicated to both Advogato and to robots.net. I think to meet ekashp’s ideal, I’d have to limit my blog to software development related to robots. Otherwise, I’d risk being off-topic on one of the two sites with any given post. Instead, I throw caution to the wind and assume that if a topic is interesting to me, it might be interesting to someone else too.

In any case, Raph created Advogato’s blog ranking system so that each user could define their own ideal recentlog. If you consistently find someone’s blog uninteresting or annoying, go to their profile page and give their blog a low interest ranking. Blogs ranked below 3 will not show up in your view of the recentlog.
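To make the filtering rule concrete, here is a minimal sketch in Python. It is not how mod_virgule actually computes or stores interest rankings; the `Post` type, the `personal_recentlog` function, and the idea of a plain per-reader dictionary of rankings are all assumptions made up for illustration. The only detail taken from above is the cutoff: blogs ranked below 3 are hidden.

```python
from dataclasses import dataclass

# From the post: blogs ranked below 3 do not show up in your recentlog view.
INTEREST_THRESHOLD = 3


@dataclass
class Post:
    """Hypothetical recentlog entry; the real site stores much more."""
    author: str
    title: str


def personal_recentlog(posts, my_rankings, threshold=INTEREST_THRESHOLD):
    """Return the posts a given reader would see.

    my_rankings maps an author's user name to the interest ranking this
    reader gave that author's blog. Authors the reader has never ranked
    are shown (an assumption, not documented behavior).
    """
    return [
        post for post in posts
        if my_rankings.get(post.author, threshold) >= threshold
    ]


if __name__ == "__main__":
    posts = [
        Post("alice", "Kernel patch notes"),
        Post("bob", "Anime review"),
        Post("carol", "Robot odometry"),
    ]
    rankings = {"alice": 8, "bob": 2}  # carol is unranked
    for post in personal_recentlog(posts, rankings):
        print(post.author, "-", post.title)
```

The point of the design is that the filter belongs to the reader, not the site: bob's post is still in the recentlog for everyone who hasn't ranked his blog down.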

mod_virgule and Advogato

I’ve spent the last week making a lot of mod_virgule tweaks to get my version of the codebase to the point it can run both Advogato and robots.net. My goal is to make the upcoming change-over to the new server and codebase as unnoticeable as possible to Advogato users. Keep your fingers crossed. Next up is trying to get mod_virgule to compile cleanly with gcc 4.x on a 64-bit machine.

Websites come and websites go

O’Reilly shut down the wonderful Meerkat aggregator website a while back and now I see that Raph may be shutting down Advogato.

Yet another website I frequent, on the subject of Free/Open hardware, almost shut down in June. FreeIO.org has been run by Diehl Martin for years, providing GPL’d hardware designs. Unfortunately, Marty has pancreatic cancer and decided he didn’t have time to run the site anymore. After seeing his shutdown announcement, I offered to take over hosting and maintenance of FreeIO.org. We completed the transition in August. We’re still pondering the longer-term plans for the site.

I’ll miss Advogato if it shuts down. Thanks for the effort you put into Advogato and mod_virgule, Raph! I’m still running robots.net on my own version of mod_virgule but I’m guessing mod_virgule is not likely to survive long without Advogato, so it may be time to rename my version and go off in my own direction with the code. There are lots of enhancements I’ve wanted to make like dropping the specialized XML templates in favor of standard XHTML templates, replacing the template processing code with the XPath support available in libxml2, maybe replacing the flat-file db with SQL support. There’s still a lot of potential in the codebase; it’s just a few years behind the times at this point.

I still haven’t found a good replacement for Meerkat. I’m trying to use Technorati. It looks pretty but it’s riddled with technical problems and there are apparently no humans at the other end to talk to, just faceless form emails promising replies that never arrive. Technorati only manages to update from robots.net maybe one out of every ten times I ping them. Complaints about the Technorati ping problems and their non-existent technical support are common. I’ve tried several of the recommended work-arounds for the ping problem but none have worked for me so far.

mod_virgule Attack Resistance

lkcl and redi have commented on the ongoing trust metric attack on mod_virgule sites, noting the effects on Advogato. The same thing is happening to other mod_virgule sites including robots.net and ghostscript. I emailed Raph a warning about this activity in May when I first noticed the use of automated programs creating large numbers of identical accounts on the three sites. I don’t want to link to any examples directly but try googling on “dltxprt” or manually typing in the user URL to see an example user on all three of the mentioned sites. I’ve been tracking IPs and the account names on robots.net so I can kill them all off if needed but so far the trust metric has resisted the attack effectively.

The spammer is using the notes field of each account for search engine link spamming but otherwise isn’t causing much immediate harm other than resource abuse. I have working code to delete mod_virgule accounts but I’m still pondering how best to use it to remove the evildoers in this case.

The blog spam seems limited to Advogato for some reason. If it starts on robots.net, I think my solution will be to remove the A tag from the list of tags that can be used by observers. I don’t want to remove the ability of observers to post blog entries, as lkcl suggested, because that’s the only way we find out enough about some new users to decide whether they should receive a higher trust ranking.

One interesting thing to note is that almost all of the spammer’s accounts certify each other, creating what Google refers to as a “bad neighborhood” in webpage trust rank terminology. If you have a legitimate webpage and link to a “bad neighborhood” it can adversely affect your own page’s rank. It might be wise to implement something similar in mod_virgule. If a legitimate, trusted user certifies an untrusted user in a “bad neighborhood”, maybe it should result in decrementing the trust of the legitimate user rather than increasing the trust of the bogus user? Just a thought.
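Here is a rough sketch of what that thought might look like, written in Python rather than mod_virgule's C. Nothing here exists in mod_virgule: the function name, the integer trust levels, the flat penalty per bad certification, and the assumption that the set of "bad neighborhood" accounts has already been identified some other way are all mine, purely for illustration. It also says nothing about how the real trust metric computes levels in the first place.

```python
def apply_bad_neighborhood_penalty(trust, certs, bad_accounts, penalty=1):
    """Return adjusted trust levels after penalizing careless certifiers.

    trust        -- dict mapping user name to an integer trust level
    certs        -- dict mapping user name to the users they certify
    bad_accounts -- set of accounts already judged to be a bad neighborhood
    penalty      -- hypothetical trust cost per bad account certified

    Accounts inside the bad neighborhood are left alone; only outside
    users who certify into it lose trust, and never below zero.
    """
    adjusted = dict(trust)
    for certifier, targets in certs.items():
        if certifier in bad_accounts:
            continue
        hits = len(set(targets) & bad_accounts)
        if hits:
            adjusted[certifier] = max(0, trust.get(certifier, 0) - penalty * hits)
    return adjusted


if __name__ == "__main__":
    trust = {"alice": 3, "bob": 2, "spam1": 0, "spam2": 0}
    certs = {
        "alice": {"bob"},
        "bob": {"spam1", "spam2"},   # bob vouches for the spammer's accounts
        "spam1": {"spam2"},          # the spam accounts certify each other
        "spam2": {"spam1"},
    }
    print(apply_bad_neighborhood_penalty(trust, certs, {"spam1", "spam2"}))
```

The hard part, which this sketch skips, is deciding what counts as a bad neighborhood without giving attackers a way to drag down legitimate users.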

Mod_Virgule Update

I’ve posted another release of my mod_virgule fork this week. Grab the source or take a look at the changelog. This one includes the new and improved configuration handling code. Instead of loading and parsing the entire website configuration on each and every hit, it’s now loaded once per Apache process and stored in thread-private memory that persists across requests. Much more efficient. I also did some more general code cleanup and removed more of the hard-coded stuff that makes it hard to use mod_virgule without editing the source code. There are still one or two hard-coded things that I need to make configurable. Maybe in the next release. It’s getting close to a completely configurable system that could be compiled, installed, and configured for any site. Anyway, the new code has been running on robots.net for a couple of weeks now and appears stable.