ODP, hierarchical organization, and other thoughts

I went to a google@work seminar in Dallas last week. It was mostly a sales pitch for Google’s enterprise services, but there were a few interesting bits, such as getting a glimpse of Google’s intranet. Another thing stood out and prompted this post. Part of Google’s pitch is that hierarchical organization is dead. More than that, all hierarchical models of organization are bad. Whether it’s directories on your hard disk, folders on your desktop, folders in your email program, categorical tagging of RSS feeds, or topical organization of website contents, it’s all bad, bad, bad. The one true way, they claim, is to dump all your data into a single chaotic mess and “embrace the chaos”. By which they mean, of course, purchase Google Enterprise products and services to search for what you need. After all, how else will you ever find what you’re looking for when your data is now lost in the chaotic mess? Asking a company that specializes in searching unorganized data how to organize your data strikes me as being very like asking the barber if you need a haircut. The answer will profit someone, but probably not you.

Somewhere in the PowerPoint presentation was a slide actually titled “Hierarchical organization is dead”, illustrated by a full-frame image of the Open Directory Project’s index page. The sad thing is not so much that they used this example, but that it was such a powerful example. It generated a fair amount of laughter from the audience as the Google guy talked about how sites like ODP used to think they could manually categorize the Internet. He asked how many of the 100+ people present used (or were even aware of) ODP or similar directories for finding things on the web; no hands were raised. Then he asked how many people used search engines like Google to find things on the web: all hands raised. More laughter.

This is one of two events that recently brought home to me just how dead ODP is. The other was when I tried to log in to my ODP editor account and discovered ODP was down. A little research revealed it had been down for quite a while. Apparently there was a hardware failure back in October of 2006. AOL techs managed to bungle the restore process somehow, resulting in the unrecoverable destruction of large amounts of ODP. Then they discovered they’d forgotten to make backups for the last few years. Oops. Since then, they’ve been slowly reconstructing things. The content itself was salvaged from one of the weekly data dumps, but all or most of the editor metadata was lost. Information is scarce, as AOL has mostly forgotten about ODP and ODP staff continue to be very secretive about everything that goes on. While a lot of the public portions of ODP are back online, a lot of the editor functionality is still down six months later. At least one of the important servers used by the editors is still offline. The really surprising thing is not just that I hadn’t noticed ODP being down, but that the web as a whole hadn’t noticed either. There was a time when ODP being down for weeks would have been front page news on sites like Slashdot. This time, other than ODP editors and a few obscure SEO blogs, no one noticed it was gone.

While I don’t agree with Google’s conclusion that all hierarchical organization is bad, I think they are right in the case of web directories. A manually edited directory is simply not a useful or reasonable method of organizing web sites compared to more modern social bookmarking systems like del.icio.us or reddit. It’s an adapt or die world and, sadly, ODP doesn’t seem to be the sort of organization that can adapt to the changes taking place.

I expect ODP will limp along if AOL continues to allow it, but I don’t hold out any hope that ODP is ever going to fully return from the dead. I’m still an editor and I will continue to assist them with data integrity checking on the weekly XML data dumps (which have finally resumed, by the way). However, I’m in the process of working with another editor to migrate the data dump checking process to an ODP server, so it won’t take up my time or energy anymore. I’m also spending far less time on my other ODP-related projects.

Speaking of social information processing, there was an interesting paper on the subject published this month by Kristina Lerman of USC, “Social Information Processing in Social News Aggregation” (PDF format). The paper looks at the way Digg exploits the power of social information processing to solve the problem of rating aggregated news stories.