Making better website navigation with an Ontological Folksonomy

The thirty second version

Machine tags are useful for things other than machine-oriented information. They really end up being a shorthand for the RDF model I’ve been blogging about in several posts. The future of tag clouds and websites lies in making this work more smoothly.

The only tag advance: Machine tags

Since everybody realized that tags were a really cool thing, we’ve had exactly one advance: machine tags. Machine tags were cooked up when people started realizing that there was certain useful taggable data that wasn’t supported in flickr tags but ought to be accessible, mostly geographical data. So some of the popular systems added the ability to attach a predicate for extra data to a tag, so that you could communicate more sophisticated ideas through it.

The “machine” part is really a misnomer. There’s no reason why you couldn’t do more sophisticated things with them for the user.

So, we’ve talked about what the whole semantic web effort is based upon, what works, what doesn’t work, and how a website has a hidden RDF data model built in already that we can exploit. Now, we’re ready to tackle the big bang: Ontological Folksonomies.

In my mind, there is a direction forward from a straightforward folksonomy “tagging” system. The model should be similar enough to the RDF model that you could one day suck it into an RDF triple store. But it also needs to be useful. Remember… people get tags, thanks to flickr and del.icio.us. No matter how advanced you get, you still need to start out with allowing people to just say “Tag this picture with ‘hot babe’.”

The first try

I tried doing ontological tags for wireheadarts.com recently. The main way I implemented this was letting you take a regular tag and add a definition to it that could contain a text string, a set of related tags, and a set of alias tags. So you would specify a list of tags, some of them defined, some of them not.
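
As a rough illustration of that structure, here is a minimal Python sketch. The names `TagDefinition` and `resolve` and the alias "ir" are hypothetical, chosen only for this example; the original implementation was a formatted text file, not this code.

```python
from dataclasses import dataclass, field


@dataclass
class TagDefinition:
    """Hypothetical model of a defined tag: text plus related and alias tags."""
    name: str
    description: str = ""
    related: set[str] = field(default_factory=set)
    aliases: set[str] = field(default_factory=set)


def resolve(tag: str, definitions: dict[str, TagDefinition]) -> str:
    """Map an alias to its canonical tag; undefined tags pass through unchanged."""
    for definition in definitions.values():
        if tag == definition.name or tag in definition.aliases:
            return definition.name
    return tag


# One defined tag; the alias "ir" is an assumption made up for this sketch.
definitions = {
    "infrared": TagDefinition(
        name="infrared",
        description="Shot on infrared film, which smooths skin and darkens eyes.",
        related={"film"},
        aliases={"ir"},
    )
}

print(resolve("ir", definitions))   # alias of a defined tag
print(resolve("cat", definitions))  # plain, undefined tag
```

A list of tags can then mix defined and undefined entries, since anything without a definition is simply returned as typed.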

The problem is, this doesn’t solve the same sort of larger problems that RDF tries to solve, and the way I implemented it was kinda a pain to deal with in various ways. I’d used a formatted text file that even I couldn’t get right because I was trying to bash it out as quickly as I could.

But what I do like is that I could put in an explanation for what a tag meant, instead of relying on the user’s knowledge to know that the reason why the woman’s skin was so smooth but her eyes were dramatically dark was because I used infrared film.

The second try

I decided this wasn’t quite right, so I started over with the basic concepts. The central goal is that, regardless of whether you are talking about things that are machine-parsable or intended for human use, you want to define tags more tightly but retain the uncontrolled aspect of things. I suggest three ways to do this:

First, you want to be able to define a tag more tightly. You want to be able to say “For just my site, when I say ‘lori’ I mean my wife’s friend Lori.”

Second, you also want to be able to define categories of tags. For example, I should be able to tell the difference between my dog Gidget and the movie character Gidget. Or 35mm film vs. a 35mm focal length lens.

And finally, you want to be able to store various other esoteric metainformation, like geographic coordinates.

The way I see things, the first two have a fairly easy solution in Rm: The tag object has a “subject” field that is a path. It has a “verb” field that holds a text label and a path. And it has an “object” field that’s text with optional typed extra information.

So a tag like the ones you know from flickr or del.icio.us looks like this:

{current page} --( tag: ) --> {tag}

But we can also create things like:

{current page} --( tag: /wh/people/ )--> "oren"

or, if we know two people with the same name,

{current page} --( linktag: /wh/people/ )--> "lori" | /wh/people/wifes_friends/lori/

or store detailed information like

{current page} --( geotag: ) --> "42.39561 -71.13051" | 42.39561, -71.13051

Note that sometimes you have bare strings, but sometimes you have URIs or more complex information. The system still provides the path-of-least-resistance simple tagging setup, but allows you to layer complexity on top of this.
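
To make the subject/verb/object shape concrete, here is a minimal Python sketch of the four kinds of tag shown above. The class `Tag`, its field names, the `is_plain` helper, and the page path are all assumptions made for illustration; this is not how Rm itself stores tags.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Optional typed data on the object side: a site path or a (lat, lon) pair.
Extra = Union[str, Tuple[float, float]]


@dataclass(frozen=True)
class Tag:
    subject: str                   # path of the page being tagged
    verb: str = "tag"              # kind of tag: "tag", "linktag", "geotag"
    verb_path: str = ""            # optional category path, e.g. "/wh/people/"
    text: str = ""                 # the bare string a user types
    extra: Optional[Extra] = None  # optional typed extra information

    @property
    def is_plain(self) -> bool:
        """True for an ordinary tag with no category and no extra data."""
        return self.verb == "tag" and not self.verb_path and self.extra is None


page = "/wh/photos/example/"  # hypothetical page path

tags = [
    Tag(page, text="oren"),
    Tag(page, verb_path="/wh/people/", text="oren"),
    Tag(page, "linktag", "/wh/people/", "lori", "/wh/people/wifes_friends/lori/"),
    Tag(page, "geotag", text="42.39561 -71.13051", extra=(42.39561, -71.13051)),
]

for tag in tags:
    print(tag.verb, tag.verb_path or "-", tag.text, "plain" if tag.is_plain else "rich")
```

Every field except the subject and the text has a default, so the simplest possible tag stays the cheapest one to create.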

So when I go to /wh/people/ it will now give a listing of the various people who I might be talking about and let you select just the pages that deal with a specific person. This can be set up, wiki-style, by the end-user. This is one of those cases where I realized that, if it’s just another page on the site with a specified meaning, it’s much easier to create than trying to write a distinct system optimized for handling these details.
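
One way to picture that listing page is as a filter over the tags by category path. The sketch below is only an assumption about how such a lookup might work; the function `listing` and every page path in it are made up for the example.

```python
from collections import defaultdict

# (page, category path, tag text) triples; all page paths are hypothetical.
tagged = [
    ("/wh/photos/1/", "/wh/people/", "oren"),
    ("/wh/photos/2/", "/wh/people/", "lori"),
    ("/wh/photos/3/", "/wh/people/", "oren"),
    ("/wh/photos/3/", "", "infrared"),
]


def listing(category: str, triples) -> dict[str, list[str]]:
    """Group pages by tag text for every tag filed under one category path."""
    pages = defaultdict(list)
    for page, path, text in triples:
        if path == category:
            pages[text].append(page)
    return dict(pages)


people = listing("/wh/people/", tagged)
for name, pages in people.items():
    print(name, pages)
```

The uncategorized "infrared" tag is left out of the people listing, which is the point of giving the verb a path.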

Now, I implemented the simplest type of tags on this site when I started out. All it would let me do is simple del.icio.us-style tags. And then I implemented more sophisticated tagging later, after I’d rolled it all out already. But this is fine, because part of the necessary feature-set of an ontological folksonomy is that you need to be able to start out easy, with simple tags, and then be able to better define tags down the road…

Things you MUST do

Don’t destroy the existing model

Users are going to be very unhappy if they are FORCED to embrace anything more complicated than regular tags from the start. So you want to make sure that the default, simplest, easiest to get to option is still the addition of plain old tags.

Allow refactoring

What you want is to be able to tag things at the beginning with “cat”, “person”, and “ruby” and be able to later on specify that “cat” and “person” are of the type “photographic subject” and “ruby” is of the type “programming language” without going in and touching every page. Similarly, you might want to change “ruby” to have a link to the Ruby website down the road. So I have a bunch of functionality on the editing interface of my site to enable this.
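
A minimal Python sketch of why this can work without touching every page: pages store only bare tag strings, and everything known about a tag lives in one separate table. The names `page_tags`, `tag_info`, `define`, and `describe`, along with the page paths, are hypothetical and not taken from the site's actual editing interface.

```python
# Pages only store bare tag strings (page paths here are made up).
page_tags = {
    "/wh/photos/1/": ["cat", "person"],
    "/wh/code/notes/": ["ruby"],
}

# Details live in one place, keyed by tag, and start out empty.
tag_info: dict[str, dict[str, str]] = {}


def define(tag: str, **details: str) -> None:
    """Add or update details for a tag without touching any page."""
    tag_info.setdefault(tag, {}).update(details)


def describe(page: str) -> list[tuple[str, dict[str, str]]]:
    """Return each tag on a page with whatever is currently known about it."""
    return [(tag, tag_info.get(tag, {})) for tag in page_tags[page]]


# Later refactoring: give existing tags a type, then add a link to one of them.
define("cat", type="photographic subject")
define("person", type="photographic subject")
define("ruby", type="programming language")
define("ruby", link="https://www.ruby-lang.org/")

print(describe("/wh/code/notes/"))
```

Because `page_tags` is never modified by `define`, a tag can be typed or linked long after it was first applied.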