On the semantic web, RDF, and other such things

It’s coming up in both Rm and Photohub, and I’ve got some rants about it, so there’s a few entries on tap about the Semantic Web and RDF and the like. So, let this blog entry serve as a brief introduction where I at least define some of the terms.

The Semantic web is a huge, massive, colossal effort of a bunch of web folks to try and turn the existing framework of the web into something more “Semantically enabled”, something that will let us write truly intelligent software that will do amazing things.

Kind of like AI was… and then Machine Translation before that… and… ehrm… you get the idea.

The way I see things, there’s two parts to the Semantic Web. One sizable chunk of the effort is the same old crap that people have been trying to do for years and years and years: Make machines that can think on their own.

The machine translation and AI efforts did actually produce results, mind you. Generally, what happens in these sorts of cases is that people “wash” the failed terms off a technology and then put it to use. If you think about it, Google’s current search engine is the direct result of early machine translation and AI research… it’s just that nobody’s trying to claim that it can think.

There’s a second part to the Semantic Web effort, however, which is that it will produce pieces that go on to be useful even if the root effort never delivers what it promised. I have come to the decided conclusion that the basic, raw notion of an RDF triple is a useful tool for discussing the structure of the modern Web.

Notice I said RDF triple, however. I didn’t say RDF.

The concept behind RDF

RDF is, once you strip things down, a very hard way to say this thing:

Subject -- Verb --> Object

So you can think of a link as the verb “links to” where the subject is the page you are on and the object is the page the link goes to. Of course, there’s layers upon layers of stuff on top of this that tries to machine-automate useful operations based on it, but I’m not going to bother explaining it.
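
To make the triple idea concrete, here is a minimal Python sketch of a link as a Subject -- Verb --> Object statement. The names Triple, link_triple and describe, and the example.org URLs, are made up for illustration; this is not any RDF library’s API. (RDF itself calls the verb the “predicate”.)

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject -- verb --> object (hypothetical helper type)."""
    subject: str  # the thing the statement is about
    verb: str     # RDF's own name for this slot is "predicate"
    obj: str      # the thing the subject is related to


def link_triple(from_page: str, to_page: str) -> Triple:
    """Express a hyperlink as a 'links to' statement."""
    return Triple(from_page, "links to", to_page)


def describe(t: Triple) -> str:
    """Render a triple in the arrow notation used above."""
    return f"{t.subject} -- {t.verb} --> {t.obj}"


# Placeholder URLs, not real pages.
t = link_triple("http://example.org/a", "http://example.org/b")
print(describe(t))
```

Everything else in the RDF stack is, loosely speaking, machinery for storing, naming, and reasoning over piles of statements shaped like this one.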

Semantic models

I’m using this term to describe a piece of software that appears to do something useful on a bunch of data. Something more complicated than “give me a list of all pages that contain the word ‘porno’”.

The Semantic Web tries to enable the creation of these. If you think about it, Google is really just a semantic model, just one that doesn’t need RDF to work. It sucks up almost all of the pages on the net, runs some operations on them to get more information out of each page than a naive textual search would, and produces seemingly enhanced results.
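
As a toy illustration of the difference between a naive textual search and something a bit smarter, the sketch below ranks matching pages by how many other pages link to them. The page names and data are invented, and counting inbound links is only a stand-in for “doing some operations on the pages”; it is not a description of how Google actually works.

```python
from collections import Counter

# Hypothetical (subject, verb, object) statements about made-up pages.
triples = [
    ("page-a", "links to", "page-c"),
    ("page-b", "links to", "page-c"),
    ("page-c", "links to", "page-a"),
    ("page-a", "contains word", "rdf"),
    ("page-b", "contains word", "rdf"),
    ("page-c", "contains word", "rdf"),
]


def naive_search(word, triples):
    """Every page containing the word, in alphabetical order."""
    return sorted(s for s, v, o in triples if v == "contains word" and o == word)


def ranked_search(word, triples):
    """Same hits, but pages with more inbound links come first."""
    inbound = Counter(o for s, v, o in triples if v == "links to")
    hits = naive_search(word, triples)
    # Sort by inbound link count (descending), then by name for stable ties.
    return sorted(hits, key=lambda page: (-inbound[page], page))


print(naive_search("rdf", triples))
print(ranked_search("rdf", triples))
```

Both functions return the same set of pages; the second one just uses the “links to” statements to decide which page deserves to be on top.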


Cory Doctorow has explained this better than I.


When I use the term “AI”, I usually mean it. I do not mean something that appears to be intelligent. I mean something that actually is intelligent in an impressive Turing-test-passing sort of way.

The problem is that AI works very well in science fiction. It lets you take some part of the shared human experience… emotion being a popular choice… and remove it, giving you an opportunity to pose interesting questions, or at least bore your reader with questions that were original when authors first raised them in the 60s, even though they had pretty much run out of things to say about them by the end of the 80s. So it’s popular in the fiction world, but the reality hasn’t happened yet. And none of the problems are evident until you’ve spent a bunch of time trying to do it. And even then, sometimes, you don’t quite get it.

It’s a subtle sort of brainbug where you think that because you can conceptualize it easily in science fiction, it must be possible.

Ontological Folksonomy

This is my current information architecture obsession.

A Folksonomy is the current nature of the tag-based web. You apply tags to content on the web with the express goal of enabling other people to find it. It’s not what information science folks call a “controlled vocabulary” because I may tag my picture of a naked woman as a “nude” while you are probably going to tag it as a “hot nekkid babe”. And there’s the guy who tags his flower picture as “nude” because he wants to get more hits on it.
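
Here is a rough Python sketch of why an uncontrolled vocabulary gets messy, using the tags from the example above. The user names, file names, and the build_index and find helpers are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical (user, item, tag) records; nobody agreed on a vocabulary.
taggings = [
    ("me", "figure-study.jpg", "nude"),
    ("you", "figure-study.jpg", "hot nekkid babe"),
    ("hit-seeker", "flower.jpg", "nude"),
]


def build_index(taggings):
    """Map each tag (lowercased) to the set of items carrying it."""
    index = defaultdict(set)
    for _user, item, tag in taggings:
        index[tag.lower()].add(item)
    return index


def find(tag, index):
    """Items tagged with exactly this tag, sorted for stable output."""
    return sorted(index.get(tag.lower(), set()))


index = build_index(taggings)
print(find("nude", index))             # picks up the flower too
print(find("hot nekkid babe", index))  # same picture, different tag
```

The index has no idea that the two tags on the first picture mean the same thing, or that the tag on the flower is bogus. That missing “meaning” is exactly the gap.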

Anyway, I’m pretty sure there’s a way to build upon the success of folksonomies to add some “meaning” to tags, which I will write further about in future entries…