Learning By Copying, Conversing, and Interacting

Ryan’s been writing some thought-provoking posts on microformats and related topics of late. In The Self Organized Web he pulled up this (two-year-old) quote from Tim Bray:

RDF has ignored what I consider to be the central lesson of the World Wide Web, the “View Source” lesson. The way the Web grew was, somebody pointed their browser at a URI, were impressed by what they saw, wondered “How’d they do that?”, hit View Source, and figured it out by trial and error.

Early adopters and other techies learn by looking at others’ source code, and I suspect it’s right to say that the web took off because of those of us who looked at someone else’s HTML and said “I can do that.” But it’s innovation, not general content production, that’s being sustained by those people. Increasingly, web content isn’t being produced by hand, or even using HTML editing tools, but through content management systems (from blogging tools on up). In his entry Tim responds to that, preceding comments on the RDF/XML syntax with:

At this point, the RDF evangelists pipe up and say “Well, Ordinary People™ don’t have to look at the source, there will be tools to sort all that out.” Sorry, I just don’t believe that. If, in 1994, you’d needed DreamWeaver or equivalent to write for the Web, there wouldn’t be a Web today.

While that’s undoubtedly true (and I’d agree that RDF/XML is not a nice syntax), I’m not convinced that the fact that one medium grew up without a particular toolset is reason enough to argue against another that requires a toolset. For myself, I learned most of my early HTML and coding skills from examining other people’s work, but I wouldn’t have got very far had I stopped there. Before I came into conversation with other people producing web pages and writing code, my HTML was sloppy and my code made little attempt at abstraction.

The abstraction may have come with time, but regardless of that, as the increasing prominence of Design Patterns illustrates, the formalism that allows tools to play well with others is something that arises once we’ve moved on from that initial learning by imitation to a stage of learning through conversing, and learning through working together.

If the Semantic Web is going to take off, something may well need to be done to ease the transition from copying what works to understanding why it works. Semantic Web toolsets are less forgiving than web browsers have been. When using them it is more important that people have good examples to work from (not an equivalent of the slapdash HTML so many of us first encountered) and understand why things are done within the parameters that they are.

The growth in knowledge of web standards, and increasing concern at all levels for semantically rich XHTML, may be an important part of introducing people to the ways of thinking that RDF requires (or at least the problems it is designed to solve). Microformats, for example, are both a useful tool for structuring XHTML documents and an entry point into a growing realisation of the many facets of producing semantically rich information representations.

In that regard, I’m very glad to see comments such as Danny Ayers’ latest piece on microformats that bridge the sometimes-portrayed-as-polarised camps of upper and lower case semantic web thinking. Perhaps there’s also space for some appropriately publicised articles on the limitations of the flat namespaces that HTML provides. This introduction to RDF, written by Tim Bray and edited by Dan Brickley, is a good start, but there need to be more.

At the same time, those of us building tools bear the responsibility for making it easy to make users’ data semantically rich. It’s not hard (and it’s getting easier) to use XSL or some equivalent to produce multiple representations of a document, given enough metadata. When the representation is for a web browser, microformats provide a good framework for differing types of content. But scraping is not going to be enough to extract the more nuanced metadata that I believe we’ll increasingly rely on, and for that, some variant of RDF is probably our best choice at present.
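
To make the scraping point a little more concrete, here is a minimal sketch in Python using only the standard library. It assumes a hypothetical hCard-style snippet (the person, organisation and URLs are invented), pulls out flat property/value pairs by class name, and then restates them as subject/predicate/object triples. The class names follow the hCard microformat, but building predicate URIs by appending the class name to a vCard namespace is a simplifying assumption of mine, not how any particular toolkit does it.

```python
from html.parser import HTMLParser

# Hypothetical markup: hCard class names, made-up person and URL.
SAMPLE = """
<div class="vcard">
  <a class="url fn" href="http://example.org/alice">Alice Example</a>
  <span class="org">Example Co</span>
</div>
"""

# The handful of hCard properties this sketch knows about.
HCARD_PROPS = {"fn", "org", "url"}

# Assumed namespace for predicates (illustrative only).
VCARD_NS = "http://www.w3.org/2006/vcard/ns#"


class HCardScraper(HTMLParser):
    """Collects (property, value) pairs from elements with hCard class names.

    Deliberately naive: it does not check for an enclosing "vcard" element
    and assumes every start tag has a matching end tag.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open = []  # one list of pending properties per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        props = [c for c in classes if c in HCARD_PROPS]
        # "url" takes its value from the href attribute, not the text.
        if "url" in props and attrs.get("href"):
            self.pairs.append(("url", attrs["href"]))
            props.remove("url")
        self._open.append(props)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._open:
            for prop in self._open[-1]:
                self.pairs.append((prop, text))


def to_triples(subject, pairs):
    """Restate flat pairs as (subject, predicate, object) triples."""
    return [(subject, VCARD_NS + prop, value) for prop, value in pairs]


if __name__ == "__main__":
    scraper = HCardScraper()
    scraper.feed(SAMPLE)
    for triple in to_triples("http://example.org/alice#me", scraper.pairs):
        print(triple)
```

The scraped pairs are only as meaningful as the shared understanding of what fn or org stand for; nothing in the markup itself says which vocabulary they belong to. Once each predicate is a full URI the same data can sit alongside statements from other vocabularies without clashing, which is the kind of nuance that flat class names struggle to carry.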

UPDATE: This piece by Eric Meyer (published a couple of hours after this entry went up) is well worth a look on the topic of microformats and semantics.

2 comments

  1. Oh, all we need is a cool hack or two, and some decent toolkits for the common PHP web developer.

    There’s no real dead simple triplestore / scutter / parser stuff that I can just shut up and use in what I know best, and that’s a major stopping block to me actually building a sem web app.

    Sure there’s an attempt or two with the PEAR RDF stuff, and there’s Redland, but I can’t set it up on my Windows laptop.

    So, my dream semweb development kit:
    Apache
    PHP
    Some kind of triplestore w/SPARQL in place of an RDBMS (or operating on top of one).
    PEAR package that makes it as simple as PEAR::DB…

    $foo = new Collection($url);
    $result = $foo->query('SELECT ?x');

    var_dump($result);

    … but there’s nothing there. Not for someone who’s afraid of linux. Not for someone who doesn’t want to have to deal with unknown terms, I just want it simple.

    *sigh*.

  2. That definitely sounds good. I tend to use Python when I’m working with RDF, but keep meaning to spend some time with RAP, which is used for Wordform. It would be good to see the PEAR RDF classes up and running.

    There’s also been some work of late on Redland for Windows.