Whatever happened to the feed parser?

It’s been quite some time since I last wrote about the feed parser I’ve been working on. I had a piece prepared a couple of weeks ago, but before I could post it a discussion arose on pear-dev about possible feed parser projects, so I waited to see what came of it. That conversation has now died down, so it feels like time for a summary of progress to date.

Since I last wrote, I have rewritten the package from scratch with a new design. Parsing all feed types in one class while maintaining appropriate strictness was quickly becoming too complex, so I’ve split the design into a general class that manages the interface and a specific class for each feed type.
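
To make the split concrete, here is a rough sketch of the idea. It is written in Python purely for illustration (the package itself is PHP), and the class and method names (Feed, AtomFeed, RSS2Feed, title) are made up for this example, not the package's actual API. The general class only detects the feed type and hands off to the class that knows that format.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


class AtomFeed:
    """Type-specific class (hypothetical name): knows only Atom 1.0 structure."""

    def __init__(self, root):
        self.root = root

    def title(self):
        return self.root.findtext(f"{{{ATOM_NS}}}title")


class RSS2Feed:
    """Type-specific class (hypothetical name): knows only RSS 2.0 structure."""

    def __init__(self, root):
        self.root = root

    def title(self):
        return self.root.findtext("channel/title")


class Feed:
    """General class: detects the feed type and delegates to the right class."""

    def __init__(self, xml_text):
        root = ET.fromstring(xml_text)
        if root.tag == f"{{{ATOM_NS}}}feed":
            self.impl = AtomFeed(root)
        elif root.tag == "rss" and root.get("version") == "2.0":
            self.impl = RSS2Feed(root)
        else:
            # Staying strict: refuse anything we do not explicitly support.
            raise ValueError(f"unsupported feed type: {root.tag}")

    def title(self):
        # The public interface lives here; the format details do not.
        return self.impl.title()
```

Each type-specific class can then be as strict as its own spec demands without the others getting in the way.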

So far I’ve implemented support for Atom 1.0 and RSS 2.0, with some mappings so that a few common elements behave the same way in both types. In general, I’m using Atom as the normative feed type. Atom 0.3 is handled, but a warning is raised and a few features (notably getEntryById) won’t work because of the changed namespace.
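
As a rough illustration of what such a mapping might look like, here is a small Python sketch (again, not the package's PHP code). The mapping table, the entry_value helper and this get_entry_by_id are assumptions for the example; the particular Atom-to-RSS pairings are my guess at sensible equivalents. It also shows one plausible reading of the Atom 0.3 problem: a lookup written against the Atom 1.0 namespace simply finds nothing in a 0.3 document, which uses the older namespace.

```python
import warnings
import xml.etree.ElementTree as ET

ATOM_10 = "http://www.w3.org/2005/Atom"
ATOM_03 = "http://purl.org/atom/ns#"

# Hypothetical mapping: Atom element name -> RSS 2.0 <item> child.
RSS_ITEM_MAP = {
    "id": "guid",
    "title": "title",
    "summary": "description",
    "published": "pubDate",
}


def entry_value(entry, name):
    """Read a common element from an entry, addressed by its Atom name."""
    if entry.tag == "item":
        # RSS 2.0: translate the Atom name to the RSS element.
        return entry.findtext(RSS_ITEM_MAP[name])
    return entry.findtext(f"{{{ATOM_10}}}{name}")


def get_entry_by_id(root, entry_id):
    """Look up an Atom entry by id, querying the Atom 1.0 namespace only."""
    if root.tag == f"{{{ATOM_03}}}feed":
        warnings.warn("Atom 0.3 feed: namespace-dependent lookups will not match")
    for entry in root.iter(f"{{{ATOM_10}}}entry"):
        if entry.findtext(f"{{{ATOM_10}}}id") == entry_id:
            return entry
    return None
```

With this arrangement, entry_value(item, "summary") reads the description of an RSS 2.0 item, while get_entry_by_id on an Atom 0.3 document warns and returns None.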

The discussion on pear-dev focussed on two key questions. The first was whether a generalised feed package (reading and writing) was preferable to single-purpose packages; the second was how comprehensive support for different formats should be. I’m increasingly convinced that separate packages for parsing and creating feeds are preferable. Many of us use templating systems to generate feeds and so would not need the added overhead of feed writing classes, and focussing on one task keeps each package lean.

The more time I’ve spent with the RSS 2.0 spec, the more convinced I am that it is important not to lose sight of the specifics of the different formats in the name of abstraction. The Atom spec is far clearer than RSS 2.0 and provides for much better structured information (imho). It would be a great shame to lose any of that. As far as possible I’ve focussed on providing a consistent surface for access to common elements, without denying access to more specific and detailed content.

There’s still quite a bit to do, but I’m beginning to use this code in my regular development, so progress may get a little faster. A tarball of the current version is available here.
