I’ve been rethinking a few aspects of XML_Feed_Parser following some discussion around the web, summarised in this post from Sam Ruby. Numerous aggregators appear vulnerable to attacks based on malicious HTML in the body of comments, and that includes any built on XML_Feed_Parser that do not do their own HTML filtering or output escaping.

There was a brief discussion of the issue on the PEAR mailing list and I’ve decided to change the package’s default behaviour. In the spirit of PEAR, I’m going to make use of HTML_Safe to process any HTML or text content in the feed before returning it. There will be extra methods to access the raw content, but that will be a deliberate extra step, so that people know they’re potentially getting dangerous content.

HTML_Safe is currently in beta, but the developers tell me there will be a stable release within the next few weeks. That means XML_Feed_Parser won’t be stable until HTML_Safe is stable, but I think that’s worthwhile in the long run, as it’ll lead to more secure applications.