As some of you who read this site via its RSS feeds have undoubtedly noticed over the last couple of days, those feeds pretty much entered FUBAR territory. Instead of nicely structured XML data, readers got an error complaining that the XML declaration was not at the very start of the document.
This weekend, Tim McGuire saw it and let me know. I cleared the cache and it seemed to go away. Of course, like all good problems, it came back and the old remedy wasn't any match for it.
So, what was the problem, you ask (and lots of people on lots of forums have been asking about this very problem)?
1. If there is any space, extra line or anything at all before the initial <?xml at the beginning of the output, the strict validators will puke on it.
2. Are you serious, J? Yes. An extra empty line in front of the first text is what was breaking it.
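To see how picky the parsers are, here is a minimal sketch that feeds the same tiny document to PHP's SimpleXML twice, once clean and once with a single blank line in front. The helper name parses_as_xml() and the sample feed are made up for illustration, and it assumes the SimpleXML extension is available (it usually is by default).

```php
<?php
// Collect libxml errors quietly instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Illustrative sample only, not a real feed.
$feed = '<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel/></rss>';

// Hypothetical helper: true when the string parses as well-formed XML.
function parses_as_xml(string $xml): bool
{
    libxml_clear_errors();
    return simplexml_load_string($xml) !== false;
}

$clean   = parses_as_xml($feed);               // declaration is the first thing in the string
$broken  = parses_as_xml("\n" . $feed);        // same feed, one blank line in front
$trimmed = parses_as_xml(trim("\n" . $feed));  // blank line stripped off again

var_dump($clean, $broken, $trimmed);
```

The only difference between the first two calls is the leading newline, which is all it takes for the parser to reject the document.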
So, it's just a matter of removing the line and my chest can swell with geeky pride, right? Well, not so fast. See, the empty line wasn't in wp-rss2.php.
Where was it? I still don't know. The general consensus on this bug was that any one of the installed plugins, on any one line (depending on the plugin), could be contributing the extra line. Which, of course, resulted in a gigantic case of
Whiskey
Tango
Foxtrot
Never one to voluntarily take on the drudgery of manually digging through files while the live system is up and running, I whipped up a quick solution that fixes it in the short term, while I try to figure out which one of the 25+ plugins is outputting a single empty line.
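For the hunt itself, something like the following rough sketch might narrow things down. It only checks the two most common culprits: whitespace (or a UTF-8 byte order mark) before the first <?php tag, and extra whitespace after a final ?> tag. The function names and the idea of pointing it at a plugins directory are my own assumptions, and it is a heuristic: it ignores short open tags, and a ?> inside a string or comment will confuse it.

```php
<?php
// Hypothetical helper: lists reasons a PHP source string would emit stray output.
function stray_output_reasons(string $source): array
{
    $reasons = [];

    // A UTF-8 byte order mark is sent to the browser just like whitespace.
    if (strncmp($source, "\xEF\xBB\xBF", 3) === 0) {
        $reasons[] = 'byte order mark before opening tag';
        $source = substr($source, 3);
    }

    // Anything in front of the first opening tag is output as-is.
    $start = strpos($source, '<?php');
    if ($start !== false && $start > 0) {
        $lead = substr($source, 0, $start);
        if (trim($lead) === '') {
            $reasons[] = 'whitespace before opening tag';
        }
    }

    // PHP swallows one newline directly after a closing tag; anything more is output.
    $end = strrpos($source, '?>');
    if ($end !== false) {
        $tail = substr($source, $end + 2);
        $tail = preg_replace('/^(\r\n|\n|\r)/', '', $tail, 1);
        if ($tail !== '' && trim($tail) === '') {
            $reasons[] = 'whitespace after closing tag';
        }
    }

    return $reasons;
}

// Hypothetical driver: walks a directory tree and reports suspicious .php files.
function scan_tree(string $dir): void
{
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->getExtension() !== 'php') {
            continue;
        }
        $reasons = stray_output_reasons(file_get_contents($file->getPathname()));
        if ($reasons) {
            echo $file->getPathname(), ': ', implode(', ', $reasons), "\n";
        }
    }
}
```

You would call scan_tree() with the path to your own plugins directory. A clean report doesn't prove innocence, since a plugin can also echo whitespace from inside its code.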
So, I used output buffering and PHP's trim() function to just strip off extra whitespace before and after the content. The stupid extra line is still there, but is suppressed, making the feed OK for use. In other words, the tumor is still there, but the headache is hidden by aspirin.
If you'd like to put this change in for your own WordPress site while you, too, play "Where's Whitespace" with your code tree, do this:
Add:
ob_start();
as the first line inside the PHP snippet in wordpress/wp-rss2.php (and wp-rss.php, etc. for the different feed types).
Then, at the bottom of the file, add this snippet.
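<?php
// Grab everything the feed template and the plugins have printed so far.
$output = ob_get_contents();
// Empty the buffer without sending anything to the client.
ob_clean();
// Put the content back, minus leading and trailing whitespace.
print(trim($output));
// Send the cleaned-up content on its way.
ob_flush();
?>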
If you're using any plugins that mess with output buffering already or any fancy header control, this may not work, but it did on this site and patched things over while I play oncologist and go digging for the little tumor.
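If you want to convince yourself the buffer-and-trim trick does what it says before touching the feed files, here is a small self-contained sketch of the same idea. The function name render_trimmed() and the fake feed are made up for the demo, and it uses ob_get_clean() as a shorthand for fetching the buffer and discarding it in one step.

```php
<?php
// Hypothetical helper: runs $render, captures everything it prints,
// and returns that output with surrounding whitespace removed.
function render_trimmed(callable $render): string
{
    ob_start();                 // start capturing output
    $render();                  // anything echoed here lands in the buffer
    $output = ob_get_clean();   // fetch the buffer contents and discard the buffer
    return trim($output);
}

// Simulate a feed template with a stray blank line on either side.
$feed = render_trimmed(function () {
    echo "\n";
    echo '<?xml version="1.0"?><rss version="2.0"></rss>';
    echo "\n";
});

echo $feed;
```

The same caveat applies here: if something else is already juggling output buffers, nesting another one may not behave the way you expect.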
While I'd be irritated at the author of whatever plugin is causing the problem, to me every tool along the way should be working against this being a problem in the first place. I understand how putting an unescaped ampersand inside the file would cause an error. But, come on, whitespace inside the tags is ignored, so why is the same whitespace inserted at the beginning enough to make the parsers choke? And why doesn't WordPress already do something similar to my fix, to keep poorly written plugins from breaking the whole thing because someone hit the ENTER key one time too many?
Beyond them, why does PHP implicitly flush the headers at the first sign of whitespace? Why not at least allow you to turn it off? WordPress isn't the only PHP app to run into this problem. Forums are full of people completely clueless as to why they're getting errors, only to discover that one little extra line at the top of a file is breaking their entire script.
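For what it's worth, PHP can tell you where output first escaped, which may help in cases like this. The sketch below wraps the built-in headers_sent() in a helper whose name, describe_output_start(), is my own invention. It only has something to report when no output buffer is holding the content back, so treat it as a diagnostic to try with any buffering switched off, not a guaranteed answer.

```php
<?php
// Hypothetical helper: reports the file and line where output began,
// if PHP has already sent the headers.
function describe_output_start(): string
{
    // headers_sent() fills $file and $line with the place output started.
    if (headers_sent($file, $line)) {
        return "Output started at {$file}:{$line}";
    }
    return 'No output has been sent yet';
}

$report = describe_output_start();
```

Logging $report somewhere (rather than echoing it into the feed) keeps the diagnostic from becoming part of the problem.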