Better Topical Feeds From Technorati

Dec 13, 2006

In addition to the ~600 sites I subscribe to via feed, I also want to keep up to date on quite a few different topics, regardless of which site they're on.

The way that's usually suggested to keep up on this stuff is to just subscribe to the tag's feed, like this one:

http://feeds.technorati.com/feed/posts/tag/linux

Unfortunately, if you look at the contents of that feed, you'll notice that the items are hardly all in the same language. At the moment, I see English, Portuguese, Japanese, German, Hindi, and Spanish. And, that's just in the 20 items it returned. While there probably are people who speak all of those languages, I certainly don't.

Unfortunately, Technorati's tag-driven feeds don't support the filters that the HTML-based tag searches do, and they support even fewer than the full-text search does. However, those full-text search filters CAN be applied to the search-based feeds.

So, we can actually get some pretty decent topical feeds by messing with the regular search and the feeds it generates.

To start with, we go to the regular search page:

http://www.technorati.com/search/linux

At the moment, that brings back 1.3 million posts. I'm pretty good at reading large amounts of information, but that's insane. If you grab the feed for that search (the subscribe link on the right at the top), you'll get:

http://feeds.technorati.com/search/linux

However, given the default of 20 posts, you're never going to see a good set of posts. The noise level in the feed is going to be high and the rollover will be incredible. So, we apply a couple of the search filters: only feeds in English (substitute your own language if you prefer) and only posts with "high authority". Grab the subscription link and we get:

http://feeds.technorati.com/search/linux?language=en&authority=a4

This is now down to 52,000 items. The language code on the URL restricts results to feeds that Technorati believes are in that language. Some feeds tag themselves as English but still publish in other languages, and this won't filter those out. The authority param on the URL takes the following values, from low to high authority: n, a1, a4, a7.
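
If you want to build these filtered feed URLs for a bunch of topics instead of clicking through the search page each time, something like the following Python sketch would do it. The function name and defaults are mine, not anything Technorati provides, and how a multi-word term should be encoded in the path is an assumption (it's percent-encoded here). The authority values are just the four listed above.

```python
from urllib.parse import quote, urlencode

# Authority levels as listed in the post, from low to high.
AUTHORITY_LEVELS = ("n", "a1", "a4", "a7")


def search_feed_url(term, language="en", authority="a4"):
    """Build a filtered search-feed URL in the shape shown above.

    Hypothetical helper: it only assembles the URL and does not check
    that the feed exists or that the service accepts the parameters.
    """
    if authority not in AUTHORITY_LEVELS:
        raise ValueError(f"authority must be one of {AUTHORITY_LEVELS}")
    # Assumption: search terms are percent-encoded in the URL path.
    base = "http://feeds.technorati.com/search/" + quote(term, safe="")
    return base + "?" + urlencode({"language": language, "authority": authority})


print(search_feed_url("linux"))
```

Run as-is, it prints the same URL as the one above, so you can sanity check it against a link you grabbed by hand before trusting it with other topics.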

"Authority" is just how many other sites that Technorati tracks link to that site. For instance, at the moment, this site has 175 inbound links according to Technorati. It's not perfect (there are actually 225 if you ask it about the domain instead of /wordpress), but filtering on it does tend to give you a high-level view of the news on a given topic. The authority filter tends to knock down some of the "echo chamber" effect.

The URL we're now using is pretty simple to whip up and creates a much better topic-based feed. If you're looking to dig further, including paging through results and expanding the 20-item limit to 100 per page, you have to move to the Technorati API, and you'll need a developer key. However, it's fairly quick and you can get even nicer feeds.

Here's the same search, expanded to 100 items (not a live link due to API key):

http://api.technorati.com/search?
key=YOURKEY
&query=linux
&format=rss
&language=en
&authority=a7
&limit=100

If you tack on &start=100, you'll get results 100-200 instead of 1-100. Depending on your feed reader and how it treats repeat items (some won't show a post with the same permalink and others repeat unless it's *exactly* the same), you could put in 2 or 3 feeds on a given topic to get more depth.
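
Here's a rough Python sketch of generating those 2 or 3 paged API URLs for one topic. The parameter names are the ones from the URL above. The function name is made up, and the assumption that the third page uses start=200 is my extrapolation from the start=100 example, not something I've confirmed.

```python
from urllib.parse import urlencode

API_BASE = "http://api.technorati.com/search"


def api_page_urls(key, query, pages=3, limit=100, language="en", authority="a7"):
    """Return one API search URL per page of results (hypothetical helper)."""
    urls = []
    for page in range(pages):
        params = {
            "key": key,
            "query": query,
            "format": "rss",
            "language": language,
            "authority": authority,
            "limit": limit,
        }
        # Assumption: each later page starts at a multiple of the limit,
        # following the start=100 example for the second page.
        start = page * limit
        if start:
            params["start"] = start
        urls.append(API_BASE + "?" + urlencode(params))
    return urls


for url in api_page_urls("YOURKEY", "linux"):
    print(url)
```

Each URL can go into the feed reader as its own subscription, which is the "2 or 3 feeds on a given topic" approach.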

Of course, if you ran it through a PHP script, you could actually aggregate the feeds together, filter out specific feeds (the "linux" search we've been using is heavily dominated by LinuxQuestions.org, which is a fine site, but a bit noisy for my needs), etc.
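
As a sketch of what that script might look like (in Python rather than PHP, and only the merging and filtering part, not the fetching), here's one way to combine several RSS documents, drop items with a permalink you've already seen, and skip hosts you don't want. It assumes plain RSS 2.0 with un-namespaced item, title and link elements. The sample feeds below are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def merge_feeds(rss_documents, blocked_hosts=()):
    """Merge RSS 2.0 documents, dropping repeated links and blocked hosts."""
    blocked = {host.lower() for host in blocked_hosts}
    seen = set()
    merged = []
    for doc in rss_documents:
        root = ET.fromstring(doc)
        for item in root.iter("item"):
            link = (item.findtext("link") or "").strip()
            if not link or link in seen:
                continue  # no permalink, or already collected from another feed
            host = (urlparse(link).hostname or "").lower()
            # Block the host itself and any of its subdomains (www. and so on).
            if host in blocked or any(host.endswith("." + b) for b in blocked):
                continue
            seen.add(link)
            title = (item.findtext("title") or "").strip()
            merged.append({"title": title, "link": link})
    return merged


# Made-up sample data standing in for two pages of search results.
PAGE_ONE = """<rss version="2.0"><channel>
<item><title>Kernel notes</title><link>http://example.com/kernel</link></item>
<item><title>Forum thread</title><link>http://www.linuxquestions.org/q/1</link></item>
</channel></rss>"""

PAGE_TWO = """<rss version="2.0"><channel>
<item><title>Kernel notes</title><link>http://example.com/kernel</link></item>
<item><title>Distro review</title><link>http://example.org/review</link></item>
</channel></rss>"""

for entry in merge_feeds([PAGE_ONE, PAGE_TWO], blocked_hosts=["linuxquestions.org"]):
    print(entry["title"], entry["link"])
```

Deduplicating on the permalink sidesteps the feed reader differences mentioned earlier, since repeats are gone before the reader ever sees them.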

If you're looking at working on niche sites of any sort, this ends up being a pretty quick way to find related news without too much digging.

 


© 2003-2009 J Wynia. All original content is licensed under the terms of the Creative Commons Attribution license unless otherwise noted. Content from other sources is licensed under its original terms.