Yahoo Keyword Extractor
As part of the RSS aggregator research and experimentation that I'm doing, I've been looking at tons of web service APIs. The Yahoo API includes the regular stuff you'd expect: search APIs, etc. However, it also has something I haven't seen in any of the other APIs: keyword extraction. It takes a chunk of text and gives you the significant keywords from it. Given the popularity and utility of social bookmarking and social tagging, the importance of keyword analysis is hard to overestimate.
As part of my new template for this site, I looked at including the Yahoo keywords as part of the metadata for a posting. It isn't perfect by any means, but it generally does a good job as a first-pass filter to narrow things down. And, for things like search engines, it's as good as many of the keyword meta tags I see. You can see the results on my new template construction zone (warning: it breaks on a regular basis as I work on it). Just mouse over the Yahoo keywords link at the bottom of the posting to see the keywords Yahoo chose.
Here's a quick PHP function I used to make it work. You'll need your own appid from Yahoo to use it. It's also just a quick hack and doesn't properly
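function suggest_keywords($content) {
    $url = "http://api.search.yahoo.com/ContentAnalysisService/V1/termExtraction";
    // Build the POST body with http_build_query() so $content is URL-encoded;
    // a raw "&" or "=" in the text would otherwise corrupt the request.
    $fields = http_build_query(array(
        'appid'   => 'REPLACE_ME_WITH_YOURS',
        'query'   => 'null',
        'context' => $content,
    ));
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
    $result = curl_exec($ch);
    curl_close($ch);
    // curl_exec() returns false on failure; return no keywords in that case.
    if ($result === false) {
        return '';
    }
    // Collect the text of each result element, ignoring tag case and any
    // whitespace between elements.
    preg_match_all('/<result>(.*?)<\/result>/is', $result, $matches);
    $tags = array_map('trim', array_map('strip_tags', $matches[1]));
    // Join the keywords into a comma-separated string.
    return implode(", ", $tags);
}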
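To show how the returned string might end up in a posting's metadata, here is a minimal sketch. The helper name keywords_meta_tag() is hypothetical and not part of the original hack; it assumes the keywords arrive as the comma-separated string that suggest_keywords() returns, and it only escapes them for safe use inside an HTML attribute.

```php
<?php
// Hypothetical helper (an illustration, not part of the original function):
// wrap a comma-separated keyword string in a keywords meta tag.
function keywords_meta_tag($keywords) {
    // Escape quotes and ampersands so the keywords cannot break the attribute.
    $safe = htmlspecialchars($keywords, ENT_QUOTES, 'UTF-8');
    return '<meta name="keywords" content="' . $safe . '" />';
}

// Example with a hard-coded keyword string, so no API call is needed.
$tag = keywords_meta_tag('rss & "feeds", yahoo');
```

In a template this would presumably be called as keywords_meta_tag(suggest_keywords($post_text)), where $post_text stands for whatever variable holds the posting's body.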

February 13th, 2006 at 10:10 pm
[...] Following Kenji's suggestion, I did some research on the Y! API and, with the help of Yahoo!'s own search engine, ended up finding the article Yahoo Keyword Extractor, which provided the basis for creating the new alpha version of the script. [...]
June 20th, 2009 at 3:20 pm
I have been using it for about 1.5 years now and have run into two problems with it: 1) it limits an IP to 5,000 queries per day, which is not a problem for most sites, but for large dynamic ones like mine it is a routine problem; 2) bad error handling when it breaks or times out.
But it's pretty darn effective for high-volume dynamic sites.
Because of the 5k per day limit I researched more keyword APIs and found that MSN has one (though I haven't looked into it yet) and also found this one: http://www.alchemyapi.com/. Its free service level allows up to 10k queries per day; the paid services start at 50k per day and go up. Between Yahoo and Alchemy (and any other free API services) you could string together a series of keyword API services to be used in succession when the previous one hits its max for the day, as I am now doing.
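The idea of using several services in succession could be sketched roughly as follows. This is only an illustration under stated assumptions: keywords_with_fallback() is a hypothetical helper, and each extractor is assumed to be a callable that takes the text and returns a keyword string, or false or an empty string when the service fails or its daily quota is used up. The two closures are stand-ins, not real API clients.

```php
<?php
// Hypothetical helper: try each keyword service in order and return the
// first non-empty result.
function keywords_with_fallback($content, array $extractors) {
    foreach ($extractors as $extract) {
        $keywords = call_user_func($extract, $content);
        // Assumption: false or '' means the service failed or hit its limit.
        if ($keywords !== false && $keywords !== '') {
            return $keywords;
        }
    }
    // Every service failed or was over quota.
    return '';
}

// Stand-in extractors for illustration only.
$over_quota = function ($content) { return false; };
$working    = function ($content) { return 'rss, keywords'; };

$keywords = keywords_with_fallback('some text', array($over_quota, $working));
```

In practice each callable would wrap one service's request code, and a timeout or error response would be mapped to false so the next service gets a turn.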