Data Mining, Giant Piles of Data and Amazon
First, I believe that the key to really good contextual filtering of content lies in combining the folksonomies that people are coming to know and love with data mining huge piles of data. One of the biggest problems in trying to run experiments on that theory is that you've got to have huge piles of data to dig through. That usually means gathering it yourself, which usually means it never gets done. Enter Amazon.
Amazon is going to grant access to a bunch of their raw data for web searching. While not free, the fees are reasonable and "indexed" for future increases. For instance, the fees are "per GB", "per hour of CPU time", etc. And because the prices are set rather than "call us", it opens up possibilities for those of us who want to work with this but don't have the money to, say, ask Yahoo to let us run 100,000 or 1 million queries a day instead of 5,000.
Very cool and I'm very excited. And, FYI, if the possibility of data mining huge piles of data makes you grin from ear to ear too . . . you're definitely a giant geek, like me.
