Nov 19, 2003


Matt Webb:

There’s some interesting potential in auto-generated weblogs with small human involvement. I wonder if you could use the Bayesian systems that work on figuring out spam email to look at everything being linked on Blogdex, and train it to post only things you would want to post anyway. You could have a more-or-less “original” weblog by training the network to rank higher less-or-more popular links.

This should be built into the RSS aggregator tools; the framework for this kind of functionality is already in FeedDemon. Instead of keyword searching, run Bayesian classification to find the posts you have trained the system to recognize as interesting, and put those in a watch feed. FeedDemon already outputs RSS for that watch list; transform it to Atom or something MetaWeblog-friendly and auto-post it back to the web. Voilà, instant sidebar. (Note: Bayesian classification isn’t just useful for binary decisions like spam/not or interesting/not; as POPFile has demonstrated, it also works for classification into a variety of categories. Have one “interesting” feed published to your public site, another “interesting” feed published behind the firewall. And yeah, I know I’m harping.)
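
To make the multi-category idea concrete, here is a rough sketch of a naive Bayes classifier that sorts feed items into more than two buckets. It is not how FeedDemon or POPFile actually work; the `NaiveBayes` class, the `tokenize` helper, the category names ("public", "internal", "ignore"), and the sample headlines are all made up for illustration, and a real tool would train on far more text than this.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Hypothetical multinomial naive Bayes with add-one smoothing.

    Works for any number of categories, not just a binary split.
    """

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> trained items
        self.vocab = set()                       # every word seen in training

    def train(self, text, category):
        """Record one item the user has filed under `category`."""
        self.doc_counts[category] += 1
        for word in tokenize(text):
            self.word_counts[category][word] += 1
            self.vocab.add(word)

    def classify(self, text):
        """Return the category with the highest log-probability score."""
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            # Prior: how often the user files items under this category.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                count = self.word_counts[category][word]
                # Add-one smoothing so unseen words never zero out a category.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best, best_score = category, score
        return best


# Train on a handful of made-up headlines (illustrative data only).
nb = NaiveBayes()
nb.train("new rss aggregator release with atom support", "public")
nb.train("weblog tools and bayesian spam filtering", "public")
nb.train("internal build server outage postmortem", "internal")
nb.train("quarterly roadmap for the intranet team", "internal")
nb.train("cheap pills lottery winner click now", "ignore")

# Route incoming items into one watch list per category.
incoming = [
    "atom support in an rss aggregator",
    "server outage on the intranet",
    "lottery winner pills",
]
watch_feeds = defaultdict(list)
for title in incoming:
    watch_feeds[nb.classify(title)].append(title)

for category, titles in watch_feeds.items():
    print(category, titles)
```

Each list in `watch_feeds` would then be serialized as its own feed: the "public" one auto-posted to the sidebar, the "internal" one kept behind the firewall, and "ignore" simply dropped.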