
Tag Recommendations for Content: Ready to Filter Noise?

In a recent post, I suggested that the semantic web might hold a solution for managing noise in social media. The semantic web can auto-generate tags for content, and these tags can be used to filter out subjects you don’t want to see.

As a follow-up, I wanted to see how four different services perform in terms of recommending tags for different content.

I looked at four services, each of which provides tag recommendations. Here they are, along with each service’s own description of how it approaches tag recommendations:

  • del.icio.us: Popular tags are what other people have tagged this page as, and recommended tags are a combination of tags you have already used and tags that other people have used.
  • Twine: Applies natural language processing and semantic indexing to just that data (via TechCrunch)
  • Diigo: We’ll automatically analyze the page content and recommend suitable tags for you
  • Faviki: Allows you to tag webpages you want to remember with Wikipedia terms.

Twine and Diigo take the initiative and apply tags based on analyzing the content. del.icio.us and Faviki follow a crowdsourced approach, leveraging the previous tag work of members to provide recommendations.
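To make the crowdsourced approach concrete, here is a minimal Python sketch based on the del.icio.us description above. It is not how del.icio.us actually works internally. The `bookmarks` data shape, the function names and the ranking rule (tags you already use first, then everyone else’s) are all my assumptions for illustration.

```python
from collections import Counter

def popular_tags(bookmarks, url, limit=5):
    """Tags other people applied to this URL, most frequent first."""
    counts = Counter(
        tag
        for b in bookmarks
        if b["url"] == url
        for tag in b["tags"]
    )
    # most_common(None) returns every tag, ordered by count.
    return [tag for tag, _ in counts.most_common(limit)]

def recommended_tags(bookmarks, url, user, limit=5):
    """Combine the page's popular tags with tags this user already uses.

    Assumed ranking rule: popular tags the user has used before come
    first, followed by the remaining popular tags.
    """
    popular = popular_tags(bookmarks, url, limit=None)
    own = {tag for b in bookmarks if b["user"] == user for tag in b["tags"]}
    ranked = [t for t in popular if t in own] + [t for t in popular if t not in own]
    return ranked[:limit]

# Hypothetical sample data: two people bookmarked the same page.
bookmarks = [
    {"user": "alice", "url": "http://example.com/twitter", "tags": ["twitter", "techcrunch"]},
    {"user": "bob", "url": "http://example.com/twitter", "tags": ["twitter", "funny"]},
    {"user": "carol", "url": "http://example.com/other", "tags": ["funny", "video"]},
]
```

If nobody has bookmarked a page yet, both functions return an empty list. That is exactly the gap a crowdsourced service has with fresh content.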

Note that Faviki just opened its public beta, so it suffers from a lack of activity around content thus far. That shows in the analysis below.

I ran six articles through the four tagging services:

  1. The Guessing Game Has Begun on the Next iPhone – New York Times
  2. TiVo: The Gossip Girl of DVRs – Robert Seidman’s ‘TV by the Numbers’ blog
  3. Twitter! – TechCrunch
  4. Injury ‘bombshell’ hits Radcliffe – BBC Sport
  5. Why FriendFeed Is Disruptive: There’s Only 24 Hours in a Day – this blog
  6. Antioxidant Users Don’t Live Longer, Analysis Of Studies Concludes – Science Daily

The tag recommendations are below. Headline on the results? Recommendations appear to be a work in progress.

First, the New York Times iPhone article. Twine wins. Handily. Diigo gave it a shot, but the nytimes tags really miss the mark. del.icio.us and Faviki weren’t even in the game.

Next, Robert Seidman’s post about TiVo. Twine comes up with several good tags. Diigo has something relevant. And again, del.icio.us and Faviki weren’t even in the game.

Now we get to the trick article, Michael Arrington’s no-text blog entry, Twitter! The tables turn here. Twine comes up empty for the post. Judging by the post’s presence on Techmeme and its 400+ comments, a lot of people apparently bookmarked it. This gives del.icio.us and Faviki something to work with, as seen below. And Diigo offers the single tag of…twitter!

Switching gears, this is a running-related article covering one of the top athletes in the world, Paula Radcliffe. Twine comes up the best here. Diigo manages “bombshell”…nice. del.icio.us and Faviki come up empty, presumably because no users bookmarked this article. And none of them could come up with tags of “running” or “marathon”.

I figured I’d run one of my own blog posts through this test. The post has been saved to del.icio.us a few times, so I figured there’d be something to work with there. Strangely, Twine comes up empty. Faviki…nuthin’.


Finally, I threw some science at the services. This article says that antioxidants don’t actually deliver what is promised. Twine comes up with a lot of tags, but misses the word “antioxidants”. Diigo only gets antioxidant. And someone must have bookmarked the article on del.icio.us, because it has a tag. Faviki…nada.

Conclusions

Twine clearly has the most advanced tag recommendation engine. It generates a bevy of tags. One thing I noticed between Twine and Diigo:

  • Twine most often draws tags from the content
  • Diigo more often draws tags from the title

Obviously my sample size isn’t statistically significant, but I see that pattern in the above results.
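Here is a rough Python sketch of why the source text matters. It is a naive word-frequency tagger, not what Twine or Diigo actually run (Twine is described above as using natural language processing and semantic indexing, which is far more involved). The stopword list, the `keywords` function and the sample title and body are all made up for illustration.

```python
import re
from collections import Counter

# Tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"the", "and", "said", "before", "can", "again", "for", "with"}

def keywords(text, limit=2):
    """Return the most frequent non-stopword terms in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return [w for w, _ in counts.most_common(limit)]

# Invented example text, not taken from any of the six articles.
title = "Marathon champion withdraws"
body = (
    "The runner cited a foot injury. Doctors said the injury needs rest "
    "before the runner can train again."
)

title_tags = keywords(title)  # tags drawn from the title only
body_tags = keywords(body)    # tags drawn from the content only
```

With this sample, the title yields "marathon" and "champion" while the body yields "runner" and "injury". Same article, different tags, depending on where the engine looks.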

The other thing to note is that Twine and Diigo do a really great job of auto-generating tags. For instance, the antioxidant article has 685 words. Both services were able to pull out only what’s relevant from all those words.

With del.icio.us and Faviki, if someone else hasn’t previously tagged the content, they don’t generate tags. Crowdsourced tagging – free form on del.icio.us, structured per Wikipedia on Faviki – still has a lot of value though. Nothing like human eyes assessing what an article is about. Faviki will get better with time and activity.

Note that both Twine and Diigo allow manually entered tags as well, getting the best of both auto-generated and human-generated.

When it comes to using tags as a way to filter noise in social media, both system- and human-generated tags will be needed.

  • System-generated tags ensure some level of tagging for most new content. This is important in an app like FriendFeed, where new content is constantly streaming in.
  • Human-generated tags pick up where the system leaves off. In the Paula Radcliffe example above, I’d expect people to use common sense tags like “running” and “marathon”.
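As a sketch of how the two kinds of tags could work together for noise control, the Python below merges system- and human-generated tags on each item and drops anything carrying a muted tag. The item fields (`system_tags`, `human_tags`) and the function names are hypothetical; no service mentioned here exposes this exact structure as far as I know.

```python
def merged_tags(item):
    """Union of system- and human-generated tags, lowercased for matching."""
    system = {t.lower() for t in item.get("system_tags", [])}
    human = {t.lower() for t in item.get("human_tags", [])}
    return system | human

def filter_feed(items, muted):
    """Keep only the items that carry none of the muted tags."""
    muted = {t.lower() for t in muted}
    return [item for item in items if not (merged_tags(item) & muted)]

# Hypothetical feed items for illustration.
feed = [
    {"title": "Next iPhone rumors", "system_tags": ["iPhone", "Apple"], "human_tags": []},
    {"title": "Radcliffe injury", "system_tags": ["injury"], "human_tags": ["running", "marathon"]},
    {"title": "Untagged post"},
]

quiet_feed = filter_feed(feed, {"marathon"})
```

Muting “marathon” only works here because a person added that tag; the system tags alone would have let the Radcliffe item through. Note too that the untagged post always gets through, which is why some baseline of system-generated tags matters.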

The results of this simple test show the promise of tagging, and where the work lies ahead to create a robust semantic tagging system that could be used for noise control.

*****

See this item on FriendFeed: http://friendfeed.com/search?q=%22Tag+Recommendations+for+Content%3A+Ready+to+Filter+Noise%3F%22&public=1

Explosion of Blog Aggregators…How to Keep Up?

I don’t know about you, but I’ve seen the names of a number of aggregation sites out there. It’s a very popular space, and I have not really understood who they were or what made them tick. But my growing enjoyment of FriendFeed made me wonder about what these other sites are up to. So I put together a high level survey of several of them.

There’s a really long table below. Before that, a few notes are in order.

Selected apps: This is by no means an exhaustive list. For instance, I just got into Yokway today, but haven’t had a chance to try it out. I just came up with a list from the serendipitous finds I’ve had. I also focused on earlier stage companies – no Digg, del.icio.us or StumbleUpon.

How stuff gets in there: There are three ways that blog posts and news articles are added to these aggregation sites:

  • Submit: Users add a specific web page to the site, often via a toolbar ‘add’ button.
  • RSS share: Google Reader lets you ‘share’ an item in your RSS feeds that you like, posting it to your publicly accessible ‘shared items’ page, which is tracked by an aggregation site
  • RSS feed: The aggregation site takes a feed of all posts from a blog or news site

What’s interesting: Every site has its own secret sauce for what makes it tick. I tried to find things that seemed to set each site apart from the others.

Experience: I rate the user experience of these sites based on how much effort was required to use them effectively. In this earlier blog post, I describe examples of light and heavy user experiences. Generally, lighter is better, but heavy can be OK for really good, distinctive features.

The point of this chart: It’s not to praise or bury any of these apps. Just to put together a list of what’s out there. If you’re an information seeker, a writer or seeking social connections with like-minded people, then you should check out some of these sites.

After the chart, I include links to other blogs with more information, plus a few thoughts as well.

Quick thoughts in dot…dot…dot fashion:

Diigo’s people matching based on common bookmarks and tags is a really cool idea; it reminds me of Toluu’s matching based on common blog subscriptions…LinkRiver and Reddit have a very similar philosophy, with Reddit deploying a lot more categorization than LinkRiver…ReadBurner and RSS Meme are also very similar…Shyftr may have a light experience, but I’ll admit I found the overall user experience confusing right now (they’re in beta, it will improve)…Twine’s automatically generated tags for different categories were really interesting, need to explore that more…no notes on FriendFeed, just click ‘FriendFeed’ in my tag cloud for information about it…I kind of like getting my daily Social Median emails with news updates…Blog Rize has a spare UI, but it is strangely compelling…luckily, none of my blog posts have received the ‘lame’ or ‘facts wrong’ ratings on Blog Rize…

Wrapping up, here are some blog posts to get you started on the various apps:

I may be posting about some of these sites in the days to come.

*****

See this item on FriendFeed: http://friendfeed.com/e/9bdd0ad9-a377-f65d-6140-8dc4e835c6c3