Social-Filtered Search

Recently, there was a lot of discussion about running searches on Twitter using authority as a filter. The idea is to limit Twitter search results to tweets from users with at least a minimum number of followers. Out of that discussion came some perspectives that I liked:

Frederic Lardinois: I would love to have the option to see results from my own friends (or those who I have communicated with through @replies) bubble up to the top.

Jeremiah Owyang: Organizing Twitter Search by Authority is the wrong attribute. Instead, focus search by your OWN social connections. People you actually know score higher relevancy. http://www.loiclemeur.com/engl…

Robert Scoble: On both services you should see a bias of tweets made by people you’re actually following. Who you are following is a LOT more important than who is following you.

Those ideas make sense to me, because they reflect the way we seek out information. I do think there’s room for search results beyond only your friends. Here’s what I mean:

[Chart: social-filtered search, plotting strength of ties against signals of usefulness]

The idea above can best be described as follows:

I’ll take any quality level of search results for my close connections, but want only the most useful content from distant connections.

The logic behind this is that any quality “deficiencies” in content generated by my close connections can be made up for by reaching out and having a conversation with them. That’s not something I’d do with more distant connections.
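Here is a minimal sketch of that rule in Python. Everything in it is an assumption made for illustration: the four tie levels, the threshold numbers, the `Result` fields and the sample data are hypothetical, and the usefulness score is assumed to be normalized between 0 and 1. The point is only that the bar a result must clear rises as the connection gets more distant.

```python
from dataclasses import dataclass

# Hypothetical tie levels, ordered from closest to most distant.
# The minimum usefulness each level must clear is an illustrative assumption.
MIN_USEFULNESS = {"strong": 0.0, "weak": 0.3, "potential": 0.6, "none": 0.9}
TIE_ORDER = list(MIN_USEFULNESS)


@dataclass
class Result:
    author: str
    text: str
    usefulness: float  # assumed to be normalized to the range 0..1


def social_filter(results, tie_of):
    """Keep a result only if it clears the bar for its author's tie level."""
    kept = []
    for result in results:
        tie = tie_of.get(result.author, "none")  # unknown authors count as "none"
        if result.usefulness >= MIN_USEFULNESS[tie]:
            kept.append((TIE_ORDER.index(tie), -result.usefulness, result))
    # Closest ties first, then the most useful content first within a tie level.
    kept.sort(key=lambda item: (item[0], item[1]))
    return [result for _rank, _neg_score, result in kept]


ties = {"alice": "strong", "bob": "weak"}
results = [
    Result("alice", "Just started my marathon training", 0.05),
    Result("bob", "Going for a run", 0.10),
    Result("carol", "Detailed 18-week marathon plan", 0.95),
    Result("dave", "marathon lol", 0.20),
]
filtered = social_filter(results, ties)
print([r.author for r in filtered])
```

With this sample data, the low-signal tweet from a Strong Tie survives, while content from strangers gets through only when its usefulness score is very high.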

The chart above has two axes: strength of ties and usefulness signals. Let’s run through those.

Strength of Ties

Harvard professor Andrew McAfee blogged about the strength of ties back in 2007. With an eye toward employees inside companies, he segmented our connections as follows:

[Chart: Andrew McAfee’s segmentation of strong, weak and potential ties]

The segmentation works inside companies, and it also applies in the personal world. For example, on FriendFeed, my Favorites List is akin to Strong Ties. The rest of the hundreds of people I follow are my Weak Ties. Friend-of-a-Friend entries I see are my Potential Ties. And of course there are a lot of people I never see. Those would be the “None” Ties.

The hardest part of this segmentation is that people aren’t likely to take the time to create and update their Strong Ties. Rather, Strong Ties should be tracked via implicit signals. Whose content do you click/rate/comment on/bookmark/share/etc.? Extend this out to email – who do you correspond with the most?

For example, I tried out the social search of Delver. It lets you load in your social networks, from places such as Facebook and FriendFeed, and uses content from those connections as your search index. Innovative idea. What happened though is that when I run a search, I get a deluge of content. My social networks are too big to make the service really useful.

Here’s where apps that handle a large percentage of my clicks and interactions will have an advantage. FriendFeed, with an extensive library of content from my connections, has this quality. Inside the enterprise, workers interact with a limited set of applications. The company’s IT department can set up tracking of interactions to identify implicit Strong Ties.

Bottom line: determining Strong Ties via implicit interactions is scalable and useful.
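As a rough sketch of what tracking implicit Strong Ties could look like, the Python below counts interactions per person and treats the most frequent ones as Strong Ties. The log format, the names and the `top_n` cutoff are hypothetical, and raw counts are only one of many ways the signal could be measured.

```python
from collections import Counter

# Hypothetical interaction log: (person whose content I acted on, action).
interactions = [
    ("alice", "comment"), ("alice", "click"), ("alice", "share"),
    ("bob", "click"),
    ("carol", "click"), ("carol", "bookmark"),
]


def strong_ties(log, top_n=2):
    """Return the top_n people I interact with most, by raw interaction count."""
    counts = Counter(person for person, _action in log)
    return [person for person, _count in counts.most_common(top_n)]


print(strong_ties(interactions))
```

Nobody has to curate a list here. The ties fall out of activity the user was already doing.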

Signals of Usefulness

I’ve already described these in the paragraphs above:

  • Clicks
  • Ratings
  • Comments
  • Bookmarking
  • Sharing

Implicit data + explicit signals are the most powerful indication of usefulness.
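One way to fold these signals into a single number is a weighted sum. The sketch below is in Python, and the weights are purely illustrative assumptions (higher-effort actions count for more). No real service is known to use these values.

```python
# Illustrative weights only: explicit, higher-effort actions count for more.
WEIGHTS = {"click": 1, "rating": 2, "comment": 3, "bookmark": 3, "share": 4}


def usefulness(signal_counts):
    """Weighted sum of the signals recorded for one piece of content."""
    return sum(WEIGHTS[name] * count for name, count in signal_counts.items())


# Hypothetical signal counts for two pieces of content.
plan = {"click": 40, "bookmark": 5, "share": 2}
tweet = {"click": 3}
print(usefulness(plan), usefulness(tweet))
```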

Putting These into Place for Social-Filtered Search

When I say that I’d want to receive search results, even without many signals of usefulness, from my Strong Ties, here’s an example.

  1. I’m planning to run a marathon
  2. What marathon training plan should I use?
  3. I run a search for marathon training.
  4. I see a tweet from one of my Strong Ties: “Just started my marathon training this weekend. 4 miles FTW!”
  5. I @reply my Strong Tie, ask what training program he’s using.
  6. I now can leverage someone else’s work on this subject.

Of course, I’d want to see well-rated marathon training programs too, like Pete Pfitzinger’s Advanced Marathoning. I’d want to see the content from my distant/non-existent connections that had the highest signals of usefulness. Not unlike Google’s algorithm.

But the key here is that I’ll make up for any deficiencies in the utility of content for someone I’m close to by contacting them. A search on ‘marathon training’ in Twitter shows a lot of results. But I’m not going to reach out to most of these folks, because I don’t know them. I only want those with whom I can have a conversation.

As I said, the ability to track both implicit and explicit activity is key to making this work. Facebook, FriendFeed, Twitter and Enterprise 2.0 all seem like good candidates for this type of search.

*****

See this post on FriendFeed: http://friendfeed.com/search?q=%22Social-Filtered+Search%22&who=everyone

Improving Search and Discovery: My Explicit Is Your Implicit

Two recent posts offer different takes on the implicit web, and together they provide good context for it. Richard MacManus of ReadWriteWeb asks, Aggregate Knowledge’s Content Discovery – How Good is it, Really? Aggregate Knowledge runs a large-scale wisdom of crowds application, suggesting content for readers of a given article based on what others also viewed. For instance, on the Business Week site, you might be reading an article about the Apple iPod. Next to the article are the articles that readers of the Apple iPod article also viewed. MacManus finds the Aggregate Knowledge recommendations to be not very relevant. The recommended articles had no relationship to Apple or the iPod.

Over at CenterNetworks, Allen Stern writes that Toluu Helps You Like What Your Friends Like. Toluu lets you import your RSS feeds and friends who have also uploaded their RSS feeds. It applies some secret sauce to analyze your friends’ feeds and create recommendations for you. Stern finds the service a bit boring, as all the recommendations based on his friends’ feeds were the same.

In the case of Aggregate Knowledge, the recommendations were based on too wide a pipe. The implicit actions – clicks by everybody – led to irrelevant results because you essentially see the most popular items. In the case of Toluu, the recommendations were based on too narrow a pipe. The common perspectives of like-minded friends meant the recommendations were too homogeneous.

Both of these companies leverage the activities of others to deliver recommendations. The actions of others are the implicit activities used to improve search and discovery. A great, familiar example of applying implicit activities is Google search. Google analyzes links among websites and clicks in response to search results. Those links and clicks are the implicit actions that fuel its search relevance.

Which leads to an important consideration about implicit activities. You need a lot of explicit activity to have implicit activity.

Huh?

That’s right. Implicit activities don’t exist in a vacuum. They start life as the explicit actions of somebody. This is a point that Harvard’s Andrew McAfee makes in a recent post.

Let’s take this thought a step further. Not all explicit actions are created equal. There are those that occur “in-the-flow” and those that occur “above-the-flow”, a smart concept described by Michael Idinopulos. In-the-flow are those actions that are part of the normal course of consumer activities, while above-the-flow takes an extra step by the user. A couple examples describe this further:

  • In-the-flow: clicks, purchases, bookmarks
  • Above-the-flow: tags, links, import of friends

Above-the-flow actions are hard to elicit from consumers. There needs to be something in it for them. Websites that require a majority of above-the-flow actions will find themselves challenged to grow quickly. They better have something really good to offer (such as Amazon.com’s purchase experience). Otherwise, the website should be able to survive on the participation of just a few users to provide value to the majority (e.g. YouTube).

So with all that in mind, let’s look at a few companies with actual or potential uses of the explicit-implicit duality:

Google Search

In an interview with VentureBeat, Google VP Marissa Mayer talks about two different forms of social search:

  1. Users label search results and share labels with friends. This labeling becomes the implicit activity that helps improve search results for others. This model is way too above-the-flow. Labeling? Sharing with friends? After experimenting with this, Mayer states that “overall the annotation model needs to evolve.” Not surprising.
  2. Google looks at your in-the-flow activity of emailing friends (via Gmail). It then marries the search histories of your most frequent email contacts to subtly alter the search result rankings. All of this implicit activity is derived from in-the-flow activities. For searches on specific topics, the more narrow implicit activity pipe of just your Gmail contacts is an interesting idea.

ThisNext

ThisNext is a platform for users to build out their own product recommendations. They find products on the web, grab an image, and rate and write about the product. Power users emerge as style mavens. The site is open to non-members for searching and browsing of products.

ThisNext probably relies a bit too heavily on above-the-flow activities. It takes a lot of work to find products, add them to your list of products and provide reviews. It also suffers from being a bit too wide a pipe in that there’s a lot of people whose recommendations I wouldn’t trust. How do I know who to trust on ThisNext?

Amazon Grapevine

Amazon, on the other hand, has a leg up in this sort of model. First, its recommendations are built on a high level of in-the-flow activities – users purchasing things they need. This is the “people who bought this also bought that” recommendation model. Rather than depend on the product whims of individuals, it uses good ol’ sales numbers (plus some secret sauce as well) for recommendations. This is a form of collaborative filtering.
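To make the “people who bought this also bought that” idea concrete, here is a toy co-purchase counter in Python. It is a sketch of the general technique only, not Amazon’s actual method (which, as noted, includes secret sauce). The order data and the `also_bought` function are hypothetical.

```python
from collections import Counter

# Hypothetical order history: each set is one customer's purchases.
orders = [
    {"running shoes", "gps watch", "energy gel"},
    {"running shoes", "energy gel"},
    {"running shoes", "energy gel"},
    {"yoga mat", "energy gel"},
]


def also_bought(orders, item, top_n=2):
    """Rank other items by how often they share an order with `item`."""
    co_counts = Counter()
    for basket in orders:
        if item in basket:
            co_counts.update(other for other in basket if other != item)
    return co_counts.most_common(top_n)


print(also_bought(orders, "running shoes"))
```

Every input is an in-the-flow purchase. No customer had to tag, rate or review anything for the recommendation to exist.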

Amazon Grapevine is a way of setting the pipe for implicit activities. The explicit activity is the review or rating. These activities are fed to your friends on Facebook. One possibility for Amazon down the road is to use the built-up reviews and ratings of your friends to influence the recommendations it provides on its website. Such a model would require some above-the-flow actions – add the Grapevine application, maintain your account and connections on Facebook. But these aren’t that onerous; the Facebook social network continues to be an explicit activity that has high value for individuals.

Yahoo Search

Yahoo bought the bookmarking and tag service del.icio.us back in 2005. It’s hard to know what, if anything, they’ve done with that service. But one intriguing possibility was hinted at in this TechCrunch post. The del.icio.us activity associated with a given web page is integrated into the search results. Yahoo search results would be ranked not just on links and previous clicks, but also on the number of times the web page had been bookmarked on del.icio.us. And, the tags associated to the website would be displayed, giving additional context to the site and enabling a user to click on the tags to see what other sites share similar characteristics.

This takes an above-the-flow activity performed by a relative few – bookmarking and tagging on del.icio.us – and turns it into implicit activity that helps a larger number of users. But with the Microsoft bid, who knows whether something like this could happen.

The use of implicit activity is a powerful basis to help users find content. Just don’t burden your users with too much of the wrong kind of explicit activity to get there. Two factors to consider in the use of implicit activity:

  1. How wide is the pipe of implicit activities?
  2. How much above-the-flow vs. in-the-flow activity is required?

Search Smackdown: Mahalo – del.icio.us – Google

I was reading the Crowdsourcing vs. Expertsourcing: A Misleading Comparison post over at Mashable. In it, Paul Glazowski analyzes a Newsweek article that suggests the bloom is off the Web 2.0 rose. Too much junk is enabled via everyday people logging on, and there’s a movement for more professional, expert information sourcing.

One example of expertsourcing is Mahalo. Mahalo was started to be a guide to Web content. Paid professionals own a topic, they research a number of sites related to that topic, and post the links that provide the best information. In their opinion, that is.

I’ll admit to some skepticism here. Google has been so good at revealing information and letting me see what’s out there. The idea of limiting my results to what someone deems worthy seems so incomplete. I’m afraid I’d be missing something that’d be really important to me.

But Mahalo has gotten some traction, so there’s something there.

I decided to run my own simple test of Mahalo, pitting it against two other ways to find relevant web content: del.icio.us and Google search. Quick backgrounder on those. del.icio.us is a bookmarking/tagging app that lets you save websites you like, and give them terms that have meaning to you. You can also find content on a given subject by searching tags, and seeing what others have bookmarked. Google is, of course, the preeminent Web search engine.

I tested three separate search terms, going from broad to specific:

  • Running
  • Marathon training
  • Tempo run

My scoring system is simple. For each search term, gold, silver or bronze will be assigned based on my own subjective view.

SEARCH TERM #1: RUNNING

‘Running’ is a fairly broad topic. There are a lot of areas that may apply, making it a challenge to return results that are relevant. With that in mind, let’s see what the three search apps returned.

Mahalo: SILVER

The foundation of Mahalo’s search results is “The Mahalo Top 7”. These are the seven best links for a given topic. It is the Top 7 where expertsourcing proves its value.

The ‘Running’ Top 7 provide links to two running publications and Wikipedia’s entry for running. Another link is to About.com’s page for running, itself a form of expertsourcing. A little uninspired, but a serviceable offering.

Mahalo also has several other sections in its running page. These include health-related topics, oddball sites, web tools and user recommendations. The web tools include MapMyRun.com, which lets you map a run or view others’ running routes. A user recommendation includes LetsRun.com, which is the best site for the competitive runner.

One other thing that’s good. All the links relate to the physical exercise running.

del.icio.us: BRONZE

This search shows both the power and the weakness of bookmark/tagging sites. On the plus side, I love the running results that are returned. Very interesting variety. The downside? A lot of sites that aren’t exercise running-related. Things like “Running a Windows Partition in VMware” and “Internet Explorer 7 running side by side with IE6”. In fact, 26 of the first 50 results were not related to exercise running.

There are interesting sites that del.icio.us users have posted related to running. MapMyRun.com is here. How to Select a Running Shoe by eHow.

Several, but not all, of the Mahalo Top 7 appear in the first 50 del.icio.us results.

Google: GOLD

You can see how Mahalo picked its Top 7 websites…they’re all the top results in Google search! Google also returns the fun stuff in del.icio.us.

Then Google offers a plethora of other sites, and only 6 of the first 50 are not related to exercise running. Pretty much everything on Mahalo is there, plus other interesting sites. A site listing running movies. A company that sells the running skirt! Ultrarunning.

SEARCH TERM #2: MARATHON TRAINING

‘Marathon training’ is not nearly as wide open as ‘running’. This search is for someone who has a goal in mind.

Mahalo: BRONZE

First, let me say that the bronze here is a very strong showing. If there were a photo finish, you’d have a hard time telling Mahalo hadn’t won this test. The presented sites are all good and worthy of consideration for anyone contemplating a marathon.

There are a variety of programs available here: Runners World, Running Times, marathontraining.com, etc. And to Mahalo’s credit, there’s no listing for Galloway’s training program! Editor bias there, I’ll admit.

I was disappointed that Pete Pfitzinger’s program isn’t shown. It’s my own favorite. But I liked the CrunchGear site, listing stuff marathoners would want.

del.icio.us: GOLD

One thing that immediately struck me this time is that all 50 of the del.icio.us results were related to marathon training. The greater specificity helped del.icio.us here. Also, “running” has several meanings, but “marathon” has few.

Several of the Mahalo Top 7 are in the first 50 results. Missing are the Running Times program, the AIDS national training program and the Boston Athletic Association program. But Team in Training is included (if you’re offsetting charity-related programs).

Several other valuable sites are here. For example, there’s McMillan Running, which includes running pace calculators and marathon time prediction workouts.

Unfortunately, Jeff Galloway’s site is bookmarked here. But…Pete Pfitzinger is included as well. Bonus points for that.

Google: SILVER

Google does its usual excellent job in its results. 6 of the Mahalo Top 7 are here; Running Times is missing from the first 50 results. Surprisingly, Team in Training is not in the top 50 results.

Google gets dinged for no race calculator in the first 50 results. No Pete Pfitzinger. But Jeff Galloway is there! Noooo…

SEARCH TERM #3: TEMPO RUN

A tempo run is a specific training technique in which you hold a fast pace over several miles. It’s a tough workout, but it can advance your performance dramatically. Obviously, we’re now in the technical weeds of running.

Mahalo: DISQUALIFIED

Mahalo has no entry for tempo running. We’ve gone too detailed for Mahalo here. DQ’d.

del.icio.us: SILVER

Use of the term “run” again confuses poor del.icio.us here. 34 of the first 50 results are not related to exercise running. But there are several good sites related to the tempo run. Runner’s World has Learn How To Do A Perfect Tempo Run. Running Times has A Tempo Run by Many Other Names.

And this is one of my favorites…a LetsRun.com post/discussion about Tempo run length vs. speed from 2003. One would have to go pretty deep into the LetsRun site to unearth that one. A true credit to the power of social bookmarks & tagging.

Google: GOLD

Incredibly, all of the first 50 results were related to exercise tempo runs. Very impressive. Lots of good info about the tempo run. A LetsRun post/discussion, but different than the one on del.icio.us. Bloggers describing their tempo runs. Formal programs that advise on the pace of the tempo run. Just really good stuff.
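For anyone who wants the off-topic counts from the three tests as percentages, here is a small Python calculation. It uses only the counts reported in this post. Mahalo and the Google ‘marathon training’ search are left out because no off-topic count was given for them. The function name is my own.

```python
# Off-topic counts among the first 50 results, as reported in this post.
OFF_TOPIC = {
    ("del.icio.us", "running"): 26,
    ("Google", "running"): 6,
    ("del.icio.us", "marathon training"): 0,
    ("del.icio.us", "tempo run"): 34,
    ("Google", "tempo run"): 0,
}
TOP_N = 50


def on_topic_share(off_topic, top_n=TOP_N):
    """Fraction of the first top_n results that were on topic."""
    return (top_n - off_topic) / top_n


for (engine, term), off in OFF_TOPIC.items():
    print(f"{engine:12} {term:18} {on_topic_share(off):.0%}")
```

That works out to 48% on-topic for del.icio.us on ‘running’ versus 88% for Google, and 32% versus 100% on ‘tempo run’.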

Recap: Broad, Narrow, Technical

Broad search: Google, Mahalo, del.icio.us
Narrow search: del.icio.us, Google, Mahalo
Technical search: Google, del.icio.us, (Mahalo DQ’d)
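If you want a single overall ranking from the medals, one option is a points tally. The 3/2/1 point values and the zero for a disqualification are my own assumption for this sketch, not part of the scoring system used in the tests.

```python
# Assumed point values; a disqualification scores nothing.
POINTS = {"gold": 3, "silver": 2, "bronze": 1, "dq": 0}

# Medals in test order: running, marathon training, tempo run.
MEDALS = {
    "Mahalo": ["silver", "bronze", "dq"],
    "del.icio.us": ["bronze", "gold", "silver"],
    "Google": ["gold", "silver", "gold"],
}

totals = {engine: sum(POINTS[m] for m in medals) for engine, medals in MEDALS.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals, ranking)
```

Under those assumed points, Google comes out on top with 8, followed by del.icio.us with 6 and Mahalo with 3.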

Conclusions that I draw from this admittedly small, subjective test:

  • Mahalo is a good starting point for finding information on something that’s not familiar to you. It only covers broader, more popular categories. It does appear that the Mahalo expert just skims the top results from Google. But the clean interface and human filtering make it a decent place to start your search.
  • del.icio.us is challenged by results that are not related to the search topic, which is consistent with its user-generated chaotic nature. It’s also a really good place to find hidden nuggets of valuable information not easily found elsewhere. And for a narrow topic with words that do not have multiple meanings, del.icio.us really shines.
  • Google still makes sense as the first place to look. Breadth and depth of results, and it takes on all comers. It also does an exceedingly good job of figuring out what sites relate to a search topic.

One final note in favor of Mahalo. There is research that shows consumers are actually better off with fewer choices than more. Give me 7 good choices, and I’ll be able to begin my journey to learn more about a topic. Give me 50 choices, some great, some terrible, and I’ll be flummoxed as I try to read them all.

Mahalo does have the advantage of providing a simple, limited set of good results to get beginners going. There is value to that.
