Social-Filtered Search

Recently, there was a lot of discussion about running searches on Twitter using authority as a filter. The idea is to limit Twitter search results to tweets from users with a minimum number of followers. From that discussion, I saw some perspectives that I liked:

Frederic Lardinois: I would love to have the option to see results from my own friends (or those who I have communicated with through @replies) bubble up to the top.

Jeremiah Owyang: Organizing Twitter Search by Authority is the wrong attribute. Instead, focus search by your OWN social connections. People you actually know score higher relevancy. http://www.loiclemeur.com/engl…

Robert Scoble: On both services you should see a bias of tweets made by people you’re actually following. Who you are following is a LOT more important than who is following you.

Those ideas make sense to me, because they reflect the way we seek out information. I do think there’s room for search results beyond only your friends. Here’s what I mean:

[Chart: social-filtered search, plotting strength of ties against usefulness signals]

The idea above can best be described as follows:

I’ll take any quality level of search results for my close connections, but want only the most useful content from distant connections.

The logic behind this is that any quality “deficiencies” in content generated by my close connections can be made up for by reaching out and having a conversation with them. That’s not something I’d do with more distant connections.

The chart above has two axes: strength of ties and usefulness signals. Let’s run through those.

Strength of Ties

Harvard professor Andrew McAfee blogged about the strength of ties back in 2007. With an eye toward employees inside companies, he segmented our connections as follows:

[Figure: McAfee’s segmentation of strong, weak and potential ties]

The segmentation works inside companies, and it also applies in the personal world. For example, on FriendFeed, my Favorites List is akin to Strong Ties. The rest of the hundreds of people I follow are my Weak Ties. Friend-of-a-Friend entries I see are my Potential Ties. And of course there are a lot of people I never see. Those would be the “None” Ties.

The hardest part of this segmentation is that people aren’t likely to take the time to create and update their Strong Ties. Rather, Strong Ties should be tracked via implicit signals. Whose content do you click/rate/comment on/bookmark/share/etc.? Extend this out to email – who do you correspond with the most?

For example, I tried out the social search of Delver. It lets you load in your social networks, from places such as Facebook and FriendFeed, and uses content from those connections as your search index. Innovative idea. What happened, though, is that when I ran a search, I got a deluge of content. My social networks are too big to make the service really useful.

Here’s where apps that handle a large percentage of my clicks and interactions will have an advantage. FriendFeed, with an extensive library of content from my connections, has this quality. Inside the enterprise, workers interact with a limited set of applications. The company’s IT department can set up tracking of interactions to identify implicit Strong Ties.

Bottom line: determining Strong Ties via implicit interactions is scalable and useful.
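
Here’s a minimal sketch of what tracking implicit Strong Ties could look like. It is only an illustration: the interaction log, the `infer_strong_ties` function and the `top_n` and `min_count` cutoffs are all hypothetical, not something any of the services above are known to use.

```python
from collections import Counter

def infer_strong_ties(interactions, top_n=3, min_count=2):
    """Return the people I interact with most often.

    interactions: iterable of (person, action) pairs, e.g. ("alice", "comment").
    top_n and min_count are illustrative cutoffs (assumptions).
    """
    counts = Counter(person for person, _action in interactions)
    return [person for person, n in counts.most_common(top_n) if n >= min_count]

# Hypothetical log of my clicks, comments, shares and bookmarks.
log = [
    ("alice", "click"), ("alice", "comment"), ("alice", "share"),
    ("bob", "click"), ("bob", "bookmark"),
    ("carol", "click"),
]
strong_ties = infer_strong_ties(log)
print(strong_ties)  # ['alice', 'bob']
```

Nobody has to maintain a list by hand here. The Strong Ties fall out of activity the app already sees.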

Signals of Usefulness

I’ve already described these in the paragraphs above:

  • Clicks
  • Ratings
  • Comments
  • Bookmarking
  • Sharing

Implicit data + explicit signals are the most powerful indication of usefulness.
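
One simple way to combine these signals is a weighted sum. The sketch below assumes made-up weights in which explicit actions count for more than a passive click. The weights and the `usefulness_score` name are my own assumptions for illustration.

```python
# Hypothetical weights: explicit actions count for more than a click.
SIGNAL_WEIGHTS = {
    "clicks": 1,
    "ratings": 3,
    "comments": 3,
    "bookmarks": 4,
    "shares": 5,
}

def usefulness_score(signals):
    """Weighted sum of signal counts; unknown signal names are ignored."""
    return sum(SIGNAL_WEIGHTS.get(name, 0) * count
               for name, count in signals.items())

item = {"clicks": 10, "ratings": 2, "comments": 1, "bookmarks": 0, "shares": 1}
print(usefulness_score(item))  # 10*1 + 2*3 + 1*3 + 0*4 + 1*5 = 24
```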

Putting These into Place for Social-Filtered Search

When I say that I’d want to receive search results, even without many signals of usefulness, from my Strong Ties, here’s an example.

  1. I’m planning to run a marathon
  2. What marathon training plan should I use?
  3. I run a search for marathon training.
  4. I see a tweet from one of my Strong Ties: “Just started my marathon training this weekend. 4 miles FTW!”
  5. I @reply my Strong Tie, ask what training program he’s using.
  6. I now can leverage someone else’s work on this subject.

Of course, I’d want to see well-rated marathon training programs too, like Pete Pfitzinger’s Advanced Marathoning. I’d want to see the content from my distant/non-existent connections that had the highest signals of usefulness. Not unlike Google’s algorithm.

But the key here is that I’ll make up for any deficiencies in the utility of content from someone I’m close to by contacting them. A search on ‘marathon training’ in Twitter shows a lot of results. But I’m not going to reach out to most of these folks, because I don’t know them. I only want those with whom I can have a conversation.

As I said, the ability to track both implicit and explicit activity is key to making this work. Facebook, FriendFeed, Twitter and Enterprise 2.0 all seem like good candidates for this type of search.
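
The rule described in this post (anything from close connections, only the most useful content from distant ones) can be sketched as a filter whose usefulness bar rises as the tie gets weaker. Everything here is an assumption for illustration: the threshold numbers, the `Result` class and the `social_filter` function are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical minimum usefulness score required for each tie strength.
# Strong ties pass with any score; distant authors need a high one.
MIN_USEFULNESS = {"strong": 0, "weak": 5, "potential": 20, "none": 50}

@dataclass
class Result:
    author: str
    text: str
    usefulness: int  # e.g. a weighted count of clicks, ratings, shares

def social_filter(results, ties):
    """Keep results whose usefulness clears the bar for the author's tie."""
    kept = []
    for r in results:
        tie = ties.get(r.author, "none")  # unknown authors are "none" ties
        if r.usefulness >= MIN_USEFULNESS[tie]:
            kept.append(r)
    return kept

ties = {"sam": "strong", "lee": "weak"}
results = [
    Result("sam", "Just started my marathon training this weekend.", 0),
    Result("lee", "Thinking about a marathon someday.", 2),
    Result("stranger1", "Well-rated 18-week marathon plan.", 80),
    Result("stranger2", "marathon training lol", 1),
]
for r in social_filter(results, ties):
    print(r.author, "-", r.text)
```

In this made-up data, the tweet from my Strong Tie gets through with no usefulness signals at all, and the only stranger who gets through is the one with a well-rated training plan.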

*****

See this post on FriendFeed: http://friendfeed.com/search?q=%22Social-Filtered+Search%22&who=everyone

Search Smackdown: Mahalo – del.icio.us – Google

I was reading the Crowdsourcing vs. Expertsourcing: A Misleading Comparison post over at Mashable. In it, Paul Glazowski analyzes a Newsweek article that suggests the bloom is off the Web 2.0 rose. Too much junk is enabled via everyday people logging on, and there’s a movement for more professional, expert information sourcing.

One example of expertsourcing is Mahalo. Mahalo was started to be a guide to Web content. Paid professionals own a topic, they research a number of sites related to that topic, and post the links that provide the best information. In their opinion, that is.

I’ll admit to some skepticism here. Google has been so good at revealing information and letting me see what’s out there. The idea of limiting my results to what someone deems worthy seems so incomplete. I’m afraid I’d be missing something that’d be really important to me.

But Mahalo has gotten some traction, so there’s something there.

I decided to run my own simple test of Mahalo, pitting it against two other ways to find relevant web content: del.icio.us and Google search. Quick backgrounder on those. del.icio.us is a bookmarking/tagging app that lets you save websites you like, and give them terms that have meaning to you. You can also find content on a given subject by searching tags, and seeing what others have bookmarked. Google is, of course, the preeminent Web search engine.

I tested three separate search terms, going from broad to specific:

  • Running
  • Marathon training
  • Tempo run

My scoring system is simple. For each search term, gold, silver or bronze will be assigned based on my own subjective view.

SEARCH TERM #1: RUNNING

‘Running’ is a fairly broad topic. There are a lot of areas that may apply, making it a challenge to return results that are relevant. With that in mind, let’s see what the three search apps returned.

Mahalo: SILVER

The foundation of Mahalo’s search results is “The Mahalo Top 7”. These are the seven best links for a given topic. It is the Top 7 where expertsourcing proves its value.

The ‘Running’ Top 7 provides links to two running publications and Wikipedia’s entry for running. Another link is to About.com’s page for running, itself a form of expertsourcing. A little uninspired, but a serviceable offering.

Mahalo also has several other sections in its running page. These include health-related topics, oddball sites, web tools and user recommendations. The web tools include MapMyRun.com, which lets you map a run or view others’ running routes. A user recommendation includes LetsRun.com, which is the best site for the competitive runner.

One other thing that’s good. All the links relate to the physical exercise running.

del.icio.us: BRONZE

This search shows both the power and the weakness of bookmark/tagging sites. On the plus side, I love the running results that are returned. Very interesting variety. The downside? A lot of sites that aren’t exercise running-related. Things like “Running a Windows Partition in VMware” and “Internet Explorer 7 running side by side with IE6”. In fact, 26 of the first 50 results were not related to exercise running.

There are interesting sites that del.icio.us users have posted related to running. MapMyRun.com is here. How to Select a Running Shoe by eHow.

Several, but not all, of the Mahalo Top 7 appear in the first 50 del.icio.us results.

Google: GOLD

You can see how Mahalo picked its Top 7 websites…they’re all the top results in Google search! Google also returns the fun stuff in del.icio.us.

Then Google offers a plethora of other sites, and only 6 of the first 50 are not related to exercise running. Pretty much everything on Mahalo is there, plus other interesting sites. A site listing running movies. A company that sells the running skirt! Ultrarunning.

SEARCH TERM #2: MARATHON TRAINING

‘Marathon training’ is not nearly as wide open as ‘running’. This search is for someone who has a goal in mind.

Mahalo: BRONZE

First, let me say that the bronze here is a very strong showing. If there were a photo finish, you’d have a hard time telling Mahalo hadn’t won this test. The presented sites are all good and worthy of consideration for anyone contemplating a marathon.

There are a variety of programs available here: Runners World, Running Times, marathontraining.com, etc. And to Mahalo’s credit, there’s no listing for Galloway’s training program! Editor bias there, I’ll admit.

I was disappointed that Pete Pfitzinger’s program isn’t shown. It’s my own favorite. But I liked the CrunchGear site, listing stuff marathoners would want.

del.icio.us: GOLD

One thing that immediately struck me this time is that all 50 of the del.icio.us results were related to marathon training. The greater specificity helped del.icio.us here. Also, “running” has several meanings, but “marathon” has few.

Several of the Mahalo Top 7 are in the first 50 results. Missing are the Running Times program, the AIDS national training program and the Boston Athletic Association program. But Team in Training is included (if you’re offsetting charity-related programs).

Several other valuable sites are here. For example, there’s McMillan Running, which includes running pace calculators and marathon time prediction workouts.

Unfortunately, Jeff Galloway’s site is bookmarked here. But…Pete Pfitzinger is included as well. Bonus points for that.

Google: SILVER

Google does its usual excellent job in its results. 6 of the Mahalo Top 7 are here; Running Times is missing from the first 50 results. Surprisingly, Team in Training is not in the top 50 results.

Google gets dinged for no race calculator in the first 50 results. No Pete Pfitzinger. But Jeff Galloway is there! Noooo…

SEARCH TERM #3: TEMPO RUN

A tempo run is a specific training technique in which you hold a fast pace over several miles. It’s a tough workout, but it can advance your performance dramatically. Obviously, we’re now in the technical weeds of running.

Mahalo: DISQUALIFIED

Mahalo has no entry for tempo running. We’ve gone too detailed for Mahalo here. DQ’d.

del.icio.us: SILVER

Use of the term “run” again confuses poor del.icio.us here. 34 of the first 50 results are not related to exercise running. But there are several good sites related to the tempo run. Runner’s World has Learn How To Do A Perfect Tempo Run. Running Times has A Tempo Run by Many Other Names.

And this is one of my favorites…a LetsRun.com post/discussion about Tempo run length vs. speed from 2003. One would have to go pretty deep into the LetsRun site to unearth that one. A true credit to the power of social bookmarks & tagging.

Google: GOLD

Incredibly, all of the first 50 results were related to exercise tempo runs. Very impressive. Lots of good info about the tempo run. A LetsRun post/discussion, but different than the one on del.icio.us. Bloggers describing their tempo runs. Formal programs that advise on the pace of the tempo run. Just really good stuff.

Recap: Broad, Narrow, Technical

Broad search: Google, Mahalo, del.icio.us
Narrow search: del.icio.us, Google, Mahalo
Technical search: Google, del.icio.us, (Mahalo DQ’d)
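
For anyone who wants a single number, the medals above can be tallied with a quick script. The point values (3 for gold, 2 for silver, 1 for bronze, 0 for a DQ) are an assumption made for this sketch. They are not part of the scoring system used in the test.

```python
# Hypothetical point values for each placing.
POINTS = {"gold": 3, "silver": 2, "bronze": 1, "dq": 0}

# Placings as awarded in the three searches above.
medals = {
    "running": {"Google": "gold", "Mahalo": "silver", "del.icio.us": "bronze"},
    "marathon training": {"del.icio.us": "gold", "Google": "silver", "Mahalo": "bronze"},
    "tempo run": {"Google": "gold", "del.icio.us": "silver", "Mahalo": "dq"},
}

def tally(medals):
    """Sum the points each search app earned across all search terms."""
    totals = {}
    for placings in medals.values():
        for engine, medal in placings.items():
            totals[engine] = totals.get(engine, 0) + POINTS[medal]
    return totals

totals = tally(medals)
print(totals)  # {'Google': 8, 'Mahalo': 3, 'del.icio.us': 6}
```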

Conclusions that I draw from this admittedly small, subjective test:

  • Mahalo is a good starting point for finding information on something that’s not familiar to you. It only covers broader, more popular categories. It does appear that the Mahalo expert just skims the top results from Google. But the clean interface and human filtering make it a decent place to start your search.
  • del.icio.us is challenged by results that are not related to the search topic, which is consistent with its user-generated chaotic nature. It’s also a really good place to find hidden nuggets of valuable information not easily found elsewhere. And for a narrow topic with words that do not have multiple meanings, del.icio.us really shines.
  • Google still makes sense as the first place to look. Breadth and depth of results, and it takes on all comers. It also does an exceedingly good job of figuring out what sites relate to a search topic.

One final note in favor of Mahalo. There is research that shows consumers are actually better off with fewer choices than more. Give me 7 good choices, and I’ll be able to begin my journey to learn more about a topic. Give me 50 choices, some great, some terrible, and I’ll be flummoxed as I try to read them all.

Mahalo does have the advantage of providing a simple, limited set of good results to get beginners going. There is value to that.