I'm Not Actually a Geek

Observations on technology and business from someone who should know better


Semantic Web = Tagging on Steroids

February 17, 2008 by Hutch Carpenter

I read a nice list of “11 Things to Know About Semantic Web”, over at ReadWriteWeb. “Semantic Web” is an intimidating term. “Semantic”…hmmm, that’s something to do with words, right? Don’t people say, “he’s just using semantics” in a pejorative way?

I’ve researched it a bit, and here’s my initial attempt to put it in layman’s terms. The semantic web takes a document of unstructured data (say, this blog post), and renders it into a set of tags that are readable by both humans and software programs. Not just any tags, but really powerful tags beyond what you or I would use.

Now, at this point, that sounds sort of redundant. Don’t a lot of pages already have user-created tags? Can’t search engines find the words that are in the web page? If you know a little HTML, aren’t there metadata tags?

Well, it turns out those various methodologies are great for humans, but do little for machines. That kind of makes sense, right? We all get that machines’ ability to interpret our written words is limited. Google doesn’t really understand what your search term means or what the pages it serves up are about. It just looks for instances of your search terms and then applies all that special magic it does (number of links to a page, number of times the page was clicked previously, etc.).
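To make the “it just looks for instances of your search terms” point concrete, here’s a toy sketch in Python. To be clear, this is not how Google actually works; the two pages, their text, and the keyword_search function are all made up for illustration. It only shows what literal word matching can and can’t do.

```python
# Two hypothetical pages that a human would say are about the same thing.
pages = {
    "page_a": "cheap flights to new york city",
    "page_b": "low cost airfare to the big apple",
}

def keyword_search(query, pages):
    """Return the ids of pages that contain every word of the query, literally."""
    words = query.lower().split()
    return [
        page_id
        for page_id, text in pages.items()
        if all(word in text.lower().split() for word in words)
    ]

# Only page_a contains the literal words; page_b is missed even though
# a person would see that it is on the same topic.
hits = keyword_search("flights new york", pages)
print(hits)
```

The matcher has no idea that “airfare” and “flights”, or “the big apple” and “new york”, mean the same thing. That gap is what the extra ranking magic, and eventually semantic tags, are meant to help with.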

The key to making these unstructured web pages readable by a machine is something called…RDF. RDF stands for Resource Description Framework. Here’s RDF from Wikipedia: “The RDF metadata model is based upon the idea of making statements about resources in the form of subject-predicate-object expressions, called triples in RDF terminology.”

Note that “triples” term (I’ve also seen it as “triplets”). The idea of casting unstructured web data into subject-predicate-object statements is apparently quite powerful. Again, from Wikipedia: “In the English language statement ‘New York has the postal abbreviation NY’, ‘New York’ would be the subject, ‘has the postal abbreviation’ the predicate and ‘NY’ the object.”
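Here’s a minimal sketch of what subject-predicate-object statements could look like as data, and why software might like them. This is plain Python, not real RDF syntax or any RDF library. The Triple class and the match function are names I made up, and the second and third statements are invented examples added alongside the Wikipedia one.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One statement about a resource: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str

# The first statement is the Wikipedia example; the other two are
# hypothetical additions so there is something to query.
triples = [
    Triple("New York", "has the postal abbreviation", "NY"),
    Triple("New York", "is a", "US state"),
    Triple("California", "has the postal abbreviation", "CA"),
]

def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that was supplied."""
    return [
        t
        for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]

# "Which things have a postal abbreviation, and what is it?"
for t in match(triples, predicate="has the postal abbreviation"):
    print(t.subject, "->", t.obj)
```

Because every statement has the same three-part shape, a program can answer a question by filling in the parts it knows and leaving the rest blank, with no need to interpret the sentence itself.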

At this point, people much more versed in these technologies can explain how computers will use these triples to better serve up content for a given search. For instance, Reuters has come out with its Open Calais initiative. They aim to “make all the world’s content more accessible, interoperable and valuable.” I will do some more research and write a follow-up post on this subject.

But I do want to note a few of the points provided in Bernard Lunn’s post over at ReadWriteWeb:

  • Semantic Web will start the long, slow decline of relational database technology. Web 3.0 enables the transition from “structure upfront” to “structure on the fly”. The world is clearly too complex to structure upfront, despite the tremendous skills brought by data modelers. Structure on the fly is done by people adding structure as they use the service and by engines that automatically create structure from unstructured content.
  • Don’t look for a killer app. That implies a client/consumer win. This is much more likely to be a server/platform/enterprise win.
  • Semantic Web could slow the Google steamroller. This could be like the PC for IBM or the Web for Microsoft. The steamroller’s momentum carries it forward for a very long time and it can build all kinds of wrapper systems around it, but something new always does come along. Google mastered how to give some structure to countless unstructured HTML pages. Semantic Web will gradually make that less critical as the underlying content will be more structured.
  • Tagging is the quietly disruptive technology. Everybody tags. It is the most basic human urge to mark what we find.
  • Semantic Web will leverage the “community” to add structure and this will use some techniques from first generation Social Networking. But it is very unlikely that Semantic Web will emerge from the walled gardens of current social networking sites.

Final note. I ran this post through a free website that employs Reuters’ Open Calais protocols, “Calais Text Tagger”. It returns a lot of text chock full of semantic tags. I won’t repeat that here. But I did like this little output:

IndustryTerm: unstructured web, search terms, relational database technology, software programs, wrapper systems, given search, unstructured web data, search term, semantic web
Company: IBM, Reuters, Microsoft, Google
Person: Bernard Lunn

Gotta say, that was pretty slick. And it’s more tags than I’m applying to this post.
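For the curious, here’s a rough sketch of how that little output could be turned into tags a program can work with. It only parses the three lines of text shown above. It does not call Open Calais or reproduce its actual output format, and parse_tags, raw_output and the “mentions” wording are names I made up for illustration.

```python
# The three lines of output quoted above, pasted in as plain text.
raw_output = """\
IndustryTerm: unstructured web, search terms, relational database technology, software programs, wrapper systems, given search, unstructured web data, search term, semantic web
Company: IBM, Reuters, Microsoft, Google
Person: Bernard Lunn
"""

def parse_tags(text):
    """Turn 'Type: a, b, c' lines into a dict of type -> list of names."""
    tags = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        entity_type, _, values = line.partition(":")
        tags[entity_type.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return tags

tags = parse_tags(raw_output)

# Recast each tag as a subject-predicate-object statement about this post.
post_triples = [
    ("this post", "mentions " + kind, name)
    for kind, names in tags.items()
    for name in names
]

print(tags["Company"])
print(len(post_triples), "statements about this post")
```

The difference from an ordinary tag is the type attached to each one: a program can tell that “Google” is a company and “Bernard Lunn” is a person, rather than seeing two anonymous strings.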


Filed under geek Tagged with metadata, open calais, RDF, semantic web, tagging
