Why Isn’t This the Tag Standard? Multi Word, Comma Separated
August 22, 2008
Tagging is a great way to put context on user-generated content. The tag cloud to the right shows what the hundreds of thousands of blogs were talking about on the evening of August 21. (Click the image to see what bloggers are talking about right now.)
Pretty much any web 2.0 service that has user-generated content supports tags. Flickr. YouTube. Del.icio.us. Google Reader. Last.fm. Tagging is entrenched in the web 2.0 world, and it’s one of those ideas that spread without any standards.
But the lack of a single standard is a problem…
Beta, VHS.
Blu-ray, HD-DVD.
Space or comma delimited?
What’s happened is that tagging formats are all over the map. Each web 2.0 service came up with what worked best for its product and developers:
- Single-word tags only (Delicious, Atlassian Confluence)
- Tags are space separated, multi-word tags must be put in quotes, like “enterprise 2.0” (Diigo, Flickr)
- Tags are comma-separated, multi-word tags are free form (Google Reader, WordPress.com)
This post at 37signals described the same tag formats above, and it got a lot of comments. Good energy around the subject. Brian Daniel Eisenberg thinks the failure to have a consistent tag method may undermine its adoption by the masses.
To me, there really is one best format.
Multiple Words, Comma Separated
I tweeted this on Twitter/FriendFeed:
Can there be a universal standard for tags? Multi-word tags, comma separated. Odd combos (underscore, dot, combined) are messy, inconsistent.
You can see the comments on the link. The gist of them? Multiple words, comma separated is the best format. Here’s why I think so:
- Forced separation of words changes their meaning (“product management” or “product” and “management”)
- Forced separation of words creates tag clouds that misrepresent subjects (is it “product” content? or “management” content?)
- With single terms, too many ways for users to combine the same term:
  - productmanagement
  - product.management
  - product_management
  - product-management
- Writing multiple words with spaces between them is the way we learn to write
- Putting commas between separate ideas, context, meanings and descriptions is the way we write
Let people (1) use more than one word for a tag, (2) write it naturally without odd connectors like under_scores, and (3) use commas to separate tags. These rules are the best fit for Germanic and Romance languages, and I assume for most other languages as well.
To Brian’s point about the masses, let’s make tagging consistent with writing.
For Developers, It’s Pretty Much a Non-Issue
In The Need for Creating Tag Standards, the blog Neosmart Files writes:
Basically, it’s too late for a tagging standard that will be used unanimously throughout the web.
A lot of developer types weighed in on the comments. For the most part, they’re sanguine about the issue of different formats. Rip out any extraneous characters like spaces, periods, underscores, etc. What’s left is a single string that is the tag.
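That developer-side view can be sketched in a few lines of Python. The helper name `normalize_tag`, the exact set of stripped characters, and the lowercasing step are assumptions made for illustration; none of the services mentioned here is known to work this way.

```python
import re

# Assumed set of "extraneous" characters: whitespace, periods, underscores,
# hyphens, and straight or curly apostrophes.
_EXTRANEOUS = re.compile(r"[\s._\-'’]+")

def normalize_tag(tag: str) -> str:
    """Hypothetical helper: reduce a tag to a single comparison key."""
    # Lowercasing is an extra assumption, so "Hip Hop" and "hiphop" match.
    return _EXTRANEOUS.sub("", tag.lower())

variants = [
    "product management",
    "productmanagement",
    "product.management",
    "product_management",
    "product-management",
]

# Every variant collapses to the same key.
keys = {normalize_tag(v) for v in variants}
print(keys)  # {'productmanagement'}
```

The key would only be used for matching. The tag as the user typed it can still be stored and displayed unchanged.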
It’s About the Users
The issue fundamentally is how boxed in people are if they want to tag. In the Neosmart Files post, commenter Jason wrote this:
As this topic suggests, there are issues in resolving various tags that whilst literally different they are contextually equivalent. I believe this to be the critical juncture. Perhaps the solution lies not in heaping upon more standards, but improving the manner in which tags are processed by consumers.
From my perspective, multiple word, comma separated format is the most wide open, flexible way to handle tags. If a user likes running words together, he can do it. If a user wants to put underscores between words, she can do it. If a user likes spaces between words, not a problem.
But making users cram together words in odd combinations takes them out of their normal writing and thinking style. Tags should be formatted with humans in mind, not computers.
That’s my argument. What say you?
*****
See this post on FriendFeed: http://friendfeed.com/search?q=%22Why+Isn%E2%80%99t+This+the+Tag+Standard%3F+Multi+Word%2C+Comma+Separated%22&public=1
This is an issue that has crossed my own mind before, and I’m a supporter of multiple words, comma separated. It just slightly throws me off track, even if for a moment, when I’m forced to tag otherwise. However, I must agree that it may be a bit too late to establish a single method of tagging. As the technological age advances, new and innovative ideas roll out. Tagging is just one of the many. But just as all good things go, something replaces them. I’m not saying that tagging will be replaced anytime in the near future, but perhaps it will be modified. If we’re truly in Web 2.0, Web 3.0 will be rolling around soon enough, yeah?
I totally agree. I was a little disappointed when Delicious 2.0 didn’t introduce this. Delicious is now the odd one out of all the tagging, bookmarking services I use regularly: Last.fm, Tumblr, WordPress, Mento.
Nice article.
User and usage will probably drive the standard or, if not, the preferred way for tagging.
I’m also in favor of multiple words separated by comma. This is the most natural way to describe something with words or combination of words.
I have loved your site for its useful and funny content and simple design.
Great article, I like the comma method myself and think it’s the best of the available options, but I can see the use for space separated.
If I went to tag something and put in (monkey, banana, monkey zoo), the last “monkey” isn’t really necessary: if someone searches for “monkey zoo”, you can still find that entry by just storing the tags banana, monkey and zoo.
So “normalizing” the tags like that isn’t such a bad idea in some cases, but what if you want a tag page for “monkey zoo” as one word? :X
Space separation causes a problem when you DO want to enter a word with spaces, because like you show there is no natural way of doing it (double quotes makes no sense to me and is not natural) which makes people more likely to put in odd things like grand.theft.auto.3, big_bang_theory or beastie+boys. It happens, because they really have to guess as to how they want to combine their words.
Even with comma separated there are problems though, for example you can get multiple tags like playstation 3, play station 3, playstation3 or hip hop and hiphop.
That is where I usually give up and just allow both, but which would you use in that case? 🙂
I don’t think there ever will be a standard because both space and comma separated have their place. It all depends on how many tags you have, the subject and what you plan on using them for.
Maybe in the end we will have a tag_word and tag_phrase table to store both methods. 😛
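To make the tag_word and tag_phrase idea concrete, here is a rough in-memory sketch in Python. The class `TagIndex` and its method names are hypothetical, and two dictionaries stand in for the two database tables; a real schema would look different.

```python
from collections import defaultdict

class TagIndex:
    """Hypothetical stand-in for a tag_phrase table and a tag_word table."""

    def __init__(self):
        self.by_phrase = defaultdict(set)  # full tag -> item ids
        self.by_word = defaultdict(set)    # single word -> item ids

    def add(self, item_id, tags):
        for tag in tags:
            # Lowercase and collapse runs of whitespace (an assumption).
            phrase = " ".join(tag.lower().split())
            if not phrase:
                continue
            self.by_phrase[phrase].add(item_id)
            for word in phrase.split(" "):
                self.by_word[word].add(item_id)

    def find_phrase(self, phrase):
        """Items tagged with exactly this phrase."""
        key = " ".join(phrase.lower().split())
        return set(self.by_phrase.get(key, set()))

    def find_words(self, query):
        """Items whose tags contain every word in the query, in any tag."""
        words = query.lower().split()
        if not words:
            return set()
        return set.intersection(*(self.by_word.get(w, set()) for w in words))

index = TagIndex()
index.add("post-1", ["monkey", "banana", "monkey zoo"])
index.add("post-2", ["zoo", "banana"])

print(index.find_phrase("monkey zoo"))  # {'post-1'}
print(index.find_words("monkey zoo"))   # {'post-1'}
print(index.find_words("banana"))
```

The phrase lookup keeps “monkey zoo” available as its own tag page, while the word lookup covers the space-separated style of search.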
Fine article! I suffer too under this tag chaos. We need a standard for multi word tags all over the web. And we need it FAST!
Ryan – great breakdown of tagging use cases. I do see the argument for space separated, then running a search on any combo of tags. But even that use case can be accomplished with comma separated too. Tag normalization is the bigger issue. Strip out the spaces, underscores, dashes, periods, apostrophes, etc. Then comma separated makes even more sense.
I have been wondering about this since tagging came about. Every site has their own standard. The comma does make sense. It’s natural and an extension of writing. And it’s amazing I don’t run into people talking about this issue more often.
I believe a lot of the push-back against the use of spaces in tags comes from the technical difficulties involved in using them that aren’t apparent to the end user.
Firstly, you can’t use spaces in URLs: they’re considered an end delimiter, so they have to be escaped to %20. Which means you get poor-looking tags that are hard to type or read like http://example.com/tags/three%20word%20tags or ambiguous tags like http://example.com/tags/three-word-tags, which seems fine, but what if my tag actually was “three-word-tags” without spaces?
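Both URL styles from the paragraph above can be reproduced with Python's standard library. This is only an illustration: the function names and the `/tags/` path are assumptions, and `quote` and `unquote` come from `urllib.parse`.

```python
import re
from urllib.parse import quote, unquote

def tag_path_escaped(tag: str) -> str:
    """Percent-encode the tag: reversible, but not pretty."""
    return "/tags/" + quote(tag, safe="")

def tag_path_slug(tag: str) -> str:
    """Swap whitespace for hyphens: readable, but lossy."""
    return "/tags/" + re.sub(r"\s+", "-", tag.strip())

print(tag_path_escaped("three word tags"))  # /tags/three%20word%20tags
print(tag_path_slug("three word tags"))     # /tags/three-word-tags
print(tag_path_slug("three-word-tags"))     # /tags/three-word-tags

# The escaped form can be decoded back to the original tag.
print(unquote("three%20word%20tags"))       # three word tags
```

The two slug calls return the same path for two different tags, which is the ambiguity described above. The escaped form avoids it at the cost of readability.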
Beyond the URL, you have issues that are similar to why white space is largely ignored in coding syntax: is “three word tag” (one space between words) the same as “two word tag” (two spaces between words)? What about a tag list like “foo, bar”? Should the second tag be “bar” or should it be “ bar”? If it’s the former, what if I wanted to start my tag with a space? Why is that not acceptable, but a space inside it is?
It quickly gets into “how do we do this right without being ambiguous or counterintuitive?” and inevitably goes towards making it dumb simple and easy to explain, “tags can only be one word long.”
Mark – good observation re: the URL for spaces between tags. I’m no coder, but shouldn’t there be a separation between back-end text management and front-end user experience? If the common way to read a tag is multi-word, space separated, that should be the start. Back-end normalization strikes me as a good way to handle dashes vs. spaces vs. periods vs. etc. In your “foo, bar” example, that space between the comma separator and “bar” would be eliminated. We’re handling these well at Connectbeam, but am I missing something?
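The kind of back-end cleanup described here might look like the following Python sketch. The function name `parse_tags` and its specific rules (trim each tag, collapse internal runs of spaces, drop empty and repeated entries) are assumptions for illustration, not a description of how Connectbeam or any other product handles input.

```python
def parse_tags(raw: str) -> list:
    """Hypothetical parser for a comma-separated tag field."""
    tags = []
    seen = set()
    for piece in raw.split(","):
        # split() with no arguments drops leading and trailing whitespace
        # and treats any run of whitespace as a single separator.
        tag = " ".join(piece.split())
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags

print(parse_tags("foo, bar"))                          # ['foo', 'bar']
print(parse_tags("foo,bar ,  three   word tag,,foo"))  # ['foo', 'bar', 'three word tag']
```

Under these rules the user can type “foo,bar”, “foo, bar” or “foo ,  bar” and get the same two tags. The trade-off raised in the thread still applies: a tag that deliberately starts with a space cannot be expressed.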
Hutch, I think it’s a trade-off: with any way you interpret a loosely interpreted list, it’s not going to conform to some people’s intuitions. If your instructions say “use a comma separated list,” it’s correct to assume that the comma is the only thing that separates the terms, not “a comma and a space.” If you say “a comma and a space,” what about no spaces? What about two spaces? Then you start getting into long disclaimers and instructions that nobody’s going to write and nobody’s going to read.
With no feedback, in most cases, on whether or not you did it right until you submit the tags, simple instructions that are hard to misinterpret wind up being the most usable, even if they are limiting. This could change with instantaneous feedback: Apple does it well, adding a bubble around a tag when it hits its delimiter, so you know right away that the tag has ended. But you still have to deal with the other potential usability issues related to the URL.
I am working on my thesis for graduate school this semester and this is exactly what I plan to investigate. It is an incredibly micro-level UI/UX issue, but an incredibly important issue, IMO. I plan to perform some pretty extensive usability testing on UIs that employ various means of delimitation and then aggregate my findings into an online pattern library/recommendations for best practices for tagging. We shall see what comes of it.
Cheers-John