Category Archives: Tagging

On flickr and development agencies

There’s an interesting blog post from Timo at the Red Cross about the use of Flickr to showcase the international development and humanitarian work done by that agency (thanks Nadejda on KM4dev for the tip). The Red Cross Flickr stream is really a terrific site and well worth a visit. AusAID, where I work, has a Flickr site too.

Timo’s blog post cites eight lessons learned from the experience of using Flickr:

1. know your audience
2. newsworthiness beats quality
3. less is more
4. understand what you want to achieve
5. use Flickr groups
6. appreciate the work of others
7. need to give solid attention to Flickr to maintain traffic
8. be careful with Creative Commons licensing

What is missing, and Timo alludes to this in his blog post, is that Flickr needs better integration with other applications. Timo suggests that Flickr needs to better integrate with Facebook, for example. In addition, I think we also need to work out how better to use Flickr to tell the stories behind the photos. I still feel that the images, words and tags are not enough to really give me a strong sense of place and story. There is greater potential for education and learning beyond just the images themselves, although I know how powerful images can be in their own right.

It would be great to be able to link the photos to a short podcast, perhaps a narrative fragment from one of the image subjects, to really give stronger context to the individual images.  Not sure if this is possible, but I am certain narrative would add to the user-experience.

On pegging down taxonomy

Tonight I watched The Collectors on ABC TV – good to have the show back in 2008. One of the featured collections was the peg collection of Mike Bradley. Yes, that’s right, a collection of clothes pegs!

There were some really interesting moments (yes, truly) in this segment on pegs. Firstly, Mike Bradley was terrific at telling his story about the whys and wherefores of his collection. He postulated that his penchant for pegs may have stemmed from his gypsy heritage (gypsies introduced pegs to the world, apparently). The stories were personal, interesting, and humorous.

He told a great story about a trip to India and the purchase of some unusual pegs – the guide/translator telling the storekeeper about “this idiot who loves pegs” and then arranging a higher price than normal and splitting the difference! Mike also related a shopping trip to “plastic city” in China where he bought a load of plastic pegs with different designs. Returning to Australia, the customs officer wanted to know about the metal clips showing up in the X-ray image. Mike replied that they were pegs he was bringing back from overseas, to which the customs officer replied: “don’t we make pegs in Australia?”.

Mike reckons he has the largest collection of pegs in the world, albeit only about 50% of the total number of different pegs out there. And he admitted to not being shy in swapping an ordinary peg for one not in his collection if he comes across such a specimen on someone’s backyard clothesline! You have been warned.

Mike turned a potentially lacklustre story into a great feature on collecting; turning the mundane into something special with his natural storytelling abilities. The storytelling worked. [Note to ABC TV - a podcast or videocast of such segments would be really worthwhile].

Then there was the in-studio discussion between Mike and Collector panelists Nicole and “The Professor” (and this strikes at the heart of taxonomy, can you believe it?). Nicole admitted to hanging clothes on the line with a pair of pegs that had to be the same colour. She wanted to reassemble and order Mike’s peg collection by colour. Mike actually ordered his collection by size and type of clip. The Professor had no such preference for peg order. And now let me confess, that when I hang out the washing the pair of pegs for each article of clothing must be of the same type – no mix and match here!

Now if there was a manual for the “correct” way of pairing pegs or assembling pegs in a collection, what would be the one-size-fits-all determining taxonomy? Would it be chronological (historical or purchase date?), or colour, or size, or type, or shape, or country of origin,  or type of use, or complete randomness (perhaps the order being determined by the position of different clothes on the clothesline itself)? You see, it all depends on what means the most to the individual, if in fact it means anything at all.

The moral of the story: don’t put a square peg in a round hole … or can you?

On tagging and the enterprise (and RSS)

I want to conclude my blog summary from the presentation I gave last week on tagging and the enterprise. The previous three entries should be read in conjunction with this instalment, if you haven’t followed the story so far…

I used IBM’s dogear as an example of an enterprise using tagging within the firm. However, instead of me explaining all about it, I have listed here three sources that explain the way in which social bookmarking and tagging may be used within the enterprise, including dogear at IBM:

Am I being lazy? Well, the web is all about links so I may as well use them!

Finally, as an aside, I discovered today a way of using RSS feeds to populate a newsletter. Yes, it is an interesting combination of web 2.0 (RSS) and the old way of communication (newsletters) but it may well work as a valuable bridge for people still not accustomed to the full array of web 2.0 communication channels. The product is Nouri.sh and it’s relatively new. It is definitely worth a look if you want to mesh RSS content within a newsletter format.

And if anyone knows about other services like this, please advise with a comment!

On tagging, the grey side

My last two posts have been about tagging based on my presentation last week at the conference in Sydney, “Enhancing search and retrieval capabilities and performance”.

I want to look at some of the perceived disadvantages of tagging that I briefly mentioned in my presentation:

  1. Lack of specificity – refers to the fact that an item can have innumerable headings (tags) and there is no fixed agreement as to the most suitable term. A formal taxonomy and classification system have been the traditional ways of assigning specific terms to items.
  2. Ambiguity and inconsistency – because anyone can apply a tag to an item, there will be a multitude of tags that do not clearly and consistently apply to a specific item. One person may tag something as “locomotive” and another as “train”. The same person may use “locomotive” now but have used “train” three weeks earlier. And train may in fact not refer to a locomotive at all (with or without carriages or wagons) but to a wedding dress, a series of thoughts, or to an adult education class.
  3. Lack of structure – The traditional relationship between broad and specific terms (the parent-child tree structure that historically organised information into “like things”) is not there in a tagging system. Weinberger refers to a tagging system as one that looks at the leaves on a tree rather than just the branches.
  4. Problems with stemming or truncation – for example plurals, or words that may be spelt with either an s or a z.
  5. Ceding control of search terminology to the “inexperienced” – using the correct terms is an important exercise not to be trifled with by amateurs and the inexperienced professional.
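
To make point 4 a little more concrete, here is a minimal Python sketch of the kind of naive normalisation a tagging system might attempt. The function name and the rules are my own assumptions for illustration, not taken from any real tagging service, and the rules are deliberately crude: they show why stemming is a genuine problem rather than solving it.

```python
def normalise_tag(tag: str) -> str:
    """Reduce a raw tag to a rough canonical form (hypothetical, naive rules)."""
    t = tag.strip().lower()
    # Crude plural stripping: "trains" -> "train", but leave "class" alone.
    if t.endswith("s") and not t.endswith("ss") and len(t) > 3:
        t = t[:-1]
    # Treat s and z spellings alike: "organise" -> "organize".
    # This over-applies (it would also turn "wise" into "wize"),
    # which is exactly the sort of error that makes stemming hard.
    for s_form, z_form in (("isation", "ization"), ("ise", "ize")):
        if t.endswith(s_form):
            t = t[: -len(s_form)] + z_form
            break
    return t


if __name__ == "__main__":
    for raw in ["Trains", "train", "organises", "organize", "class"]:
        print(raw, "->", normalise_tag(raw))
```

Even this small example needs special cases, and it still gets some words wrong, so a real system would want a proper stemmer or a human-maintained list of equivalent terms.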

It is true that there will be imprecision in tag terms and inconsistency in the application of tags to items that look to be the same things. It is also true that the same individual may use different tags over time to describe essentially the same thing. And tagging might thus be perceived as a mess, needing an experienced taxonomist and library professional to make sense of it for us. People in the information business who like order and structure have a long historical paradigm to work from.

Yet all is not lost. Tagging will become self-refining, gradually highlighting preferred terms (perhaps through a tag cloud) or via suggested or similar headings. Collaborative tagging and folksonomies will help shape a form of group consensus leading to a meaningful sense of order. And technologies will improve to cater for some of the weaknesses of current tagging systems. One example is Raw Sugar.
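
One way such “suggested or similar headings” could work is by counting which tags tend to appear together on the same items. The sketch below is only an assumption about how a system might do this; the function names and sample data are hypothetical and do not describe Raw Sugar or any other product.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(tagged_items):
    """Count how often each pair of tags appears on the same item."""
    pairs = Counter()
    for tags in tagged_items:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs


def suggest(tag, pairs, limit=3):
    """Return the tags that most often accompany the given tag."""
    related = Counter()
    for (a, b), n in pairs.items():
        if a == tag:
            related[b] += n
        elif b == tag:
            related[a] += n
    return [t for t, _ in related.most_common(limit)]


if __name__ == "__main__":
    items = [
        ["steam", "locomotive", "3801"],
        ["steam", "locomotive"],
        ["steam", "railway"],
    ]
    print(suggest("steam", co_occurrence(items), limit=1))
```

The more items people tag, the more reliable these suggestions become, which is the self-refining effect described above.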

Overall, tagging will continue to grow simply because digital information will grow at time-warp-like speed. The sheer scale of the digital world, and the cost of ordering that digital information, will not easily permit formal and timely classification. Just imagine trying to keep up with all the blogs in the world, let alone the individual blog posts from each of them. 

Tagging will become more important and self-fulfilling due to both the technology and the demographic changes in society, responsive to the digital world and the need to make sense in it for individuals and their peers. The changing nature of information, and the new consumers and producers of that information, means that change is inevitable.

Interestingly, a recent article highlighted the changing nature of reading – the development of an information browsing culture among the digital natives. The impact of the digital world should not be underestimated.

In looking at tagging so far, perhaps one could say we are in a period of transition from the structure and hierarchy of giving order to physical information (like books, journal articles and celluloid film) to one where digital information allows for innumerable access points, innumerable tags and descriptors, and is seemingly available from anywhere.

[Of interest, check out this podcast from Beth Jefferson on transforming public libraries' online catalogues into environments for social discovery of resources that are catalogued not only by librarians, but also by patrons. A salient quote on social cataloguing - collaborative tagging if you like: "the metadata people create by cataloguing content is what enables social search and discovery". Beth Jefferson wants to enhance social search and discovery across North American public libraries through collaborative cataloguing, whether by evaluative comment or by description. Tagging and thesauri may indeed coexist.]

So the question remains – is the traditional way of ordering information and establishing a single authority for fixed terms appropriate in the modern digital world? And practically speaking, what is the right balance between order and miscellany in any given context?

I will feature one more blog post on the tagging issue looking at how the enterprise (the firm, not the fictional space ship), might take to the tagging phenomenon. Stay tuned…

On the positive side of tagging

In the light of what I discussed yesterday with respect to my conference presentation on Tuesday, I want to move on to tagging. Tagging is essentially unstructured metadata that is assigned by the content creator and the readers/users of the content, the latter called collaborative tagging. The user-generated classification that emerges is called a folksonomy.

Examples of digital content using tags include del.icio.us, Flickr, LibraryThing, Technorati, and YouTube. Even the web-based news services are using tags, like the ABC in Australia.

In addition to the tags themselves and the act of tagging content, a collection of tags into a group showing relative emphasis or popularity is called a tag cloud.
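
As a rough illustration of how a tag cloud shows relative emphasis, the Python sketch below counts how often each tag is used and maps the counts onto font sizes. The function name, the size range and the sample tags are assumptions of mine; real services no doubt use their own scaling.

```python
from collections import Counter


def tag_cloud(tags, min_size=10, max_size=32):
    """Map each tag to a font size in proportion to how often it is used."""
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        tag: round(min_size + (n - lo) / span * (max_size - min_size))
        for tag, n in counts.items()
    }


if __name__ == "__main__":
    sample = ["train"] * 5 + ["locomotive"] * 3 + ["steam"]
    print(tag_cloud(sample))
```

The most used tag gets the largest size and the least used the smallest, which is all a tag cloud really is: a frequency table drawn as text.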

There are a number of benefits from using tagging and they can be broadly summarised as the following:

  1. terms meaningful to the content creator and/or readers (and not just those terms allowed by a single classification authority)
  2. establishes relationships between content and the people connected to the content (both content creators and readers)
  3. is inexpensive to undertake, especially in relation to traditional cataloguing and thesaurus construction
  4. scales exceptionally well, thereby suiting the miscellany of digital space
  5. aggregates especially well, thereby harnessing the so-called wisdom of crowds
  6. permits multiple access points to information instead of just bibliographic data
  7. permits discovery of a range of other items tagged by other content creators and readers
  8. overcomes the lack of currency when using traditional fixed forms of metadata (like the established classification systems)
  9. is highly participatory in that people freely choose the relevant tags they regard as appropriate to their own content and to the content of others
  10. as more applications make tagging available, and as the new digital generations increasingly enter the workforce, tagging will become the established norm in the digital information environment (we can see how blogs may offer such an opportunity)

Point 10 is especially important. There is already some evidence of tagging popularity from a Pew Internet Report showing that nearly one-third of US internet users tagged content. As tagging becomes more familiar and mainstream, new opportunities will open up to enhance the popularity of tagging – what I have called “the tagging locomotive”.

I’ll stop here (but with another post to come) with some recommended readings:

Everything is miscellaneous by David Weinberger

Ontology is overrated by Clay Shirky

Folksonomies: power to the people by Emanuele Quintarelli

On search and tagging

Yesterday I gave a presentation at the Ark Group conference, “Enhancing search and retrieval capabilities and performance”, in Sydney. The presentation, called “Tagging and the enterprise”,  is available to conference attendees and I am rejigging some of the slides to load up onto Slideshare.

There were two key points I tried to emphasise yesterday in a conference context that discussed taxonomies and search in great detail.

The first was this: that having been brought up in a world of library-based classification schemes at school and university (Dewey decimal classification scheme), of thesauri and controlled vocabularies, of ordering and searching  for information using the structure of tree hierarchies, I was a typical information-order kind of person in my profession. Working with these established and authoritative structures was in fact the norm.

Yet I wasn’t completely satisfied that this structure always helped – sometimes I couldn’t find the information I wanted and nor could the people I was supposed to help. In fact, sometimes other people (not the authority) had better ways of describing and classifying an item, to which a cataloguer associate of mine would scream, “if it’s not in the book (subject headings), it’s not the correct term and I won’t be using it!”.

In fact when one thinks about it, the photo below of steam locomotive 3801 charging through a train station means different things to different people: a Japanese tourist might (incorrectly) classify it under bullet train, while a railway historian might prefer the term ARHS enthusiast special, or the stock photographer might use 3/4 view as a suitable classification term. The point is that we could search using terms like these that make sense to us individually but get nowhere because Dewey or Library of Congress says so.

 3801 on ARHS Newcastle Flyer Special 2007

Moreover, the history of my everyday experience has been one where I ascribe my own, personal, and context-driven classification schemes. They have been informal and functional for my needs. I make up my own mind how I sort the dishes, how I arrange my digital photos (well, trying to make up my mind), and how the groceries in the pantry are arranged (even making sense of putting the jar of Vegemite on a small shelf in the kitchen with the cough medicines and cat worming tablets, instead of with the jams and condiments, to ensure that I find it). We try to make order out of complexity that gives us – the individual - meaning. Yet the world is a complex place.

The second point was that the demographic and technological changes in recent times have ensured a generation (or two) of tech-savvy people whose norms are those of identity, connection, collaboration, and peer relationships managed and articulated through digital space. This has a major impact on how information and knowledge are used and sourced. This demographic trajectory implies an acceleration into the workforce of people whose norms are couched in the digital space.

The digital space changes our traditional way of looking at information since information is essentially everywhere and not bounded by the physicality of the book or the library. The incoming, tech-savvy generation of digitally connected workers will continue to be part of that change and will (more than) likely increase their participation in it since that is essentially becoming the norm.

Tagging will be one manifestation of this.

I will elaborate on the tagging issue tomorrow.

On narrative, sensemaking, and volunteering

I did promise on Saturday that my next blog post would be on narrative, sensemaking, and the volunteering project. However, Doris Lessing did come between posts with an earlier blog post this afternoon.

Looking at my notes from the debrief from the volunteering project on Friday, I took this point from Dave Snowden’s introductory remarks on complexity and sensemaking, and the wisdom of crowds: distributed cognition is all about the wider network of individuals from which the capability of finding out is greater than the individual on their own. One reason for business to seriously consider distributed cognition with people networks is the need to do more with fewer resources. By using networks, knowledge can be leveraged more efficiently.

Another key point related to “weak signals” – if we don’t expect something, we don’t see it. Dave showed the basketball video and I won’t steal his thunder with a link, although I have now seen Dave present this video three or four times. One of the problems in looking at something is that we often overlook vital bits of information that, at the time, don’t seem relevant. The human brain actually filters out much of what we perceive in order to avoid overloading our brains with too much sensory perception.

And, it wouldn’t be a Dave Snowden talk without the mention of pattern sequencing. Humans rely on patterns developed at a young age from which the brain forms preferences related to stored experiences and information. These patterns are modified over time so that, for example, consistent patterns emerge to explain behaviours in standardised contexts. Patterns are fractal in nature from which we filter our perceptions (stereotypes are one manifestation of this). A consequence of this is entrained thinking, where our experiences and perspectives and personal learnings tend to override anything that is contrary or different. The human brain and perceptions of the environment exist as a complex landscape from which decision-making takes place. And this in turn gives us meaning.

Something that will help us is a system for the natural process of inquiry – human-pattern processes, not information processing. Information processing is too structured and more specific to time of analysis. Alternatively, narrative techniques are more useful and more relevant since they convey meaning (from the viewpoint of the person telling the story) and they relate to context. The telling of stories and the identification of meaning ascribed to them by the tellers of those stories are powerful sources of meaning, especially when aggregated. In addition, the telling of a story is a more human communication method developed over millennia whilst information processing is a 20th century phenomenon.

For more information, I recommend the Kurtz and Snowden paper on the new dynamics of strategy: sense-making in a complex and complicated world.

And the volunteering project? Well, using the sensemaking software developed by Cognitive Edge, stories are being elicited from people who volunteer or manage volunteers in the community services sector. The project hopes to capture one to two thousand stories between now and February 2008 (almost one thousand have already been collected). The aim is to discover what people believe to be the benefits and barriers of volunteering in community services so that effective policies and strategies can be put into place to help and encourage more people to take up volunteering.

The debrief on Friday was a work-in-progress and focused on the largest data set, the 17-59 year old demographic. The software powerfully showed a range of relationships but more stories are required to enhance scale and emphasise trends across all ages. One of the interesting outcomes from the survey so far has been the identification of post-graduate educated, full-time working people as the biggest group in the community services volunteering pool.

So far, contributions have come via the survey on the internet but a phone-in is scheduled for February 2008. More details later.

In conclusion, for me, the session reinforced the importance of meaning and context in understanding the complex space in which humans work and interact.

Download future_of_volunteering_overview.doc

On information research

The latest issue of the e-journal, Information Research, is now available.

There are some really interesting papers, especially the paper by Marcia Bates on browsing behaviour and the paper by Judit Bar-Ilan on librarian blogs.

There are several book reviews too, including this one on David Weinberger’s book, Everything is miscellaneous (a book I am currently reading).

All in all, a range of articles and reviews well worth a look!

On tagging (3)

Chance encounters often reveal positive results. I came across this November 2006 blog post by Joshua Porter on why scale matters in tagging systems.

A point I want to tag onto (pun intended) is the one about the rights of the individual to tag anything with any tag the individual likes. Joshua illustrates with his comment about the New York Yankees. Some people will say (and tag) that the New York Yankees are the best baseball team in the US and some will disagree, and some (like me) couldn’t care less - test cricket is far superior!

Joshua says: “Even if a few people tag things incorrectly, most people won’t. This doesn’t have to do with the fact that most people are Good, it’s just that if we ask enough people the same question or have them observe the same phenomenon, where their experiences overlap will tend to be the reality of the situation” – the wisdom of crowds phenomenon.

Actually, it is the same argument promoted by the free-market economist Adam Smith with his concept of the invisible hand – each individual acts to maximise self-interest but in aggregate, society benefits. But does this really happen in practice – if the majority of people in rich countries want to continue to pollute the planet, is this a good thing for society or not?

But what has this to do with communication and knowledge management? Well, besides the tagging phenomenon itself, the concern in this aggregation and crowd argument is that opinions and thoughts that lie outside the “consensus” view are too easily ignored.

We also need to listen to and hear what people outside the crowd are saying because all too often, there is something special and innovative there that the pack of individuals in the crowd missed or hadn’t thought of, not to mention the danger of Groupthink!

And knowledge management needs to deal with both the consensus view and the outliers. How we can do this effectively all the time is indeed a challenge.

On tagging (2)

I previously made some comments about tagging. I believe tagging has its place, as do controlled vocabularies. Jon Udell’s blog post yesterday on tagging and foldering made the point that: “On the desktop as well as on the web, we’re in the midst of a long transition from container-based to query-based storage and retrieval”.

In container-based storage you look for what you want by going to the container and looking to see what is in it. In a search-based world, the container is irrelevant so long as its contents can be searched, and search becomes even more powerful when it can run across multiple containers. And even the notion of containers is becoming obsolete as digital content becomes miscellaneous.
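
The difference between the two approaches can be sketched in a few lines of Python. Everything here is hypothetical: the folder names, file names and tags are made up, and the functions are my own illustration rather than how any desktop or web search tool actually works.

```python
from collections import defaultdict

# Hypothetical containers (folders), each holding hypothetical tagged photos.
folders = {
    "holidays/2007": {"3801.jpg": {"steam", "locomotive", "newcastle"}},
    "railways": {"flyer.jpg": {"steam", "train"}, "station.jpg": {"platform"}},
}


def find_in_container(folder, name):
    """Container-based: you must already know which folder to open."""
    return name in folders.get(folder, {})


def build_index(all_folders):
    """Query-based: index every item by its tags, whatever folder it is in."""
    index = defaultdict(set)
    for folder, items in all_folders.items():
        for name, tags in items.items():
            for tag in tags:
                index[tag].add(f"{folder}/{name}")
    return index


def query(index, *tags):
    """Return the items that carry every one of the given tags."""
    results = [index.get(tag, set()) for tag in tags]
    return set.intersection(*results) if results else set()


if __name__ == "__main__":
    index = build_index(folders)
    print(find_in_container("railways", "3801.jpg"))  # wrong folder, so not found
    print(sorted(query(index, "steam")))  # found across both folders
```

The container lookup fails as soon as you guess the wrong folder, while the query finds matching items wherever they happen to be stored.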

Interestingly, I was left thinking about the notion of access points that we looked at years ago in my librarianship training. Traditionally, access points were different ways of accessing a library catalogue but now access points relate also to the digital domain. The fact is that now we can have an enormous number of access points and these access points can now be determined by users with user-generated content and tagging.

The Udell blog post reignited some thoughts on my own plans for my home digitisation project to convert several thousand hard copy prints and slides into digital images. The workflow includes using cataloguing software for categorising and searching my photo collection (where are the digital images located on my computer and external drives and what terms will I use to be able to search and find the ones I want?).

The issue for me is that I need a controlled vocabulary to ensure consistent and accurate description and searchability of my photo collection. In addition, I will be undertaking this categorisation myself so there is little benefit gained from tagging since I am not saving time by having others do the categorisation for me. And certainly, there is no user-generated aggregation as there could be if I used Flickr as my host and archive.

And this is the point: tagging works best in aggregate for two reasons: Firstly, aggregation enables some semblance of preference that gives a general consensus from which patterns emerge (folksonomies) – a kind of user-generated thesaurus. Secondly, tagging works because aggregation also takes place at the actual labeling end of the workflow – individuals tagging upon production and subsequently by use, a scale issue that traditional thesaurus-based cataloguing cannot compete with. In other words, there is so much digital content out there that changes all the time that a consistent, centrally-determined traditional classification scheme and workflow is impossible.

But at home, I can generate my own controlled vocabulary to ensure accuracy and consistency across my photo collection, make reference to it for future additions, and find what I want in a reliable manner. If I was tagging, in the end I would probably have a de facto controlled vocabulary, but something less than consistent and no more meaningful.
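
For a home project, a controlled vocabulary need be nothing more than a list of preferred terms with the variants they replace. The Python sketch below shows the idea; the terms, the names and the decision to reject unknown terms are all assumptions for illustration, not a description of any cataloguing software.

```python
# Hypothetical vocabulary: preferred term -> non-preferred variants ("use for").
VOCABULARY = {
    "steam locomotive": {"steam engine", "steam train"},
    "railway station": {"train station", "station"},
}

# Flatten the vocabulary so every variant points at its preferred term.
LOOKUP = {
    variant: preferred
    for preferred, variants in VOCABULARY.items()
    for variant in variants
}
LOOKUP.update({preferred: preferred for preferred in VOCABULARY})


def preferred_term(term):
    """Return the preferred term, or raise KeyError if the term is not listed."""
    key = " ".join(term.lower().split())  # ignore case and stray spaces
    if key not in LOOKUP:
        raise KeyError(f"'{term}' is not in the controlled vocabulary")
    return LOOKUP[key]


if __name__ == "__main__":
    print(preferred_term("Train Station"))
    print(preferred_term("steam locomotive"))
```

Rejecting unlisted terms is what keeps the descriptions consistent: a new term has to be added to the vocabulary deliberately before it can be used.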

The future may yet bring, however, the opportunity for improved tagging that generates greater consistency and reliability while still maximising scale. Even so, for my home project that’s not needed.

On tagging

I have been giving some attention of late to tagging, partly because of some research I am doing for university, and partly in response to a challenge Matt Moore gave me a while back to start putting some of my photos up on Flickr.

A key feature of Flickr is tagging, but tagging has become much more widespread. US research indicates that tagging is a popular user generated activity with 28% of internet users having tagged online content.

Thomas Vander Wal has written a great post on tagging. In it, he describes the history and current state of tagging and what improvements he’d like to see (stemming to see different versions of the same word, for example).

What I find interesting, coming from a background in librarianship and functional thesauri, is that there now seems to be more interest in organising tags so they become more meaningful and less ambiguous. Ambiguity is a real issue for modern libraries, particularly structuring folksonomy tags in public libraries.

Tagging works well with scale because scale gives weight to more popular tags than others. Popularity of tag terms becomes the de facto preferred term that a thesaurus might recommend under a controlled vocabulary environment. However, popular tags may have even greater weight and value if the same tags are agglomerated with like tags (tags that are either similar or the same, using a different word or spelling for example).
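
A small Python sketch may help show what agglomerating like tags could look like in practice. The grouping rule used here (ignore case, hyphens and spaces) is a simplifying assumption of mine, as are the function names and sample tags; a real system would need a richer notion of which tags are alike.

```python
from collections import Counter, defaultdict


def simple_key(tag):
    """Hypothetical rule: tags are alike if they match once case, hyphens and spaces are ignored."""
    return tag.lower().replace("-", "").replace(" ", "")


def merge_like_tags(tags, key=simple_key):
    """Group like tags and label each group with its most popular spelling."""
    groups = defaultdict(Counter)
    for tag in tags:
        groups[key(tag)][tag] += 1
    merged = {}
    for variants in groups.values():
        label, _ = variants.most_common(1)[0]  # the de facto preferred term
        merged[label] = sum(variants.values())  # combined weight of all variants
    return merged


if __name__ == "__main__":
    tags = ["folksonomy", "Folksonomy", "folksonomy",
            "tag-cloud", "tagcloud", "tagcloud", "flickr"]
    print(merge_like_tags(tags))
```

Once the variants are merged, the combined count gives the popular term more weight than any single spelling had on its own.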

One initiative that has some promise is FaceTag, a semantic collaborative tagging tool, described in a recent article in ASIS&T Bulletin. It’s early days but FaceTag may be on the right road in looking at relational and hierarchical issues within tagging folksonomies.