Wednesday, January 31, 2007

Answering my own tagging question

Yesterday I suggested that a concordance plus some sort of algorithm might be one way to tackle the Cold Start Problem (CSP) of folksonomies. Sometime in the middle of the night, proving the point that there is no such thing as an original idea, I realized that I had seen this suggested somewhere before. I got to thinking about where, and remembered that I had actually blogged about it last summer. Duh! From the WWW 2006 Edinburgh Collaborative Tagging Workshop:

Towards the Semantic Web: Collaborative Tag Suggestions
Zhichen Xu, Yun Fu, Jianchang Mao, and Difu Su. Yahoo! Inc.

Here they address the CSP:

3.5 Content-based Tag Suggestions
In addition to using tags entered by the real end-users as a source for tag suggestion, we can also suggest content-based (and context-based) tags based on analysis and classification of the tagged content and context. This not only solves the cold start problem, but also increases the tag quality of those objects that are less popular.
One simple way to incorporate auto-generated tags is to introduce a virtual user and assign an authority score to this user. The auto-generated tags are then attributed to this virtual user. The algorithm described in Table 1 remains intact. This mechanism allows us to incorporate multiple sources of tag suggestions under the same framework.

They don't actually say how they come up with the suggested content-based tags (might I suggest a simple concordance?), but they do suggest assigning these tags to a virtual user, and they even share their algorithm for assigning some authority to this user:
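For what it's worth, here is a rough sketch of how my concordance idea could plug into that virtual-user scheme. To be clear, this is not the paper's algorithm: the stop-word list, the `VIRTUAL_USER` name, the 0.5 authority score, and the scoring rule (relative word frequency times authority) are all my own assumptions, chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be far longer.
STOP_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "for", "on",
    "that", "this", "with", "as", "are", "be",
}

VIRTUAL_USER = "auto-tagger"  # hypothetical name for the virtual user
VIRTUAL_AUTHORITY = 0.5       # assumed authority score, not taken from the paper


def concordance(text):
    """Count how often each non-stop word (longer than 2 letters) occurs."""
    words = re.findall(r"[a-z][a-z'-]*", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)


def suggest_tags(text, top_n=5):
    """Return (user, tag, score) triples attributed to the virtual user.

    Assumed scoring rule: relative word frequency * virtual user's authority.
    """
    counts = concordance(text)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (VIRTUAL_USER, word, VIRTUAL_AUTHORITY * count / total)
        for word, count in counts.most_common(top_n)
    ]


if __name__ == "__main__":
    sample = (
        "Tagging helps people find content. "
        "Tagging systems suffer when new content has no tags."
    )
    for user, tag, score in suggest_tags(sample, top_n=3):
        print(user, tag, round(score, 3))
```

Because the suggestions come out as ordinary (user, tag, score) triples, they could in principle be fed into whatever ranking is already applied to tags from real users, which is the appeal of the virtual-user trick.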


So there you have it! We'll share our results next week when we've got it all coded up.

