Funny smiling vampires?

One of the things we like to do here at CueSense is spotting emerging trends. Tags and tag frequencies are a promising area, and we are happy to share a few interesting things we learned recently.

A tag’s frequency is the number of cues tagged with it. For example, if you added three bookmarks and tagged each of them “portugal vacation”, the frequency of that tag would be 3. The graph below shows the tag frequencies in CueSense in December 2008. The curve is strongly skewed to the right, with a very long tail: we had a small number of tags with lots of cues and a very large number of tags with just a few cues.
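To make the definition concrete, here is a minimal sketch of how tag frequencies could be counted. The `cues` list, its field names, and the `tag_frequencies` helper are made up for illustration and are not the actual CueSense data model.

```python
from collections import Counter

# Hypothetical data: each cue (bookmark) carries a list of tags.
cues = [
    {"url": "https://example.com/lisbon", "tags": ["portugal vacation"]},
    {"url": "https://example.com/porto", "tags": ["portugal vacation"]},
    {"url": "https://example.com/algarve", "tags": ["portugal vacation", "beach"]},
]


def tag_frequencies(cues):
    """Return a Counter mapping each tag to the number of cues tagged with it."""
    counts = Counter()
    for cue in cues:
        # Use a set so a tag repeated on the same cue is counted only once.
        counts.update(set(cue["tags"]))
    return counts


freqs = tag_frequencies(cues)
print(freqs.most_common())
```

Sorting the counts from largest to smallest, as `most_common()` does, gives the kind of ranked list that a frequency curve like the one above is drawn from.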

To understand what is going on here, we divided tags into two buckets: tags with frequencies larger than 1000, and tags with frequencies between 10 and 1000.
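The split into buckets could look roughly like the sketch below. The function name, the sample counts, and the choice to treat both 10 and 1000 as part of the middle bucket are assumptions; the post does not say how the boundaries are handled.

```python
def bucket_tags(freqs, low=10, high=1000):
    """Split a {tag: frequency} mapping into large, medium, and remaining tags."""
    buckets = {"large": {}, "medium": {}, "rest": {}}
    for tag, count in freqs.items():
        if count > high:
            buckets["large"][tag] = count
        elif count >= low:
            # Assumption: the boundaries 10 and 1000 belong to the middle bucket.
            buckets["medium"][tag] = count
        else:
            buckets["rest"][tag] = count
    return buckets


# Made-up frequencies, for illustration only.
sample = {"video": 5200, "news": 3100, "portugal vacation": 42, "my-cat": 3}
buckets = bucket_tags(sample)
print(sorted(buckets["large"]))
print(sorted(buckets["medium"]))
```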

The large-frequency tags are clearly common words such as video, news, or music. The only exceptions are the automatic tags used by tagging services, which tend to have large frequencies as well. If you would like to help us make CueSense better, refrain from using common words to tag your cues. Since there are so many of them already, we would have to filter these cues out.

Tags with 10 to 1000 cues show a similarly consistent distribution. We attached the distribution graph and the histogram for these tags. The first graph, btw, inspired the title of this post.

Why does all of this matter?

Our newsfeed filter includes an analytics engine that uses tag frequencies to find the best cues for you. It can also spot fast-rising (“hot”) tags, which we share in the Trend panels. We will report more of our findings as we tweak our matching algorithms, so stay tuned.
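As a rough illustration of what spotting a fast-rising tag might involve, the sketch below compares tag frequencies from two time windows and ranks tags by relative growth. This is not the CueSense engine: the `hot_tags` function, the add-one smoothing, the `min_count` threshold, and the sample numbers are all assumptions.

```python
def hot_tags(previous, current, min_count=10, top_n=5):
    """Rank tags by growth between two {tag: frequency} snapshots."""
    scored = []
    for tag, now in current.items():
        if now < min_count:
            # Skip very rare tags so a jump from 0 to 2 cues does not look "hot".
            continue
        before = previous.get(tag, 0)
        # Add-one smoothing avoids division by zero for brand-new tags.
        growth = (now + 1) / (before + 1)
        scored.append((growth, tag))
    scored.sort(reverse=True)
    return [tag for _, tag in scored[:top_n]]


# Made-up snapshots, for illustration only.
last_week = {"news": 1000, "twilight": 5}
this_week = {"news": 1050, "twilight": 60, "rare": 3}
print(hot_tags(last_week, this_week, top_n=2))
```

Ranking by relative growth rather than raw counts keeps common words like news from dominating the list just because they are already large.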

