AnthoBLOGy


 

Vikas Kamat
 Vikas Kamat is a programmer-entrepreneur living in Atlanta. This blog is a complex mix of Indian culture, life in the southern USA, computer science, and sports. Opinions are his own.
 About - Bio - Contact



 


 

Computing, Libraries, Tennis, India & other interests of Vikas Kamat

Computing with Tags
I am happy to add a new "Browse by Tags" feature to Kamat's Potpourri.

We have supported tags internally since 1997 (we called them CrowWords, a play on "keywords", after my company CyberCrow), but they were designed for machine learning rather than for humans.

Anyway, adding tags alongside keywords and CrowWords gives us a rich platform to classify and organize the website's content.

It is indeed fascinating to see the disparate content items come together.

Hope you like 'em. Examples: Hampi, Hinduism

A Note About Tag Generation

It is wonderful if you have the luxury of human-tagged content (as Flickr, Technorati, or Delicious have). But what if your content is not already tagged?

That's when Keyword Extraction comes into play. We had implemented a rudimentary extractor for our -- if a letter was capitalized, it was given a weight and a tag was started. I don't have any research or study on how effective it has been.
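The post gives only the outline of that rudimentary extractor, so the following is a rough sketch of the idea rather than the original code. It assumes that a run of adjacent capitalized words forms one candidate tag and that the weight is simply how often the run appears; the function name `extract_capitalized_tags` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter


def extract_capitalized_tags(text):
    """Collect runs of capitalized words as candidate tags, weighted by count."""
    # A run is one or more adjacent words that each start with an uppercase
    # letter (an assumption; the original rules are not described in detail).
    runs = re.findall(r"[A-Z][A-Za-z]*(?:\s+[A-Z][A-Za-z]*)*", text)
    # Lowercase the runs so "Hampi" and "hampi" count as the same tag.
    return Counter(run.lower() for run in runs)


sample = "The ruins of Hampi lie near the Tungabhadra River. Hampi was a capital."
weights = extract_capitalized_tags(sample)
```

A sketch this simple also picks up sentence-initial words such as "The", which is one reason a capitalization-only approach stays rudimentary.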

Then Yahoo introduced an Extraction API. In my experience, it is a wonderful service (if you can tolerate the 503s it seems to throw often), and we have used it to supply generated tags whenever human-provided tags were not available. I had to do a lot of tuning, because most of the tags Yahoo suggested had "kamat" or "india" in them. But this automation allowed us to gather some 192,000 unique tags!

After gathering the tags, I built a matrix to view the tag density. This exposes both the strengths and weaknesses of our content, and helps in generating Tag Clouds.
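The actual tuning is not described here, so this is only a guess at its general shape: retry when the service is temporarily unavailable, then discard suggestions dominated by site-wide words. Nothing below calls the real Yahoo API; `fetch` is a hypothetical callable supplied by the caller, and `ServiceUnavailable`, `call_with_retry`, and `drop_site_wide_tags` are illustrative names.

```python
import time


class ServiceUnavailable(Exception):
    """Stand-in for an HTTP 503 response from an extraction service."""


def call_with_retry(fetch, retries=3, delay=1.0):
    """Call fetch(), retrying with a growing pause when it reports a 503."""
    for attempt in range(retries):
        try:
            return fetch()
        except ServiceUnavailable:
            if attempt == retries - 1:
                raise  # give up after the last attempt
            time.sleep(delay * (2 ** attempt))


# Words that appear on nearly every page of the site (taken from the post).
SITE_WIDE_TERMS = {"kamat", "india"}


def drop_site_wide_tags(tags, stop_terms=SITE_WIDE_TERMS):
    """Keep only the tags that contain none of the site-wide terms."""
    kept = []
    for tag in tags:
        words = set(tag.lower().split())
        if not words & stop_terms:
            kept.append(tag)
    return kept
```

For example, `drop_site_wide_tags(["kamat potpourri", "hampi", "temples of india"])` keeps only `"hampi"`. A real filter would likely be gentler, since a bare tag such as "india" can still be worth keeping.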

Generation of Tag Clouds

The comma-delimited tags are transposed into rows using SQL and are ranked by number of repetitions. Physical proximity is also weighted in.

With all this infrastructure in place, generating a Tag Cloud is simply a matter of querying the database with various SELECT clauses.
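To make the transpose-and-rank step concrete, here is a small sketch using Python's built-in sqlite3 module. The table name `page_tags`, its columns, and the sample pages are assumptions, not the site's real schema. For brevity the comma-delimited strings are split in Python before insertion rather than in SQL, and the proximity weighting is left out.

```python
import sqlite3


def rank_tags(pages):
    """Rank tags by how many pages use them.

    pages maps a page id to its comma-delimited tag string.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_tags (page_id TEXT, tag TEXT)")
    for page_id, tag_string in pages.items():
        # One row per (page, tag); the set removes duplicates within a page.
        tags = {t.strip().lower() for t in tag_string.split(",") if t.strip()}
        conn.executemany(
            "INSERT INTO page_tags (page_id, tag) VALUES (?, ?)",
            [(page_id, tag) for tag in tags],
        )
    # Most repeated tags first; ties are broken alphabetically.
    rows = conn.execute(
        "SELECT tag, COUNT(*) AS uses FROM page_tags "
        "GROUP BY tag ORDER BY uses DESC, tag ASC"
    ).fetchall()
    conn.close()
    return rows


pages = {
    "hampi.htm": "hampi, india, photo",
    "diwali.htm": "diwali, india, photo",
    "mysore.htm": "mysore, india",
}
ranking = rank_tags(pages)
```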
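As an illustration of the final step, the sketch below turns (tag, count) pairs, such as a ranking query might return, into an alphabetical cloud with a display size per tag. The linear scaling and the 10 to 24 size range are assumptions chosen for the example; the post does not say how sizes are actually computed.

```python
def build_tag_cloud(tag_counts, min_size=10, max_size=24):
    """Map (tag, count) pairs to (tag, font_size) pairs, sorted by tag."""
    if not tag_counts:
        return []
    counts = [count for _, count in tag_counts]
    low, high = min(counts), max(counts)
    spread = high - low
    cloud = []
    for tag, count in sorted(tag_counts):
        if spread == 0:
            size = min_size  # every tag is equally common
        else:
            # Scale linearly between min_size and max_size.
            size = min_size + round((count - low) * (max_size - min_size) / spread)
        cloud.append((tag, size))
    return cloud


cloud = build_tag_cloud([("india", 3), ("photo", 2), ("zoo", 1)])
```

Here the most repeated tag gets the largest size and the least repeated gets the smallest, with everything else falling in between.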

A Sample Tag Cloud:

beauty - blog - diwali - games - goddess - huntress - india - lost - photo - photographs - potpourri - wheels - zoo - accidental art - ant - aperture - beholder - child delivery - color yellow - colors - cricket - cultural anthropology - destiny - friends - god - illustrations - jewelry - kamasutra - mirror project - mysore - opaque - photo album - prehistoric - professions - refugee - seductress - spokes - torn - tri

(Comments Disabled for Now. Sorry!)
First Written: Wednesday, October 4, 2006
Last Modified: 10/6/2006 2:55:25 AM
Tags: tagclouds


 

About Me:

SimplyBlog

Powered NOT by Blogger or Movable Type or WordPress, but by SimplyBlog, software I wrote to create blogs.
See details of implementation or download SimplyBlog.

 

 


© 1996-2022 Kamat's Potpourri. All rights reserved. Do not reproduce without prior permission. Standard disclaimers apply
