What do Stanford grads Anand and Venky have in common with Google's founders, Larry and Sergey? One, they were batchmates. And two, they work in perhaps the most crowded and competitive of web domains: search.
Kosmix.com, the topical search engine founded by the Indian duo, attempts to untie the Gordian knot of “meaning” in a search query. The technology they believe will do the work is “categorization”: sorting results into verticals and searching the web for links relevant to each vertical. Categories currently available on the site are health, travel, autos, finance, video games and, very interestingly, politics (alpha release).
Though they don’t (at least openly) claim to be out to dethrone Google, the history of Kosmix’s founders makes for very interesting reading.
So what’s under the hood of Kosmix?
While the exact formula is their sacred intellectual property, what the Kosmix folks term “Deep Technology” scours the web for content (not just text) specific to a category and ranks the results using a categorization algorithm. Rank is determined by matching the “meaning” of a link to the search query. The point to note is that “meaning” is measured as the extent to which pages with similar content point to a link. A link can therefore be meaningful even if it does not contain the query keywords. This makes Kosmix ideal for topical search, where you know which category your query falls into and you want more information.
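Since the real formula is not public, the following is only a toy sketch of the idea described above: a page scores higher in a category when more pages already judged to be about that category link to it. The function names, the `on_topic` set and the sample page names are all hypothetical, and a plain in-link count stands in for whatever Kosmix actually computes.

```python
from collections import defaultdict


def topical_inlink_scores(links, on_topic):
    """Count, for each target page, the distinct on-topic pages linking to it.

    links    -- iterable of (source, target) pairs (hypothetical crawl data)
    on_topic -- set of pages already judged to belong to the category
    """
    scores = defaultdict(int)
    for source, target in set(links):  # ignore duplicate links
        if source in on_topic and source != target:
            scores[target] += 1
    return dict(scores)


def rank(links, on_topic):
    """Order pages by score, highest first; ties are broken by name."""
    scores = topical_inlink_scores(links, on_topic)
    return sorted(scores, key=lambda page: (-scores[page], page))


# Hypothetical example: two travel pages link to "hawaii-sights",
# so it outranks "car-loans" even though no keywords are compared.
links = [
    ("travel-blog", "hawaii-sights"),
    ("island-guide", "hawaii-sights"),
    ("travel-blog", "car-loans"),
    ("spam-page", "car-loans"),
]
travel_pages = {"travel-blog", "island-guide"}
ranking = rank(links, travel_pages)
```

Note that the keywords of the query never enter the score here, which mirrors the claim that a link can be meaningful without containing them.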
So, how’s Kosmix different from Google?
For one, in the words of Mark Johnson, product manager at Kosmix, the engine does “informational” search, giving you links to material around the topic you search for (different points of view). For example, if you search for travel tips for Hawaii, you could be interested in sight-seeing too.
“Google’s second button is named ‘I’m feeling lucky.’ Google’s goal is to figure out what you want in the first result (or at least the top ten) and deliver it to you.”
Google does “navigational” search, which is best when you know exactly what you want. The whole point is to reduce the number of clicks it takes for you to arrive at the desired data. And that’s what Kosmix intends to do.
Also, Google doesn’t categorize results in its interface by default, so that’s another plus. And yes, the areas targeted by Kosmix (health, travel) are heavily spammed categories, so fewer spam results deserve credit.
The Future: The founders recently received approval for their patent on a hybrid human/computer system that uses human assistance to optimize computer efficiency, filed while they were working at Amazon.
Extrapolating from the patent: combine the users of the engine with the main servers, and give them the task of measuring the relevancy of results (by click-throughs, time spent on a link, and their deep technology). The end result could be a massively scaled system that depends on human-generated relevancy, ranked by servers in real time.
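To make that extrapolation concrete, here is a minimal sketch of how click-throughs and time spent on a link could be turned into a relevancy score that a server then sorts by. Everything in it is an assumption made for illustration (the `Feedback` record, the 60-second cap, the way the two signals are multiplied); nothing here comes from the patent or from Kosmix.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Aggregated user signals for one result (hypothetical record)."""
    impressions: int = 0        # times the result was shown
    clicks: int = 0             # times it was clicked
    dwell_seconds: float = 0.0  # total time users spent after clicking


def relevancy(fb, dwell_cap=60.0):
    """Combine click-through rate and capped average dwell time into [0, 1]."""
    if fb.impressions == 0 or fb.clicks == 0:
        return 0.0
    click_through_rate = fb.clicks / fb.impressions
    avg_dwell = min(fb.dwell_seconds / fb.clicks, dwell_cap) / dwell_cap
    return click_through_rate * avg_dwell


def rerank(results, feedback):
    """Sort results by human-generated relevancy, highest first.

    sorted() is stable, so results with equal scores keep their
    original (machine-assigned) order.
    """
    return sorted(
        results,
        key=lambda url: relevancy(feedback.get(url, Feedback())),
        reverse=True,
    )


# Hypothetical signals: "b" is clicked more often and read for longer.
feedback = {
    "a": Feedback(impressions=100, clicks=10, dwell_seconds=100.0),
    "b": Feedback(impressions=100, clicks=20, dwell_seconds=1200.0),
}
reranked = rerank(["a", "b", "c"], feedback)
```

Humans supply the signal and the server only does the sorting, which is the division of labour the patent seems to hint at.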
I speculate that the perfect blend of man and machine is to leave relevancy to humans and ranking to computers. Will Kosmix’s formula be perfect for “meaning”? Only time will tell.