
Textonomy : Semantics Based ‘Sense Engine’

Semantics-based search is the running theme of my articles, and in this one I’ll focus on the Textonomy suite of products from Crystal Semantics, since acquired by the e-advertising firm ad pepper media.

The company was co-founded by Professor David Crystal, a world authority in linguistics. At the heart of the Textonomy product is a “sense engine”, the product of more than 8 years of linguistic research, which evaluates the semantics and stylistics involved in word usage. While semantics is the common ingredient of all meaning-based search engines, stylistics refers to the localized usage of words (contextual distinctiveness).

Disambiguating the Textonomy search scheme

The origins of the linguistic project date back to the need to classify data for the Cambridge Encyclopedia. The requirement was later extended to incorporate data from a number of other encyclopedias, and the resulting huge database served as a taxonomical repository that marries dictionary definitions with encyclopedic classifications. The result: an engine that computes the contextual meaning of words by relating dictionary words to encyclopedic categories.
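To make the idea of relating dictionary words to encyclopedic categories concrete, here is a minimal sketch of how context can pick a sense. It is my own toy illustration, not Textonomy's method: the `SENSE_CATEGORIES` lexicon, the category names, and the `disambiguate` function are all hypothetical, and simple vote counting stands in for whatever the real engine does.

```python
# Hypothetical toy lexicon: each word maps to the encyclopedic categories
# in which it can carry a sense. This is invented data, not Textonomy's.
SENSE_CATEGORIES = {
    "jaguar": {"zoology", "motoring"},
    "engine": {"motoring", "computing"},
    "speed": {"motoring", "physics"},
    "habitat": {"zoology"},
    "prey": {"zoology"},
}


def disambiguate(word, context):
    """Return the category of `word` best supported by its context words.

    Each context word votes for every candidate category it shares with
    `word`. Returns None when the word is not in the lexicon.
    """
    target = word.lower()
    candidates = SENSE_CATEGORIES.get(target, set())
    if not candidates:
        return None
    votes = {category: 0 for category in candidates}
    for other in context:
        other = other.lower()
        if other == target:
            continue  # the word itself does not count as context
        for category in SENSE_CATEGORIES.get(other, set()) & candidates:
            votes[category] += 1
    # Sort first so that ties are broken alphabetically and deterministically.
    return max(sorted(votes), key=lambda category: votes[category])


animal_text = "the jaguar stalks its prey in its habitat".split()
car_text = "the jaguar has a powerful engine and top speed".split()
print(disambiguate("jaguar", animal_text))  # zoology
print(disambiguate("jaguar", car_text))     # motoring
```

The same word lands in a different category depending on its neighbours, which is the essence of computing contextual meaning from a taxonomy.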

At the core of the engine are three components:

  • A page analyzer that analyses HTML content and extracts the data to be sent to a “black box” for categorization.
  • A black box that matches the text on the page against the categories in the taxonomy (up to 2500 categories) and ranks the categories according to the usage of the words.
  • A reporting interface that can present the data in a user-defined or XML format, to be used to place ads or generate results as required by the client.
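The three components above can be sketched as a small pipeline. This is only an illustration under my own assumptions: the three-category `TAXONOMY`, the keyword-counting score, and the XML element names are invented for the example, and the real black box is certainly more sophisticated than term counting.

```python
from collections import Counter
from html.parser import HTMLParser
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature taxonomy (the real one has up to 2500 categories).
TAXONOMY = {
    "motoring": {"car", "engine", "driver", "road"},
    "zoology": {"animal", "habitat", "prey", "species"},
    "finance": {"bank", "loan", "interest", "market"},
}


class PageAnalyzer(HTMLParser):
    """Component 1: collect the visible words of an HTML page."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(re.findall(r"[a-z]+", data.lower()))


def extract_words(html):
    analyzer = PageAnalyzer()
    analyzer.feed(html)
    analyzer.close()
    return analyzer.words


def categorize(words):
    """Component 2: score each category by term hits, ranked best first."""
    counts = Counter(words)
    scores = {
        category: sum(counts[term] for term in terms)
        for category, terms in TAXONOMY.items()
    }
    matched = [(category, score) for category, score in scores.items() if score > 0]
    # Highest score first; ties broken alphabetically by category name.
    return sorted(matched, key=lambda pair: (-pair[1], pair[0]))


def report_xml(ranked):
    """Component 3: present the ranked categories as an XML string."""
    root = ET.Element("categories")
    for rank, (category, score) in enumerate(ranked, start=1):
        ET.SubElement(root, "category", name=category, rank=str(rank), score=str(score))
    return ET.tostring(root, encoding="unicode")


page = (
    "<html><head><style>car {}</style></head><body>"
    "<p>The car engine roared down the road.</p><p>A bank loan.</p>"
    "</body></html>"
)
ranked = categorize(extract_words(page))
print(ranked)  # [('motoring', 3), ('finance', 2)]
print(report_xml(ranked))
```

An ad server consuming this report would pick ads for the top-ranked category. Keeping the reporting step separate from the categorizer is what lets a client swap XML for any other user-defined format.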

The company is headquartered in England, and its products are available to advertising companies that want to improve the relevance of their ad placement. Recently the company has made available technologies that operate on both the server and the client side, effectively addressing both ends of the ad delivery spectrum.

Linguistic search needs clear differentiators

As mentioned in my previous articles, search-oriented companies need to focus increasingly on integrating non-text-based meaning engines as more and more online content becomes media rich.

There are a number of firms leveraging linguistic techniques for targeted advertising, but a major differentiator would be a firm that targets not just meaning in text, but meaning in media as a whole.


2 thoughts on “Textonomy : Semantics Based ‘Sense Engine’”

  1. Arun,

    Very interesting series of articles. Have you seen SenseBot? It is our semantics-based search engine, which would fit into the group you’ve been describing. The key differentiator, though, is that we introduce a new concept of a search result – a textual summary of all top relevant pages, e.g. an overall digest on the user’s query topic. The summary itself becomes the main result of the Web search, as opposed to snippets of individual pages.