Why Mahalo will Fail and the Problems with General Search

Jason Calacanis talked at Gnomedex 7 in September 2007 about the days on the internet when only a handful of new sites appeared on the web each day, so they could be announced via an email newsletter and checked out by everybody every day.

The sites were generally good and rich in content. Junk was the exception, and so were email spam, comment spam and other worthless content.

Yes, everything is nice and clean while it is small, non-mainstream and not commercially exploited. It is also easier to manage and organize, and easier for people to know each other. If you can list everything there is on one page or even a few pages, you don’t run into problems. The page evolved into a directory with more and more categories. The directory was called Yahoo! and it worked fine for a long time, until the internet grew faster than anyone could keep up with and list in a directory. It also became harder and harder for users to find what they were looking for in this giant directory with hundreds of thousands of listings. The Dmoz directory was able to keep up for a while thanks to its open nature and vast number of volunteer editors, but at some point even that was not enough anymore.

Full text, crawler based search engines became more important as the internet grew and grew. Those search engines were no longer edited by humans. Algorithms replaced editors and, at the same time, opened the door to manipulation and cheating.

A full text search engine is like a web directory with millions of top level categories. Each search term is like its own category in a web directory. While it “has” an incredible number of categories, which are easily accessible if you know about them, they are also very limiting at the same time. You cannot browse them and they are not structured like the hierarchical categories of a classic web directory. Unlike a classic web directory, a typical search engine result page has only 10 entries. Web directories might have anywhere from 1 to 50 or more entries for a category and show all of them on one page. People who had sites listed in very popular categories knew that a listing near the top is better than a listing near the end. Because web directories often sorted the entries for a category simply alphabetically, site names that started with “A”, and later even with a number, became very popular and the favorite practice to “skew” directory listings.
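To make the alphabetical trick concrete, here is a minimal sketch in Python. The site names are made up for illustration, and the assumption that a directory sorts names case-insensitively is mine, not a description of how Yahoo! or Dmoz actually ordered their listings.

```python
# Hypothetical listings for one directory category; all names are invented.
listings = [
    "Widget World",
    "AAA Widgets",
    "Best Widgets",
    "1st Choice Widgets",
    "Zenith Widgets",
]


def directory_order(names):
    """Sort names alphabetically, ignoring case (an assumed directory rule)."""
    return sorted(names, key=str.casefold)


ordered = directory_order(listings)

# Print the category page from top to bottom.
for position, name in enumerate(ordered, start=1):
    print(position, name)
```

In a plain character sort, digits come before letters, so the name that starts with a number lands above the one that starts with “AAA”, regardless of which site is actually better.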

Directories started to experiment with ranking entries based on quality rather than just alphabetically, but full text search engines really made this aspect part of the core of their business. Early means of qualifying sites were some form of rating mechanism. It quickly became clear that it is very hard to implement a rating mechanism that resists manipulation by people who try to trick the system into ranking their stuff higher artificially. The financial incentives for doing so were, and are to this day, a strong motivator to attempt to trick any such system.

Modern crawler based search engines continue to try to rank results based on quality, even though the mechanisms used to “rate” the quality of listings have become much more complicated over the years. They became more complicated in order to make it harder for people to manipulate the rating results. The incentive to manipulate rankings increases over time as more and more people go online and use the Internet and search engines as the means to find stuff.

At the same time, it also gets harder for legit and honest content producers to rank in those results where they should. More and more quality content becomes available that fits each of those millions of “categories” or search result pages. Short, easy to remember and widely used “categories” are getting more and more crowded. The number of listings for each of those categories, though, has remained at only 10.

There are no “sub categories” in modern search engines that allow users to drill down further into the details of the current category, although search engines have started to experiment with “suggestion” features that serve a similar purpose.

Add to that the fact that modern search engine ranking algorithms use a large number of factors for their ranking score, which are for the most part still dumb, technical and artificial (even though their purpose is to emulate intelligent human behavior). The more of these factors a webmaster ignores, the lower his ranking score will be. These are factors that have nothing to do with the quality of the actual content itself, only with how it was published on the internet.

White hat, legit search engine optimization is about making sure that those technical factors are met properly, to allow the content itself to rank well where it deserves to rank. Of course, this can be taken further to manipulate factors artificially and increase the ranking of content that does not deserve it. The more artificial manipulation is done, the more questionable it gets, and pure and legitimate optimization turns from white hat through gray hat into what is called black hat SEO.

The epitome of black hat SEO is getting worthless content, garbage that is of no use to anybody, to rank high in categories that are already crowded enough with good content. It is also called spam.

Declaring SEO as a whole to be spam, and saying that it is only an unethical method used to skew search engine results, is not only wrong but also irresponsible, because it causes people to believe that their content will rank in search engines where it should, regardless of how they publish the content on the web and what else they do with it. That is bulls*, just as much as saying that SEO is bulls*.

The problem with SEO is that there are no defined lines between what is good and ethical, what is questionable, and what is flat out bad and selfish. The search marketing industry has so far failed to establish rules that define good practices and outlaw bad practices, which would allow players who play by the rules to distance themselves from unethical business practices. This lack of self regulation hurts the industry as a whole, and it might hurt even more if the government steps in to define the rules for everybody based on its limited (and often wrong) understanding of the realities of this business. Direct marketers, for example, were so full of themselves and unable to do something on their own that the government stepped in to force the CAN-SPAM Act upon them. The idea was to reduce email spam, which is to a large degree caused by unethical direct marketers. It failed miserably, because the government did not know better. The spam was not reduced, but the regulation caused a lot of issues for legitimate direct marketers who uphold ethical business practices.

Affiliate marketing has the same issue, probably even worse than search engine marketing does. This does not make affiliate marketing BS or useless, nor does it make every affiliate marketer a selfish, spamming and Internet polluting piece of useless faeces, which is basically how Jason Calacanis summarized the industry during his keynote at Affiliate Summit a few days ago.

Jason, your Mahalo project will not solve the problems. You cannot be comprehensive, up to date and free of manipulation and skewing at the same time. You might manage it at the beginning, while everything is small and limited in size, but that will change more and more as you grow and diversify. You cannot remain clean and complete at the same time. I made recommendations at Mahalo for some niche subjects where no content exists today, and they have not been reviewed yet. Some of them were made over 50 days ago; fortunately, the content is timeless and not related to current events. What do you think will happen when your project multiplies in size over the coming months and years? It will only get worse.

Nobody has figured out a solution for the problems that come with general search, and I don’t even know if there is a solution at all. I personally believe that the future will be specialized search and segmentation. General search will change and become the means to find those specialized search engines and segments. Those specialized search engines will be embedded in highly specialized social communities and be maintained by members of that community to ensure the highest possible relevance of results within that particular niche. Some niches will do a better job than others, and multiple competing communities will strive to make their results better than their competitors’ and, by doing so, naturally improve the quality of the results.

The amount of fragmentation will differ from subject to subject and be determined by the size of the community that identifies itself with that specific subject. It will be a very dynamic space, which makes it important that general search engines do a good job of identifying those groups early and accurately. Since the volume of content to review and rank in each of those individual engines (including the general ones) will be limited to a manageable size (or be broken up into several more engines if it grows too large), quality can be controlled and kept high, while skewing it will be virtually impossible (unless the whole community is skewed).

This is my personal prediction of the future of search. What do you think will happen? Is there a solution for general search to be comprehensive, up to date and free of manipulated results? Please use the comments feature below to share your thoughts about this subject.

Cheers!

Carsten Cumbrowski

Carsten is an entrepreneur, internet marketer, affiliate and search engine marketing advocate and a “free spirit”. Find out more about Carsten.
“Complaints without suggestions for making it better are useless.”
