Open Source Search Engines Give Non-Commercial Alternatives

Recently, as commercial search engines such as Google, Yahoo, Ask Jeeves, and MSN Search extend their reach among web surfers and grow stronger by the moment, the “non-commercial” crowd has reacted by working on an alternative. Open source activists have put together the Nutch search engine technology, which may offer an alternative to the mainstream search engine field.

According to the Nutch project, it provides a transparent alternative to commercial web search engines. The project argues that only open source search results can be fully trusted to be without bias (or at least their bias is public). All existing major search engines have proprietary ranking formulas and will not explain why a given page ranks as it does. Additionally, some search engines determine which sites to index based on payments rather than on the merits of the sites themselves. Nutch, on the other hand, says it has nothing to hide and no motive to bias its results or its crawler in any way other than to try to give each user the best results possible.

Over the past week, three open source search engines have attracted the attention of the search community: two use Nutch, and one is still in the idea/development stage.


MozDex

This month MozDex, an open source search engine built entirely on open source technologies, has been tweaking and refining its search results while in beta testing. Currently in “deep crawl,” MozDex plans to complete full indexing within the upcoming weeks. “MozDex offers one of the first OPEN search systems based on publicly available software, APIs and algorithms,” said Byron Miller, President at Small Productions. “There is no secrecy into understanding the results or ranking thereof, offering the first public insight into an open index.”

Objects Search

Objects Search has launched a clustering search engine based on the open source technology Nutch. Its Clustering Engine is a system for clustering textual data. This engine automatically categorizes search results on the fly into hierarchical clusters.

Search results clustering attempts to overcome the problem of information overload: most search engines rely on keyword queries and return endless lists of matching documents. Unfortunately, even when exceptional ranking algorithms are used, relevance sorting inevitably equates quality with some notion of popularity among what can be found on the Web.

One approach is to automatically group search results into thematic categories, called clusters. Assuming the cluster descriptions are informative about the documents they contain, the user spends much less time following irrelevant links.
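To make the idea concrete, here is a minimal Python sketch of grouping result snippets into labeled clusters. It is not the Objects Search or Nutch implementation: the function names, the tiny stopword list, and the rule "use the terms shared by the most snippets as labels" are all assumptions chosen for illustration, and the output is a flat grouping rather than a hierarchy.

```python
import re
from collections import Counter, defaultdict

# A deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "is", "on", "with"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword terms."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def cluster_results(query, snippets, max_clusters=3):
    """Group snippets under the terms that the most snippets share."""
    query_terms = tokenize(query)
    # Query terms appear in every result, so they make poor labels.
    term_sets = [tokenize(s) - query_terms for s in snippets]

    # Document frequency: in how many snippets does each term appear?
    doc_freq = Counter(term for terms in term_sets for term in terms)

    # Candidate labels must occur in at least two snippets; sort by
    # frequency (descending), then alphabetically for stable output.
    candidates = sorted(
        ((term, n) for term, n in doc_freq.items() if n >= 2),
        key=lambda item: (-item[1], item[0]),
    )
    labels = [term for term, _ in candidates[:max_clusters]]

    clusters = defaultdict(list)
    for snippet, terms in zip(snippets, term_sets):
        matched = [label for label in labels if label in terms]
        # Snippets matching no label fall into a catch-all cluster.
        for label in matched or ["other"]:
            clusters[label].append(snippet)
    return dict(clusters)
```

Given results for the query "jaguar" about cars, rainforests, and an operating system, a sketch like this would file the car and rainforest snippets under those labels and leave the rest in "other". A real clustering engine would use phrases rather than single words and would build the hierarchy the article describes.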


OpenIndex

According to Research Buzz, OpenIndex is not quite an open search engine project, but more of an index (as the simple name states) or a community-built search engine. The project says it does not yet have the hardware to power a huge web index, and it is open to ideas from users who join its community.

OpenIndex puts forth the idea of a decentralized search index powered by many computers: “Although we wouldn’t likely have large computers available, we could have many small ones, contributed by interested volunteers, and distributed across the community – even across the globe. Perhaps it’s the only way to have a publicly owned and operated index - it certainly seems appropriate.

A distributed system of servers would apportion all of the tasks of running an index among them. This would create a massive system of computers running in parallel, doing tasks as they are required. Costs would be distributed among the servers.”
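OpenIndex does not say how the work would be apportioned, so the following Python sketch is only one possible reading of the idea. It assumes the task is "which volunteer machine handles which URL" and uses a hash of the URL to decide; the function names and the node names in the usage note are hypothetical.

```python
import hashlib


def assign_node(url, nodes):
    """Pick the node responsible for a URL by hashing the URL.

    Every participant computes the same answer without a central
    coordinator, as long as they share the same ordered node list.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def apportion(urls, nodes):
    """Split a list of URLs into per-node work lists."""
    work = {node: [] for node in nodes}
    for url in urls:
        work[assign_node(url, nodes)].append(url)
    return work
```

Calling `apportion(urls, ["volunteer-a", "volunteer-b", "volunteer-c"])` returns a dictionary mapping each node name to the URLs it should crawl and index. One known limitation of this simple modulo scheme is that most URLs change owner whenever a volunteer joins or leaves; consistent hashing is a common way to reduce that churn.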

Loren Baker
Loren Baker is the Founder of SEJ, an Advisor at Alpha Brand Media and runs Foundation Digital, a digital marketing strategy & development agency.
  • Carlos Saltos

    Local search engines are also needed for private content, and Oxyus and other open source local search engines may be used.

  • AMendez Jr

    Google, Yahoo, and MSN just provide commercial links as priority results. They are no longer an alternative for obtaining information.

  • a.erol

    open source is good, but creativity is hiding the source! this is the dilemma of open source programming!

  • Rob

    I would like to see a search engine that is non-commercial in the sense that it does not list sites that use advertising. It seems like more and more areas of information are becoming hopelessly obfuscated by sites whose only object is to get people to click on ads.

  • Chris Manhoff

    Thanks for getting me started in the right direction. I was looking for some basic information about computer freeze ups (just informational, I’m not even having the problem) and every site I found was an advertisement for a registry cleaner. I know the purely informational sites are out there, I just need to learn how to find them.

  • Grover

    Frustrating, isn’t it? I often skip ten pages in or so to get past all the crapola that passes for content these days. An alternative — Open Source or otherwise will arise — that’s the power of the marketplace of ideas.

  • sofyblu

    totally frustrated by commercial search engine… but didn’t find an alternative here for the non-tech user that I am…

  • Chrism

    A good resource is to search the database of pdf files only. Using Google, you would go into advanced search, enter your search word(s), then specify .pdf in the file type drop down menu. This is a great ad-free informational alternative.