It is once again time to discuss the use of the nofollow attribute for links on the English-language Wikipedia. Many of the Wikipedias in other languages already use the attribute, and the MediaWiki platform, which powers Wikipedia and thousands of other wikis, has nofollow switched ON by default for new installations.
Matt Cutts from Google was recently asked for his opinion on this, and he responded that he would also like to see nofollow enabled, “but work out ways to allow trusted links to remove the nofollow attribute”. Well, that is the same old line we hear time and again, but it is no solution to the problem. I took the time to address some of the previous discussions and to describe what I believe the real problem is. None of this is specific to Wikipedia, which is why I am posting it here. Google already took a step in the right direction when it demoted the value of talk pages at Wikipedia. I blogged about that earlier this summer.
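To make the two options concrete, here is a minimal sketch of how a wiki could decide whether an external link gets the attribute. The function name `render_external_link` and the `TRUSTED_HOSTS` set are made up for illustration and are not part of MediaWiki; an empty set corresponds to the "no exceptions" rule, a non-empty one to the "trusted links" idea.

```python
from html import escape
from urllib.parse import urlparse

# Hypothetical whitelist, for illustration only. Somebody would have to
# maintain and defend this list, which is exactly the hard part.
TRUSTED_HOSTS = {"example.org"}


def render_external_link(url, text, trusted_hosts=TRUSTED_HOSTS):
    """Render an anchor tag, adding rel="nofollow" unless the host is trusted."""
    host = (urlparse(url).hostname or "").lower()
    rel = "" if host in trusted_hosts else ' rel="nofollow"'
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


# "Trusted links" variant: only hosts on the list lose the attribute.
print(render_external_link("http://example.org/page", "A trusted source"))
# "No exceptions" variant: pass an empty set and every link is nofollow.
print(render_external_link("http://example.org/page", "Any source", set()))
```

The code is trivial either way; the whole argument is about who gets onto the list and whether a list should exist at all.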
Here are the parts of my argument that I believe are relevant beyond Wikipedia.
When it comes to links, I think that a link MUST first and foremost be relevant to the article and add value to it. For whom? The human visitor who is reading the article. Search engines are secondary. I will come back to that a bit later.
If some external links carry the attribute and some don’t, the effectiveness of the attribute is reduced considerably. If the rule is general, with no exceptions, even the spammer in the furthest corner of the planet will learn about it. It would be big news and send out a message that is only a whisper today.
“PageRank is dead!” - The original principle is becoming less and less of a factor for the major search engines because of massive abuse. Of the big three, MSN has the least developed algorithms and is the most susceptible to attacks. This was “nicely” demonstrated this summer.
But even Google and Yahoo have problems with this that are still not solved effectively. The search engines know that, and the only tactic they have today is to scare people to death, reducing the problem as much as they can to buy themselves time to actually come up with a solution. Don’t buy links, don’t exchange links, don’t cross-link sites you own, add nofollow, etc.
They could have said: “Either don’t add, or at least flag, any link that you would not protect with your life, or at least ‘co-sign’ for, because we do a bad job of determining the intent and relevance of links at the moment. We are working on it, but in the meantime please help us to suck less.”
Okay, let us help them and add nofollow to any link that does not point to somebody I would trust with my life. Let that become standard practice, and Google will be able to calculate PageRank in real time, because there will not be much left to calculate. Maybe that would push the search engines to increase their efforts to come up with a solution that works better.
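To illustrate what "not much left to calculate" means, here is a toy version of the textbook PageRank iteration. This is only a sketch: it is not Google's actual algorithm, and it assumes that a nofollow link is simply dropped from the link graph before anything is calculated. The page names and the `pagerank` function are invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. links is a list of (source, target, nofollow) tuples.

    Assumption for this sketch: nofollow links are ignored entirely.
    """
    nodes = sorted({n for s, t, _ in links for n in (s, t)})
    out = {n: [] for n in nodes}
    for source, target, nofollow in links:
        if not nofollow:
            out[source].append(target)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            # Pages without followed outbound links spread their rank evenly.
            targets = out[n] or nodes
            share = damping * rank[n] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank


followed = pagerank([
    ("wiki", "good", False),
    ("wiki", "spam", False),
    ("good", "wiki", False),
])
nofollowed = pagerank([
    ("wiki", "good", False),
    ("wiki", "spam", True),   # same link, now flagged nofollow
    ("good", "wiki", False),
])
print(followed["spam"], nofollowed["spam"])
```

In this toy graph the spam page's score drops as soon as its only inbound link is flagged. If every link is flagged, every page ends up with the same score and the calculation tells you nothing.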
Until then, spammers will care about links nobody sees, even if only 1 in a thousand produces results. If the result were $0.0000000 after a week of work, nobody would spam in that fashion, unless it was for research purposes.
Spammers would then need to shift their focus ENTIRELY to areas where the spam is seen by at least one human, and there must be at least a remote chance that this human will act on it in a way that benefits the spammer. If you spam a site whose readership is 100% devout Muslims and offer delicious pork chops, your conversion will be 0, regardless of whether you spam once or a billion times.
The same result would be achieved by adult entertainment merchandise involving young and pretty women with little or no clothing when promoted to an audience of 100% Catholic priests.
The more targeted it becomes, the less it actually is SPAM. Spam that actually benefits me is not really annoying, and I will forgive the fact that I did not ask for it. The more spam moves into the visible space, the more relevant it needs to become, or the easier it is to detect automatically without a human ever seeing it. The latest blog spam plug-ins are very good proof of this.
“Learning” email spam filters also work extremely efficiently and become almost 100% accurate over time. Evidently it is not currently feasible for the major search engines to use the same principles to solve their spam problem. If a message is relevant, the spam filter will not catch it, but then it is not really spam anymore either.
If spam must become more relevant and closer to good content, it must become less spam-like in nature. It is already possible today to detect spam that is too far off topic. Very efficient filters could be developed that work on the principles of existing blog comment/trackback spam filters and email spam filters and remove obvious spam automatically.
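For readers who have never looked inside a "learning" filter, the principle fits in a few lines. This is a minimal naive Bayes sketch in Python, assuming simple whitespace tokenization and a handful of made-up training texts. It is not the code of any existing plug-in, and the class name `NaiveBayesSpamFilter` is hypothetical.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Tiny learning filter: counts words in examples labeled spam or ham."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def spam_score(self, text):
        """Return log-odds of spam; a positive value means 'more likely spam'."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        score = math.log((self.doc_counts["spam"] + 1) / (self.doc_counts["ham"] + 1))
        for label, sign in (("spam", 1), ("ham", -1)):
            total = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Laplace smoothing so unseen words do not produce log(0).
                p = (self.word_counts[label][word] + 1) / (total + len(vocab) + 1)
                score += sign * math.log(p)
        return score


f = NaiveBayesSpamFilter()
f.train("cheap pills buy now", "spam")
f.train("buy cheap watches now", "spam")
f.train("wikipedia article about history of rome", "ham")
f.train("edit the article and cite sources", "ham")

print(f.spam_score("buy cheap pills") > 0)           # scored as spam
print(f.spam_score("cite the article sources") > 0)  # scored as legitimate
```

The filter only knows what off-topic text looks like compared to what belongs on the site. A spammer who wants to get past it has to sound like the site's own content, which is the point made above.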
Those filters could be developed right now, and they would also help in the current situation, by the way. If content remains in the wiki after all that, its validation will happen on a very different premise than today. It would become a very healthy process in my opinion and would probably increase the popularity of Wikipedia.
Search engines are trying to get there. I am absolutely convinced of that. They don’t have a practical solution for it yet, but why should we make their lives easier so that they have to work less hard on that solution?
What are your thoughts about this?