Understanding Semantic Search and SEO

A framework for semantic analytic centric content

Regardless of what I’ve said about the whole ‘LSI’ and Google crap in the past, one thing worth bearing in mind is that all modern search engines do semantic analysis to one extent or another. It may be phrase based, using PLSA, HTMM or a hybrid. That part is really inconsequential. What is important is that we can take heart in the fact that semantically flexible content will do a better job of targeting the page in question.

First off, a common confusion worth clearing up: semantic search is NOT the semantic web. This is one area that gets conflated all too often. We’re not talking about tagging. We’re talking about the probabilistic/statistical approach to understanding the concepts/meanings of a web page/document.

The next thing to try and get away from is the idea that only synonyms play a role within these concepts.

Building out concepts

All too often I see people talking only about stemming and synonyms. That’s only part of the picture. We also want to work on using terms that build out the theme/concept, which we might call ‘supporting terms’. That means we can start with variants such as:

  • Automobiles
  • Cars
  • Autos
  • Vehicle
  • Auto
  • Car

But don’t limit yourself to delivering only those signals. We want to go further and create a deeper theme for that space, including supporting terms such as:

  • Engine
  • Garage
  • Tires
  • Hood
  • Spark plug
  • Keys
  • High Performance

And phrases related to or containing them.

As we can see, those aren’t synonyms but supporting words or phrases that further establish the semantic concepts on the page. But we’d likely be more specific in our targeting with additional elements such as:

  • Reviews
  • Sales
  • Rental
  • Insurance
  • Prices
  • Specifications

We can look at transactional and informational modifiers as well. This helps define the type of page that we have and the type of queries we are targeting. For another example, here are some possible terms for ‘space shuttle’:

  • space
  • shuttle
  • mission
  • astronauts
  • launch
  • station
  • crew
  • nasa
  • satellite
  • earth

Getting the picture here?
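
If it helps to see that as data, here is a minimal Python sketch of one way to hold a ‘semantic basket’ during keyword research. The `SemanticBasket` class and its field names are made up for illustration and are not taken from any tool; the terms are the car examples from the lists above.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticBasket:
    """Hypothetical container for one page's keyword theme."""
    core_term: str
    variants: list[str] = field(default_factory=list)    # synonyms, plurals, stems
    supporting: list[str] = field(default_factory=list)  # terms that build out the concept
    modifiers: list[str] = field(default_factory=list)   # transactional/informational intent

    def target_phrases(self) -> list[str]:
        """Pair the core term and each variant with every modifier."""
        names = [self.core_term] + self.variants
        return [f"{name} {mod}" for name in names for mod in self.modifiers]


basket = SemanticBasket(
    core_term="car",
    variants=["cars", "auto", "autos", "automobiles", "vehicle"],
    supporting=["engine", "garage", "tires", "hood", "spark plug", "keys"],
    modifiers=["reviews", "sales", "rental", "insurance", "prices"],
)

print(basket.target_phrases()[:3])  # ['car reviews', 'car sales', 'car rental']
```

The supporting terms are deliberately kept out of `target_phrases()`: they are there to frame the content, not to be targeted as queries in their own right.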

What we’re looking to do is create a strong semantic theme of what the page is about through the words we’re using to frame it. If one searches for ‘Jaguar’, the engine has a few meanings to choose from:

  • A Car
  • An Animal
  • Football team (US)
  • Computer Application

By using semantic themes you will enable the search engine to better understand the concepts on your page. Remember, search engines have about a 6th grade reading/understanding level. We need to play nice with them.
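
As a toy illustration of why supporting terms matter for a query like ‘Jaguar’, the sketch below picks the sense whose supporting terms overlap the page text the most. The term sets in `SENSES` and the `guess_sense` function are assumptions made up for this example; no search engine is claimed to work this simply.

```python
import re

# Hypothetical supporting-term sets for three senses of 'jaguar'.
SENSES = {
    "car": {"engine", "sedan", "dealer", "horsepower", "lease"},
    "animal": {"fur", "cat", "jungle", "prey", "habitat"},
    "football team": {"nfl", "jacksonville", "quarterback", "season", "stadium"},
}


def guess_sense(text: str) -> str:
    """Return the sense whose supporting terms overlap the text the most."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    overlap = {sense: len(terms & words) for sense, terms in SENSES.items()}
    return max(overlap, key=overlap.get)


page = "The jaguar is a big cat with spotted fur that stalks prey in the jungle."
print(guess_sense(page))  # animal
```

Notice the page never needs a synonym for ‘jaguar’; the words around it do the work.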

Elements search engines may look at

The interesting part about using semantic signals/approaches in search is they can give a wealth of information by analysis of such elements as;

  • TITLE of page
  • Content of page (phrase ratios)
  • Prominence factors (Headings, italics, lists)
  • Anchor of inbound links
  • TITLE and content of pages linking in
  • Spam detection
  • Duplicate content detection
  • Personalization

Each of these can be weighted/dampened to give an overall page relevance score which can then be sent to the rest of the processing system. This scoring is based on the current seed set of documents in the system, which has a learning mechanism to continually refine the algorithms.
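
To make ‘weighted/dampened’ a little more concrete, here is a rough sketch that combines term counts from a few page elements into one score. The weights in `WEIGHTS`, the element names and the log dampening are all assumptions chosen for illustration; real engines do not publish theirs.

```python
import math

# Hypothetical weights per page element; not taken from any real engine.
WEIGHTS = {
    "title": 3.0,
    "headings": 2.0,
    "body": 1.0,
    "inbound_anchor": 2.5,
}


def relevance_score(term_counts: dict[str, int]) -> float:
    """Weighted sum of log-dampened term counts per page element."""
    score = 0.0
    for element, count in term_counts.items():
        weight = WEIGHTS.get(element, 0.0)
        # log1p dampens repetition: the 20th mention adds far less than the 1st.
        score += weight * math.log1p(count)
    return score


natural = relevance_score({"title": 1, "headings": 2, "body": 6, "inbound_anchor": 3})
stuffed = relevance_score({"title": 1, "headings": 2, "body": 60, "inbound_anchor": 3})
print(round(natural, 2), round(stuffed, 2))
```

With dampening in place, ten times as many body mentions buys only a modest bump in the score, which is the whole point of dampening.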

Ranking the pages

Of course the obvious question remains: how are these signals used? In the more common implementations out there, machine learning is the order of the day. The search engine would start with a seed set of documents that satisfy a given term/phrase ratio or similarity measure, and compare other documents to those for future scoring. Then, using various signals such as query and click data, it can further refine the seed set on the fly.
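
A very stripped-down way to picture the seed-set comparison is cosine similarity between bag-of-words term vectors. This is a sketch under that assumption only: the seed documents are invented, and cosine similarity is a stand-in for whatever measure a given engine actually uses.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Bag-of-words term frequencies for a document."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical seed set: documents already judged relevant for 'space shuttle'.
seed_docs = [
    "nasa space shuttle launch carried the crew to the station",
    "shuttle mission astronauts returned to earth after satellite repair",
]
seed_profile = sum((term_vector(d) for d in seed_docs), Counter())

on_topic = term_vector("the shuttle crew prepared for launch on a nasa mission")
off_topic = term_vector("airport shuttle bus schedule and hotel parking prices")

print(cosine(on_topic, seed_profile) > cosine(off_topic, seed_profile))  # True
```

Both test pages contain ‘shuttle’, but only the one using the surrounding concept terms scores close to the seed set.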

This would ultimately be combined with other relevance scoring mechanisms and core rankings, set to whatever threshold they deem necessary to deliver the end results. While these signals may not be enough to garner great rankings on their own, they are likely useful to those playing grab and hold via QDF (query deserves freshness). Any non-link velocity related signal would be at a premium in such cases.

 

Putting it to use

The first thing we want to do is expand on our keyword research to provide not only primary and secondary targets, but also get into semantic support terms and even semantic baskets. This will be endlessly useful for content development, site audits, link building and more. Given the many signals that can be had, having these concepts integrated into the entire SEO program can be invaluable.

When you do this at the beginning (during the KW research) it can be easily fed into every other aspect of the SEO program.

There really are no tools for this, nor can I imagine one that would work (although I did talk to the WordStream gang about it recently). It still is an art more than a science. You see, we don’t know the relevance scoring for the seed set, and the SERPs are inclusive of other ranking factors. I have found it an interesting exercise to measure term occurrences on pages ranking in the top 10 with the least amount of link juice/authority. While not perfect, it often turns up concept-rich pages.
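
For anyone who wants to try that exercise by hand, a minimal sketch is below. It counts how many pages each term appears on rather than raw repetitions. The `pages` texts and the `STOPWORDS` list are invented stand-ins; fetching real ranking pages and judging their link authority is left out entirely.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the text of pages ranking in the top 10.
pages = [
    "Compare car insurance prices and read reviews before you visit the garage.",
    "Our garage checks the engine, tires and spark plug on every car.",
    "High performance engine specifications and prices for every car we sell.",
]

# Tiny illustrative stopword list; a real run would need a fuller one.
STOPWORDS = {"the", "and", "on", "for", "you", "we", "our", "before", "every"}


def document_frequency(texts: list[str]) -> Counter:
    """Count how many pages each term appears on (not raw repetitions)."""
    df = Counter()
    for text in texts:
        words = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        df.update(words)
    return df


for term, n_pages in document_frequency(pages).most_common(5):
    print(f"{term}: {n_pages} of {len(pages)} pages")
```

Terms that show up across most of the low-authority rankers are reasonable candidates for the supporting-terms list.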

Getting into the mindset

As with many things in this thing of ours, it is something you need to get a feel for in the query space in question. What is important is getting into the habit of watching how you’re framing the content. Build around the core term with not only modifiers (geo-local, informational, transactional, plurals) but also with related terms that expand on the concepts.

Now, before I leave you, I dug up a ton of tools, posts and even seminars to get you into the groove. Get a feel for how search engineers think and you will find getting actionable ideas all the more efficient. I hope you got something from all this; it is an area not discussed often enough. Enjoy!

/end adventure

 

Tools to play with

  • Aaron’s tool has some interesting ‘Phrase Match’ data, but it is marginally effective for this exercise and would need sorting.
  • KW Map is interesting, but also is marginally effective and has no export option to speak of. Close, but no cigar.
  • Vseo Tool – Also not the greatest, but certainly presents some reasonable semantic concepts and can be exported.
  • WordStream – also comes close (I am helping develop a tool, though), but there is nothing by default to really group deeper semantic relations for our purposes. Emails the list to you for sorting purposes.
  • Nichebot – these guys almost have it with the poorly named ‘LSI’ tool. This produces probably some of the best lists for our purposes. Fully exportable for sorting.

 

Googly Tools

  • Keyword Tool – about as use(less?) as the others. It has some insights, but not deep enough for this exercise, although it is easier to sort and does support downloads.
  • Search-based Keyword Tool – not as good as the above KW tool in the testing I did recently for this. It does support exporting though.
  • Google Sets – this one isn’t obvious right away, but handy. If you look at the ‘description’ element, you can start to see some supporting terms that might come in handy (since Googly is recommending them). Problem is that it doesn’t give results for granular/obscure terms (also try Google Squared).

 

Semantic relations

  • Onelook reverse dictionary – returns the list of related terms, each word linked to its definition (more tricks from Ann here) – does a reasonable job but doesn’t have an export function.
  • Reference.com reverse dictionary – clusters related terms into groups by their meaning and gives the actual definition for each cluster: barely usable.
  • Rhyme Zone – define your term and find rhymes, synonyms and antonyms. Using the ‘Find related terms’ option you can get some pretty usable lists, unfortunately they are not exportable.

 

David Harry is an SEO and IR geek that runs Reliable SEO, blogs on the Fire Horse Trail and is the head geek at the SEO Training Dojo.

24 thoughts on “Understanding Semantic Search and SEO”

  1. Fantastic analysis! I'm surprised though, no mention of the new Google Wonder Wheel? I think this could be a useful tool/resource in determining which words/phrases Google thinks/believes are related/similar.

  2. Hi Dan, actually, as I touched on, there really were no good tools out there for this type of analysis. I did though have the guys from WordStream hosting an SEO Dojo chat last Friday and we did discuss building such a tool… and they've asked me to pitch in. I am very interested as nothing cuts it at the moment. I'd like to see the tool not only create suggestions for semantic baskets, but do page analysis for top rankers in a given query space for phrase ratios etc….

    I've been doing this the hard way up until now – hopefully we can get something rocking on that end of things.

  3. David, to your mind would Wordtracker's Lateral Stemming Thesaurus paid tool qualify as a true semantic research tool? (No, I don't market WT :) Start with “catering” and the results are not just stemmed synonyms like “food service.” The results are “weddings,” “events,” “delivery” and “box lunches.”

    Thanks for the informational post dude.

  4. The name alone gives me fits. lol. Stemming is stuff like plurals and a thesaurus is generally for synonyms. I searched high and low for something that could pick out related concepts or semantic building blocks like a search engine would. Once more with the jaguar example, the terms 'fur' or 'short haired' (big cat) go towards defining the animal over the car or others. Much like a content piece targeting the 'White House' might have phrases like 'President of the United States' or 'President Obama' or 'Oval Office'. Those are terms/phrases that build out the concepts being related. At face value 'white house', to a computer, could mean a general descriptive term for a home. It is the related phrases and semantic building blocks that give greater meaning to it. As such, there aren't really tools I've seen that do a good job of this.

    I find most tools tend to simply add modifiers to the core term, instead of really coming up with supporting terms/concepts… Ya know?

  5. This is great! Thanks for sharing. This clears up some issues that I've been wondering about concerning search semantics and keywords and how they work.

    1. hmmm…. let's not forget that the relevance of the site/page the links come from is also important. Meaning there is no bloody point in targeting the links here on SEJ bro. Just post yer name – comment links from an SEM site aren't really going to help all that much…. hehe

      1. theGypsy,
        Even though he might not get any link juice from it, he *may* get some traffic and after all that is what it is all about, traffic, not PageRank.

  6. I completely agree with everything you’ve said that’s stretching the SEO def a bit beyond its traditional scope. Thanks for sharing this.

  7. It's really a great post; you cleared up all the basic and advanced concepts. I have one question about semantics.
    Say I use this in the title: Swimming, Teams, Skills, Classes.

    Will this title help me rank for Swimming Team, Swimming Skills and Swimming Classes, or not?

    Can you please help me with this?

    Thanks

    1. To be honest, while we can target more than one term per page, I'd look to keep the concepts somewhat more targeted. Meaning I'd have a separate page for 'Swimming Team' and 'Swimming Classes' as they are somewhat independent topics semantically. Now 'Swimming Skills' could certainly work well with 'Swimming Classes' but I'd make sure via KW research that 'Swimming Skills' is all that important a target (to me, I'd think not).

      The title itself is also a little dry. I'd rather see 'Improve your skills with our swimming classes' type approach. One does also need to consider the SERP CTR.

      Anyway, you want to try and sort semantically related concepts. In this case, I'd be thinking two pages, not one for all the terms. It is hard to say without knowing the site and market though. Each situation is truly unique.

  8. I recently touched on this subject on my blog, here is a quote:

    Near Topics
    Not only do you need supporting documents specifically about the subject but you should also post information about subjects “near” the topic.

    For instance, if you are looking to achieve authority status about cars you should also write about trucks, motorcycles, optimal tire pressure at highway speeds, car crash investigation information, driving techniques, car seat position and its effect on truck drivers' lower backs, i.e. things about vehicles but not necessarily about cars specifically.

  9. Nice article, with, just the way I like it, some really handy tools for some people. A quick hello to Stéphane Arnoult, whom I just spotted in the comments.

  10. This is great! Thanks for sharing. This clears up some issues that I've been wondering about concerning search semantics and keywords and how they work.

    1. I would keep in mind, once more, that this is a general overview for folks to better conceptualize things. There are so many variants of semantic analysis and related weighting that we don't want the secret sauce, just a better understanding for content programs and KW generation. I hope it helps.

  11. I'm new to SEO so I have a question: does placing anchor text links hold any keyword rank value on search engines if they're nofollow?
    And am I correct when I say that only a dofollow link affects a site's ranking?

    1. Yes, dofollow affects a site's ranking. If you submit your site to 100 or more dofollow link sites, then obviously your site's rank will increase. Nofollow affects keyword ranking; it passes only PageRank value.