Smartphones put the world at our fingertips. People have questions that need answering, and services or products they need to find.

All of these things are just a search away, and we’ve now seen a seismic shift from traditional search to voice search and voice assistants.

Statistically, voice search and assistants are not something that enterprise marketers can ignore because:

Almost half of U.S. internet users (48.7%) will use voice assistants, according to an eMarketer forecast.

54% of consumers are leaning towards voice technology in the future.

49% of U.S. consumers use voice-enabled searches for local services.

Voice optimization at scale is what every business should be doing. For enterprises, the challenge is scale due to the wealth of content assets they control.

In this guide, we’ll take a look at specific tactics and optimizations that will support your voice strategy, including schema markup, keyword research, site speed, FAQs, Google Actions, and more.

Here’s how to begin optimizing for voice searches, with a focus on enterprises.

Voice Optimization 101

Create Content And Voice Search Guidelines

Marketing teams should sit down with the content team, or send written guidelines, to explain the importance of voice search optimization and the keywords and protocols writers should follow.

Enterprises should have SEO governance in place already.

However, you’ll need to revise your existing governance and protocols for voice search. In fact, you want to add entire sections that focus primarily on voice.

Why?

Content creators and teams are bound to make mistakes.

It’s up to your protocols to find issues with content by performing thorough content checks.

Analyzing content before it’s published should be part of your processes already.

If it’s not, you can add in:

Thorough content review before posting.

Optimization analysis.

Comparing content to researched keywords and questions.
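As a rough illustration of the last check, the sketch below flags researched keywords or questions that never appear in a draft. The function name `missing_targets` and the sample data are hypothetical, and plain substring matching is an assumption; a real review process would likely need fuzzier matching.

```python
import re


def missing_targets(draft: str, targets: list[str]) -> list[str]:
    """Return the target keywords or questions that never appear in the draft."""
    # Lowercase and collapse whitespace so formatting differences do not hide a match.
    normalized = re.sub(r"\s+", " ", draft.lower())
    missing = []
    for target in targets:
        needle = re.sub(r"\s+", " ", target.lower().strip())
        if needle not in normalized:
            missing.append(target)
    return missing


# Hypothetical draft and target list, for illustration only.
draft = "How much do XYZ products cost? Pricing starts with the basic plan."
targets = ["how much do XYZ products cost", "how do I fix XYZ problem"]
print(missing_targets(draft, targets))
```

A check like this could run before publishing so editors see at a glance which researched questions a piece still fails to address.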

Guidelines are a key part of every aspect of enterprise marketing because team members can come and go so often.

Redefine Your Keyword Research To Incorporate Long-Tail Keywords

Here’s some good news: Assistants are smarter than ever before. Today’s voice assistants can understand a person’s voice even with:

Background noise.

Diverse accents.

Dialects.

Hyper-personalization is prominent in the way assistants respond to users, which means enterprises must gather as much data and information about their ideal target market as possible.

You have to go the extra mile to understand your audience and their needs to optimize for voice.

For voice assistants, you have to push your SEO further because, instead of typing simple queries, people are asking voice assistants complex questions, just as they would ask a friend.

How?

By adding in more of the long-tail keywords that have long been neglected at the enterprise level.

Long-tail keywords often have lower search volumes and are less of a priority for enterprises that target high-value and high-traffic keywords. However, voice queries are phrased naturally and run longer than one- or two-word phrases.

Your pages need to answer questions (just like featured snippets do) and should include:

How do I use XYZ product?

How much do XYZ products cost?

How do I fix XYZ problem?

Where.

Who.

What.

Etc.

People using search are asking questions, and you need to answer them. Redefine your keyword research process to include more long-tail keywords and question keywords.

Create processes and procedures for SEO teams – internal and external – to incorporate questions into your current content creation process.
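One way to fold this into an existing keyword process is to tag each researched keyword as a question, a long-tail phrase, or a short head term. The sketch below is only an illustration: the question-word list and the four-word long-tail threshold are assumptions, not an industry standard, and `classify_keyword` is a hypothetical helper.

```python
# Assumed list of words that typically open a spoken question.
QUESTION_WORDS = {"how", "what", "where", "who", "why", "when", "which", "can", "does", "is"}


def classify_keyword(keyword: str, long_tail_min_words: int = 4) -> str:
    """Label a keyword as 'question', 'long-tail', or 'head'."""
    words = keyword.lower().split()
    if words and words[0] in QUESTION_WORDS:
        return "question"
    # The word-count threshold is an assumption; tune it to your own keyword data.
    if len(words) >= long_tail_min_words:
        return "long-tail"
    return "head"


for kw in ["how do I fix XYZ problem", "affordable XYZ products for small kitchens", "XYZ products"]:
    print(kw, "->", classify_keyword(kw))
```

Running a keyword export through a filter like this shows how much of the current list is conversational and where the gaps are.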

Multimodal Search Optimization And The Rise Of Visual Search

Visual search isn’t exactly new. You take a picture, pop it into Google Lens, and it tries to find a match for you.

For example, that adorable dog bed that you saw at your friend’s house? You can take a photo and search for the exact item on Google.

But, at the I/O developer conference in 2024, Google added something new to Google Lens.

You shoot a video.

Ask questions in the video.

Get an answer back.

Users can take a video of their broken toilet and ask why the flange is stuck and what they need to fix it – all in video format. Google will now analyze the video and respond to you.

Vision language models (VLMs) are advancing, but enterprises will need to focus on other multimodal searches, too:

Text-to-image search.

Image-to-text search.

Image-to-image search.

Envision an enterprise for high-end luxury apparel.

A user uploads an image of a floral pattern and adds the query, [floral dress in this style but with blue roses], and the returned results may show your product.

Clear visuals with the proper description optimization may help the enterprise rank for this type of multimodal visual search.

Optimize For Site Speed And Mobile Experience

Voice searches come primarily from mobile and assistant devices.

Every enterprise must optimize heavily for mobile with:

Responsive designs.

Fast site speeds.

Your team should periodically run Google PageSpeed Insights to find issues slowing down your site and to improve load time.

Multimedia optimization is crucial, especially with the rise of multimodal search. Compressing images and videos, implementing lazy loading, and browser caching are all things that you can begin doing today to improve the mobile experience on your site(s).
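For teams that want to check many pages on a schedule rather than by hand, PageSpeed Insights also exposes an HTTP API. The sketch below only builds the request URLs and does not call the service; the endpoint and the `url` and `strategy` parameters reflect the v5 API as commonly documented, but confirm them (and any API key requirements) against Google's reference before relying on this. The page list is hypothetical.

```python
from urllib.parse import urlencode

# PageSpeed Insights v5 endpoint (verify against Google's API reference).
PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def psi_request_url(page_url: str, strategy: str = "mobile") -> str:
    """Build a PageSpeed Insights request URL for one page."""
    if strategy not in ("mobile", "desktop"):
        raise ValueError("strategy must be 'mobile' or 'desktop'")
    query = urlencode({"url": page_url, "strategy": strategy})
    return f"{PSI_ENDPOINT}?{query}"


# Hypothetical list of pages to audit.
pages = ["https://www.example.com/", "https://www.example.com/products"]
for page in pages:
    print(psi_request_url(page))
```

Defaulting to the mobile strategy matches the point above that voice searches come primarily from mobile and assistant devices.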

See 10 Enterprise Page Speed Optimizations & Implementation Tips to learn more.

Optimize For Local Search To Boost Business

Local and regional optimizations are huge for businesses that operate locally.

Over 50% of people search for local businesses via voice search.

For example:

Where is the nearest Subway near me?

What grocery stores are open nearby?

Where is the closest pharmacy?

You’ll want to review the enterprise’s Google and other local listings.

Listings should always include the company’s operating hours, short blurbs, and photos.

Complete listings make it easier for searchers to reach out to your business or visit it in person.

Terms may include “near me” phrases, or they can be specific, such as [car manufacturers in Detroit].

One tip crucial to an enterprise’s success when optimizing for local is to account for regional or area slang.

Your research teams should understand local slang and dialects that may be used in a search.

For example, [where can I get the best soda in Boston] will change to [where can I get the best pop in Ohio] due to regional slang.

Internal teams should help you create these distinctions before moving into new markets to help content creation and search engine optimization teams maximize local voice search potential.
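These regional distinctions can be captured in a simple lookup that content and SEO teams share. The sketch below expands one query template into regional variants; the `REGIONAL_TERMS` table holds only the soda/pop example from above, and its structure is an assumption about how a team might store such research.

```python
# Hypothetical research table: concept -> region -> local term.
REGIONAL_TERMS = {
    "soft drink": {"Boston": "soda", "Ohio": "pop"},
}


def regional_queries(template: str, concept: str, terms: dict = REGIONAL_TERMS) -> dict[str, str]:
    """Fill a query template with each region's local term for a concept."""
    return {
        region: template.format(term=term, region=region)
        for region, term in terms[concept].items()
    }


variants = regional_queries("where can I get the best {term} in {region}", "soft drink")
for region, query in variants.items():
    print(region, "->", query)
```

Each new market then only requires adding rows to the table, not rewriting the keyword list.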

In the last few years, the number of voice assistants in use has nearly doubled. From your iPhone and Android to Alexa and other platforms, assistants are everywhere.

Personal preferences are taken into consideration, as well as your location across all three types of searches:

Discovery: Find a plumber in Atlanta, Georgia.

Direct: Call Bill’s 24/7 Plumbing and Septic.

Knowledge: Why is my water turning brown in Atlanta?

Conversational phrasing must be considered across all enterprise offices to help capture as much local search traffic as possible.

Enterprises must do more for voice searches than just claim and optimize their listings on Google Business Profile, Apple Business Connect, Yelp, and other local directories.

You need to focus on long-tail keywords, refine your keyword research even further, and try to add context to your content.

Master Schema Markup To Add Content Context

Leveraging schema is crucial to help search engines make sense of an enterprise’s site content. Review and incorporate schema markup guidelines to help boost voice search.

A few tips that can help you master schema are:

Start using Google’s Speakable Schema (beta) for sections of your text that are best for Google Assistant and voice search.

Use analytics to help understand keywords and phrases customers are using.

Find speakable snippets in new and old content to add schema.

Think of your content in a conversational way to enhance context.

Schema markup, when used properly, can help add context to the content on each site and allow for greater voice search potential.

Speakable Schema lets you fine-tune your control over how voice assistants highlight your content. For example:

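{
  "@context": "https://schema.org/",
  "@type": "WebPage",
  "name": "Ludwig’s homepage",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": ["intro", "summary"]
  },
  "url": "http://www.example.com"
}

Using JSON-LD, you can add the speakable structured data to make your intro and summary speakable. Note that the property name is lowercase (speakable) and that each cssSelector entry is a standard CSS selector, so if your intro and summary are marked up with classes, write them as .intro and .summary. You can adjust this for any cssSelector you like.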
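At enterprise scale, hand-writing this markup for every page is impractical, so it may help to generate it. The following is a minimal sketch, assuming each page template exposes its speakable sections through known CSS classes; `speakable_jsonld` is a hypothetical helper, and the `.intro` and `.summary` class selectors are assumptions about the page markup.

```python
import json


def speakable_jsonld(name: str, url: str, css_selectors: list[str]) -> str:
    """Return WebPage JSON-LD with a SpeakableSpecification for the given selectors."""
    data = {
        "@context": "https://schema.org/",
        "@type": "WebPage",
        "name": name,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": css_selectors,
        },
        "url": url,
    }
    # ensure_ascii=False keeps characters such as curly apostrophes readable.
    return json.dumps(data, indent=2, ensure_ascii=False)


print(speakable_jsonld("Ludwig’s homepage", "http://www.example.com", [".intro", ".summary"]))
```

The output would go inside a script tag of type application/ld+json on the page it describes.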

Enterprises are also finding greater success with voice when adding structured data for:

Product information.

Pricing.

Availability.

As an enterprise, a bump of 1% to 2% traffic from search can add significant revenue to your bottom line. Schema.org has examples of how to use schema for ecommerce using microdata, RDFa, structure, and JSON-LD.
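To make the product, pricing, and availability points concrete, here is a small sketch that emits schema.org Product markup with a nested Offer. The product name and price are made up, `product_jsonld` is a hypothetical helper, and a real catalog would likely carry more properties (images, SKUs, reviews) than shown here.

```python
import json


def product_jsonld(name: str, price: float, currency: str, in_stock: bool) -> str:
    """Return schema.org Product JSON-LD covering name, price, and availability."""
    availability = "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    data = {
        "@context": "https://schema.org/",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",  # formatted as a string with two decimals
            "priceCurrency": currency,
            "availability": availability,
        },
    }
    return json.dumps(data, indent=2)


# Hypothetical product, for illustration only.
print(product_jsonld("XYZ Widget", 19.99, "USD", in_stock=True))
```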

Add FAQ Sections Into Key Pages

Remember how you need to add questions to your keyword research?

It can be challenging to find ways to add questions to pages without interrupting the natural flow of your content.

How can you overcome this? Frequently asked questions.

FAQs can add immense value to your pages and help you start improving your voice search optimization.

One way to begin incorporating this is to:

Perform a full content audit on the site(s).

Identify pages and blogs where you can answer questions.

Start adding FAQs to the most important pages and pages with the most potential.

Since you’re optimizing for voice search, answering questions in conversational tones is crucial.
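FAQ content can also be described with structured data using schema.org's FAQPage type. The sketch below turns question and answer pairs into that markup; the sample question and answer are placeholders, `faq_jsonld` is a hypothetical helper, and whether a given search engine surfaces this markup is outside what the code can guarantee.

```python
import json


def faq_jsonld(faqs: list[tuple[str, str]]) -> str:
    """Return schema.org FAQPage JSON-LD for (question, answer) pairs."""
    data = {
        "@context": "https://schema.org/",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return json.dumps(data, indent=2)


# Placeholder FAQ entry written in a conversational tone.
faqs = [("How do I use XYZ product?", "Just plug it in and press the power button.")]
print(faq_jsonld(faqs))
```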

Begin The Transition To Conversational Language

Content creators have heard about tone and consistency for decades.

“Speak the customer’s language” is often repeated across industries.

However, when dealing with voice search, a shift toward a conversational tone is emerging.

As it turns out, the stuffy “business tone” isn’t how most people use their Google Assistant or Amazon Echo.

You’ll need to ensure content teams are on board with these changes.

A quick meeting to reinforce conversational tones and maybe an update to briefs sent to writers can help drastically.

An excellent way to adjust content to be conversational is to:

Have editors review all content.

Read content aloud.

Small changes, such as adding in spoken words and slang where you can, make a world of difference when trying to create more conversational content.

While there will always be traditional typed searches, enterprises and marketers should focus on the possibilities that voice search has to offer.

