Google, Neven Vision & Image Recognition

SEJ STAFF Loren Baker

August 15, 2006

SEJ STAFF Loren Baker Founder at Foundation Digital


Google today acquired Neven Vision, a key player in face and image recognition biometrics. The new features Google could roll out with Neven Vision under its belt seem endless. If I had to compare the opportunity to any other acquisition in Google’s history, it would be Keyhole (i.e., Google Earth and Google Maps).

Adrian Graham, Picasa Product Manager, made the Neven Vision acquisition announcement on the Google Blog:

Neven Vision comes to Google with deep technology and expertise around automatically extracting information from a photo. It could be as simple as detecting whether or not a photo contains a person, or, one day, as complex as recognizing people, places, and objects. This technology just may make it a lot easier for you to organize and find the photos you care about. We don’t have any specific features to show off today, but we’re looking forward to having more to share with you soon.

Sure, image, face, place and object recognition could change the entire Google Picasa photo organization experience, and with automatic tagging Picasa may even become a serious competitor to Yahoo’s Flickr or Yahoo Photos.

There is, however, much more to this acquisition than photo album tagging and photo definition technology.

Patents, Video & Mobile Technology

First, let’s look at the patents which Neven Vision holds. The NevenVision.com site has been taken down and is serving Google messages, but a glimpse at its pre-existing pages in Google Cache rewards the reader with some nice background information on the company and its patents:

Patent Number & Description

* EP1072018 Wavelet-Based Facial Motion Capture for Avatar Animation

* 1072014 Face Recognition from Video Images

* 218457 Face Recognition from Video Images

* 218458 Wavelet-Based Facial Motion Capture for Avatar Animation

* 6714661 Method & System for Customizing Facial Feature Tracking Using Precise Landmark

* 6222939 Labeled Bunch Graphs for Image Analysis (EYEM1160/ NE01)

* 6356659 Labeled Bunch Graphs for Image Analysis

* 6563950 Labeled Bunch Graphs for Image Analysis

* 6466695 Procedure for Automatic Analysis of Images & Image Sequences Based on Two Dimensional Shape Primitives

* 6272231 Wavelet-Based Facial Motion Capture for Avatar Animation

* 6580811 Wavelet-Based Facial Motion Capture for Avatar Animation

* 6301370 Face Recognition from Video Images

As the patent list shows, Neven Vision is not limited to recognizing faces or objects in still images. It also holds a nice selection of video recognition patents, centered on Face Recognition from Video Images, and is applying this technology to the mobile phone.

Just as I thought I had some kind of patent scoop on the company, I clicked over to SEO by the Sea and saw that Bill Slawski had already listed these patents along with excerpts from their abstracts. So, instead of going down the same path, here are some interesting patent applications and abstracts highlighted by Mr. Slawski:

Image base inquiry system for search engines for mobile telephones with integrated camera
Invented by Hartmut Neven, Sr.
US Patent Application 20050185060
Published August 25, 2005
Filed February 20, 2004
Abstract

An increasing number of mobile telephones and computers are being equipped with a camera. Thus, instead of simple text strings, it is also possible to send images as queries to search engines or databases. Moreover, advances in image recognition allow a greater degree of automated recognition of objects, strings of letters, or symbols in digital images. This makes it possible to convert the graphical information into a symbolic format, for example, plain text, in order to then access information about the object shown.

Image-based search engine for mobile phones with camera
Invented by Hartmut Neven, Sr. and Hartmut Neven
US Patent Application 20060012677
Published January 19, 2006
Filed May 13, 2005
Abstract

An image-based information retrieval system is disclosed that includes a mobile telephone and a remote server. The mobile telephone has a built-in camera and a communication link for transmitting an image from the built-in camera to the remote server. The remote server has an optical character recognition engine for generating a first confidence value based on an image from the mobile telephone, an object recognition engine for generating a second confidence value based on an image from the mobile telephone, a face recognition engine for generating a third confidence value based on an image from the mobile telephone, and an integrator module for receiving the first, second, and third confidence values and generating a recognition output.
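The abstract does not say how the integrator module combines the three confidence values. As a minimal sketch, assume each engine returns a label with a confidence between 0 and 1, and that the integrator simply picks the most confident engine above a threshold. Both are my assumptions, and the names `EngineResult` and `integrate` are hypothetical, not taken from the patent application:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EngineResult:
    engine: str        # "ocr", "object", or "face"
    label: str         # what the engine thinks the image shows
    confidence: float  # assumed to lie in [0, 1]


def integrate(results: list[EngineResult], threshold: float = 0.5) -> Optional[EngineResult]:
    """Return the most confident result, or None if nothing clears the threshold.

    The max-confidence rule is an illustrative assumption, not the
    method described in the patent application.
    """
    best = max(results, key=lambda r: r.confidence, default=None)
    if best is None or best.confidence < threshold:
        return None
    return best


# Made-up confidence values for a photo of a storefront.
sample = [
    EngineResult("ocr", "COFFEE SHOP", 0.35),
    EngineResult("object", "storefront", 0.80),
    EngineResult("face", "unknown person", 0.10),
]
print(integrate(sample))
```

A real integrator could just as well weight the engines differently or combine their outputs; the point is only that three independent scores feed one decision.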

Single image based multi-biometric system and method
Invented by Hartwig Adam, Hartmut Neven, and Johannes B. Steffens
US Patent Application 20060050933
Published March 9, 2006
Filed June 21, 2005
Abstract

This disclosure describes methods to integrate face, skin and iris recognition to provide a biometric system with unprecedented level of accuracy for identifying individuals. A compelling feature of this approach is that it only requires a single digital image depicting a human face as source data.

Besides patents and patent applications, perhaps a better way to see what Google has brought under its wing is to look at the tangible products, businesses and clients using Neven Vision’s technology.

First is iScout, a mobile marketing product that lets users capture an image with their cell phone camera and send it to iScout to receive coupons, enter contests or sweepstakes, or engage in mobile transactions (from the April 6th press release):

For Marketers, iScout can be used for image based mobile marketing campaigns (logos, product packaging, magazine and newspaper ads, posters, billboards, etc.). Using iScout, not only can mobile marketing campaigns display rich multi-media based brand specific content (image, text, audio, and soon video), but the content can also be changed dynamically with subscriber interaction. Subscribers can interact with real world images like magazine or newspaper ads, posters or retail items, and any other commercial image they see by capturing pictures with their camera phones and sending them in to get marketing offers, enter contests, engage in transactions and search for information. The mobile user experience can be dynamically customized based on interaction and intent, creating a unique, content rich user experience.

Neven Vision conducted an intensive search to identify a mobile solution to meet all of the requirements for iScout. During this search, Neven Vision discovered the SmartPath™ Mobile Publishing Solution by Trilibis Mobile. SmartPath enables clients to easily create and manage highly-customizable rich content mobile applications to run across all major platforms, networks and devices. Using SmartPath, Neven Vision has introduced a number of enhancements to iScout, including a more intuitive user interface, the ability to save download files and a multi-lingual menu that can be localized by country.

“Mobile marketing is poised for hyper-growth, with Asian and European markets already generating significant data service revenues for mobile operators,” said Alex Cory, Neven Vision CEO. “Mobile visual search brings a critical element to this market, just as paid search did for the internet. It helps advertisers and brands move from low performance impression based ads to more interactive, real-time approaches to engaging consumers, offering a great opportunity to build a real connection at the consumer’s moment of interest and creating a convenient path for direct response.”

Seems like Google AdWords on buses, billboards and other forms of outdoor advertising is not just a joke among Madison Avenue players anymore.

Imagine Google AdWords technology monitoring the profiles of users and the paths they generally take to the workplace, favorite lunchtime eatery, or commute home; and then optimizing the outdoor or subway advertising to show the most relevant ad to the most relevant group at the most relevant time.

Take a photo of that advertisement with your mobile phone and enter to win a contest, subscribe to a magazine or purchase that last-minute gift for your spouse using Mobile Google Checkout. This is the (speculated) reality which Neven Vision brings to the Googleplex.

Better yet, what if there is not an advertisement to photograph? Neven Vision iScout can be used to simply take an image of an object, recognize that image and then possibly serve comparative shopping information via Froogle or normal Web Search or Image Search results.

For example, say I’m walking through the woods and notice a tick has crawled up my leg and bitten me. I can use my phone (or a wi-fi camera, once wi-fi is widespread) to snap a picture of that rascally arachnid, its markings and its coloring to see whether it is a species that carries Lyme disease. The same method could be used to identify poisonous snakes or spiders, then look up the antidote or treatment, how much time I have, and the nearest hospitals. The possibilities are endless.

Facial Recognition on the Mobile

Neven Vision’s bread and butter product seems to be its fast mobile facial recognition, which is licensed specifically to law enforcement agencies. A 2005 issue of USA Today covered the company’s technology for this niche market:

Neven Vision stands out as the only facial-recognition engine on the market that can run directly on handheld devices such as personal digital assistants, wireless phones and mobile terminals.

The company will use this capability in a new product called Mobile Identifier due out in the coming months. Geared toward law enforcement, Mobile Identifier features an on-board database and image-recognition engine. Results are immediate because officers don’t need to wait for data to be transmitted via wireless connections.

According to Liz Gannes at GigaOM, this Neven Vision mobile technology is currently being used by the LAPD in order to identify gang members.

The question is, will Google bring mobile face recognition to its consumer marketplace? Will one be able to take a picture of someone at the bar, search on Google for him or her, then read that person’s blog, MySpace profile or home address? Chances are that doing so would stir up all kinds of privacy controversies, but the technology to do so is right around the corner (more on this below).

In terms of mobile face recognition, what we’ll possibly see is Google taking advantage of this personal information more so than the end user.

For example, say I take a photo of a cute girl, or a group of people, at a bar. What does Google now know about these people?

* Which bars or restaurants they frequent
* Where that bar is located
* Whether they prefer beer, wine or liquor
* What kind of clothing they purchase and wear
* Whether or not they smoke
* Whether or not they wear eyeglasses
* Who they are connected to in the real world, and what that connection is
* Which college they attended
* Which sports teams they follow
* How much makeup they wear when they go out
* Whether or not they use tanning lotion
* Whether they have a full head of hair, are balding, or are bald

Cross-reference this image recognition information with what Google already knows about its registered and unregistered users, and now we have oodles and oodles of marketing information at our fingertips.

Google Building Facial Recognition Database?

The obvious next question is how Google or Neven Vision would know who we are in the first place. Well, if you have a criminal record or a history of gang association, or if your state openly sells its Department of Motor Vehicles license information, you’re probably in the Neven Vision database already.

But how else can Google learn what our faces look like? One of the reasons Google reportedly targeted a startup called Riya (or at least one reason for the rumors) was its ability to mass-tag photos based on the people in them, their location and so on.

There is a recent theory, however, that Google is building a facial feature database of Google Account owners on its own. Ionut Alex at the Google Operating System blog theorizes that when a Gmail or GTalk user uploads a photo or avatar, Google associates the cropped face with that person, their search and web history, and their account:

Gmail allows you to add pictures for your contacts. If you upload a picture, Gmail will ask you to crop the picture, to separate the face of the person. So Gmail has a database of multiple images for a lot of persons….It’s a very easy way to obtain a database of faces useful for face recognition. Algorithms for detecting and recognizing faces are good, but not good enough, and this is a great way for Google to improve their AI algorithms using the data obtained from its users.

Is there another use for an image database built for facial recognition? On Sunday evening we posted a story on Google Video and Google Personalization describing a whitepaper on using social network information for search engine (and advertising) personalization. Here’s a reminder:

My friend Bill Slawski brought this up during his presentation on Algorithms at Search Engine Strategies. Bill connected the whitepaper, InterestMap: Harvesting Social Network Profiles for Recommendations, to the latest Google & MySpace partnership.

The abstract of the paper is as follows:

“While most recommender systems continue to gather detailed models of their “users” within their particular application domain, they are, for the most part, oblivious to the larger context of the lives of their users outside of the application. What are they passionate about as individuals, and how do they identify themselves culturally? As recommender systems become more central to people’s lives, we must start modeling the person, rather than the user.

In this paper, we explore how we can build models of people outside of narrow application domains, by capturing the traces they leave on the Web, and inferring their everyday interests from this. In particular, for this work, we harvested 100,000 social network profiles, in which people describe themselves using a rich vocabulary of their passions and interests. By automatically analyzing patterns of correlation between various interests and cultural identities (e.g. “Raver,” “Dog Lover,” “Intellectual”), we built InterestMap, a network-style view of the space of interconnecting interests and identities. Through evaluation and discussion, we suggest that recommendations made in this network space are not only accurate, but also highly visually intelligible – each lone interest contextualized by the larger cultural milieu of the network in which it rests.”
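The abstract speaks of “patterns of correlation” between interests without giving the math. One common way to score how strongly two interests go together is pointwise mutual information (PMI); whether this matches the paper’s exact procedure is an assumption on my part. A minimal sketch, using made-up toy profiles and a hypothetical function name `pmi_scores`:

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(profiles: list[set[str]]) -> dict[tuple[str, str], float]:
    """Score each co-occurring pair of interests by PMI = log(P(a, b) / (P(a) * P(b)))."""
    n = len(profiles)
    single = Counter()  # how many profiles mention each interest
    pair = Counter()    # how many profiles mention both interests of a pair
    for interests in profiles:
        single.update(interests)
        # Sort so each pair is always stored in the same (a, b) order.
        pair.update(combinations(sorted(interests), 2))
    return {
        (a, b): math.log((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }


# Toy data, invented for illustration only.
toy_profiles = [
    {"blues", "guitar"},
    {"blues", "guitar"},
    {"hockey", "guitar"},
    {"hockey"},
]
for interest_pair, score in pmi_scores(toy_profiles).items():
    print(interest_pair, round(score, 3))
```

A positive score means two interests appear together more often than chance would predict; a negative score means less often. Pairs that never co-occur are simply absent from the result.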

Bill’s example of how information gathered from social networks can be used:

* A user lists their favorite musicians in their MySpace account as John Lee Hooker, Muddy Waters and Bessie Smith.
* That user searches on Google for “blues.”
* Google can now serve personalized search results oriented entirely toward blues music.
* If another user is a member of the Brett Hull Fan Club, lists hockey as a hobby on their profile and lives in St. Louis, Google can serve personalized results focused on the St. Louis Blues hockey team.
* Of course, such personalization can be extended to Google Video advertisements.
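To make the “blues” example concrete, here is a minimal sketch of profile-based re-ranking. It assumes every search result carries a list of topic tags and that results are boosted by how many tags overlap with the interests in a user’s profile. The function name `personalize`, the tags and the results are all hypothetical, and nothing here reflects how Google actually personalizes results:

```python
def personalize(results: list[dict], interests: set[str]) -> list[dict]:
    """Move results whose topics overlap the profile's interests to the top."""
    def overlap(result: dict) -> int:
        return len(interests & set(result["topics"]))

    # sorted() is stable, so results with equal overlap keep their original order.
    return sorted(results, key=overlap, reverse=True)


# Invented results for the query "blues".
blues_results = [
    {"title": "St. Louis Blues hockey schedule", "topics": ["hockey", "st. louis"]},
    {"title": "History of blues music", "topics": ["music", "blues music"]},
    {"title": "Shades of blue", "topics": ["color"]},
]

music_fan = {"music", "blues music"}  # e.g. lists Muddy Waters on a profile
hockey_fan = {"hockey", "st. louis"}  # e.g. Brett Hull Fan Club member

print([r["title"] for r in personalize(blues_results, music_fan)])
print([r["title"] for r in personalize(blues_results, hockey_fan)])
```

The same query returns the music page first for one user and the hockey page first for the other, which is the whole idea behind the example.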

So, to wrap up this review of the Google Neven Vision acquisition, my thoughts, and the possibilities of applying Neven Vision’s technology to Google’s existing efforts, I’d like to add one more point. Suppose social networking is the future of online communication, and Google is going to use social network profile information to personalize its search results. On some social networks, Google has no way of identifying the owner of a profile, because it has no advertising in the network or no partnership with it. So what is the one object or identifier that most members of social networks, blogs and forums post, and that Neven Vision could help Google use to define that user? Their photo.
