Google’s latest acquisition is reCAPTCHA, a company specializing in technology for fighting spam and fraud on websites. Aside from using reCAPTCHA for what it is supposed to do, Google will also use the technology in its book-scanning project, specifically to improve the OCR (optical character recognition) of the scanned pages.
reCAPTCHA’s technology will help improve the OCR process, which converts images of scanned book pages into machine-readable text. In addition, reCAPTCHA would allow Google to scan books on a larger scale, which is what its Google Books and Google News Archive projects need right now.
reCAPTCHA provides most of the words that we see on websites running CAPTCHA technology whenever we are required to sign in or register. Probably unknown to most of us is the fact that the words in these CAPTCHAs actually come from scanned books and newspapers.
Google knows this well, which is why it is acquiring reCAPTCHA in the hope of improving the accessibility and availability of the information contained in the books being scanned.