http://blogs.wsj.com/digits/2010/07/30/analyzing-what-you-have-typed/

July 30, 2010

Analyzing What You Have Typed

By WSJ Staff

In a study of surveillance technology deployed by companies on the Internet, the Journal found that marketing-technology firm Lotame Solutions captures in real time what people are typing on a site and analyzes it.

Behind that analysis are years of computer-science research into artificial intelligence and the understanding of human language -- now being mobilized in the service of tracking consumers on the Internet. After Lotame collects a Web user's words, it sends the text to a U.K. company called OpenAmplify. OpenAmplify's software reads the content and determines what the writer is saying -- what topics are being discussed, how the author feels about those topics, and what the person is going to do about them.

It's a process that seems natural for humans, but for computers -- not so much.

Take the following sentence: "The man ate a fish on a bicycle." "We know from logic derived as a human that it's the man who must be on the bike and eating a fish," said OpenAmplify CEO Mark Redgrave. "But from a pure linguistic analysis point of view, that is really tricky."
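To see why the sentence is tricky for software, consider a minimal sketch in Python. It is not OpenAmplify's method, which the company did not describe. The toy grammar, the word list and the function name `parse` are all assumptions made up for illustration. The sketch simply enumerates every structure a small set of grammar rules allows for the sentence.

```python
from functools import lru_cache

# Hypothetical toy grammar, invented for this illustration only.
# Each rule reads: parent -> left child, right child.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # "a fish [on a bicycle]": the phrase describes the fish
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # "[ate a fish] on a bicycle": the phrase describes the eating
    ("PP", "P", "NP"),
]

# Hypothetical word list covering only the example sentence.
LEXICON = {
    "the": "Det", "a": "Det",
    "man": "N", "fish": "N", "bicycle": "N",
    "ate": "V", "on": "P",
}


def parse(sentence):
    """Return every parse tree (as a bracketed string) the toy grammar allows."""
    words = tuple(sentence.lower().rstrip(".").split())

    @lru_cache(maxsize=None)
    def trees(symbol, start, end):
        # A single word matches only its own part of speech.
        if end - start == 1:
            word = words[start]
            return (f"({symbol} {word})",) if LEXICON.get(word) == symbol else ()
        found = []
        for parent, left, right in BINARY_RULES:
            if parent != symbol:
                continue
            # Try every way of splitting the span between the two children.
            for split in range(start + 1, end):
                for left_tree in trees(left, start, split):
                    for right_tree in trees(right, split, end):
                        found.append(f"({symbol} {left_tree} {right_tree})")
        return tuple(found)

    return list(trees("S", 0, len(words)))


if __name__ == "__main__":
    for tree in parse("The man ate a fish on a bicycle."):
        print(tree)
```

Under these assumptions the grammar permits two structures: one in which "on a bicycle" attaches to the fish, and one in which it attaches to the act of eating. Nothing in the rules says which is right. Choosing between them takes the kind of real-world knowledge Mr. Redgrave describes.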

This kind of problem is one that OpenAmplify's 25 researchers have been tackling for the past decade.

Hapax Limited, the company behind OpenAmplify, has been doing such "natural language processing" since 2000, Mr. Redgrave said. In 2002 it began "buzz monitoring" -- analyzing news articles for big brands that wanted to keep track of what was being said about them. Two years ago, the company launched OpenAmplify, which allows customers to send any text through a Web service to be analyzed. Mr. Redgrave said the service now analyzes text "hundreds of millions of times per day."

"We can understand the specific things [Web users] are talking about. We can understand exactly how they feel about those things," he said.

And what about the privacy of those feelings?

"We carry no private data through our platform," he said. "The content we analyze is completely anonymous. It's up to the person submitting the text to ... associate the text with a particular user on their end if they so wish."

And as the growth of online communication continues apace, the opportunities for natural-language processing and analysis are only increasing.

"Social media is an amazing opportunity," Mr. Redgrave said. "For the first time in marketing history we have hundreds of millions of people online telling us what they like, what they hate and what they're going to do before they do it ... That's extremely valuable data."

-- Julia Angwin and Jennifer Valentino-DeVries