Archive for May, 2007

The new math of search valuation

Wednesday, May 30th, 2007

There’s been a bit of semantic confusion about the validated VortexDNA results.

What we have proven, and a third party has independently verified, is that a user is 14% more likely to click on a link with a high VortexDNA relevance score than on a link with a low one.

We made the mistake in our original press release of translating that directly into a 14% increase in clicks. That 14%, though, is relative to the clicks the search engine was already generating. In Google’s case, the math works out to a 3% increase in clickthroughs.

We’re sorry for any confusion caused by our original release. You can read the updated release here or just email us if you’re interested in learning more: kaila@vortexdna.com.

In the meantime, there’s been lots of conversation in the blogosphere about how to value the search market. Don Dodge values each percentage point of search market share at over $100 million in revenues and over $1 billion in market cap. Emre Sokullu builds further on Don’s math to come to the conclusion that Google makes $1 per month for every Internet user in the world.

Our release references Jonathan Thaw from Bloomberg, who indicates that a 10% increase in clickthroughs will be responsible for an additional 5% revenue growth for Yahoo. Using Thaw’s rationale, we extrapolated that a 14% increase in relevance would result in a 3% increase in clickthroughs for Google, or an additional $300 million in revenue.
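For anyone who wants to play with the numbers, here is a minimal sketch of the scaling step in Thaw's rationale. It assumes the 10%-to-5% relationship is linear and carries over from Yahoo to other engines, which is our extrapolation rather than something Thaw states. The function and constant names are made up for illustration.

```python
# Thaw's rule of thumb as quoted above: a 10% rise in clickthroughs
# goes with roughly 5% additional revenue growth (stated for Yahoo).
THAW_CLICK_LIFT_PCT = 10.0
THAW_REVENUE_GROWTH_PCT = 5.0


def revenue_growth_pct(click_lift_pct: float) -> float:
    """Scale a clickthrough lift to revenue growth.

    Assumption: the relationship is linear, so any lift is
    multiplied by the same 5/10 ratio.
    """
    ratio = THAW_REVENUE_GROWTH_PCT / THAW_CLICK_LIFT_PCT
    return click_lift_pct * ratio


if __name__ == "__main__":
    # A 3% clickthrough lift under the linear assumption.
    print(revenue_growth_pct(3.0))
```

Under that assumption, a 3% lift in clickthroughs corresponds to about 1.5% additional revenue growth. Turning that into a dollar figure depends on the revenue base you start from.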

What’s your perspective on the value of relevance?

Solve for semantics at the search engine level

Tuesday, May 29th, 2007

I’ve put up a few posts about the controversial semantic web or ‘Web 3.0’. Most people have a gut reaction that the concept is buzzword-heavy and lacking in practicality, or even a clear definition. Dr Riza C. Berkan summed up the issues today with intellectual rigor in a ReadWriteWeb post:

The two basic views of a semantic search are identified by the location of the semantic resources to be implanted. The first view is to embed the semantic resources in the Web pages themselves. It is called the “Semantic Web”. Why not compose Web pages in a structure that is semantics friendly?

…The “Semantic Web” approach has been around for a long time now. Unfortunately, it is based on an unrealistic assumption that every Web author will abide by the complex rules of semantics - not to mention the education it requires - and place content in the correct buckets of mysteriously unified standards. Another form of this approach may be to design Web factories that crank out refined Web pages once fed by ordinary Web pages. Of course if there is more than one factory, you have the standards issue again. In this day and age of fast content production, the Semantic Web seems to be more idealism than realism.

Dr Berkan goes on to discuss the advantages of focusing efforts on understanding the user at the search stage:

Without relying on statistics, long-tail queries can be analyzed by semantic algorithms on the fly, and bring search results with the accurate context… a semantic approach is very effective in handling dynamic content and can unleash its full power the second the content is born.

The argument, highly valid, is that it is easier to make one search engine intelligent than billions of web pages.

Dr Berkan’s company, hakia, offers a semantic search engine, as do Cognition Search and Lexxe. Powerset is working on theirs.

VortexDNA shares Dr Berkan’s view—in fact, we’re taking one step further away from the content. The idea behind MyWebDNA is not to create a new search engine, but a universal measure of relevance that can be overlaid onto any search engine.

Our tactics are different: the means of determining relevance can be through context, meaning, or, in our case, the purpose and values of the user. But our fundamental approach is the same: create the right lens, and the results will come into focus.

VortexDNA and MyWebDNA in the news

Monday, May 28th, 2007

Nearly every one of my posts references another piece of journalism, but it’s truly a delight to point you in the direction of an article written about VortexDNA and MyWebDNA. YahooXtra News commented on our validated search results with an article entitled ‘NZ start-up sees itself as the visa of the net’:

VortexDNA has developed a self-profiling system that will allow search engines like Google, or social sites like MySpace, to predict your search and hugely increase your “click rate” on links that come up as a result of an internet search.

I’m excited to see the media pick up on the story, which I had commented on in a previous post. The validation of the search results takes us from the watery realm of the hypothesis onto the firm ground of the business case, and getting there took a lot of brains, creativity and persistence from a lot of people (not me, I just write about it!).

Google says the answers lie in the individual

Monday, May 28th, 2007

In an opinion piece published by the Financial Times on Friday, Peter Fleischer had this to say:

…the same words can have very different meanings to different people depending on their background and their interests. It is the same idea that is driving Google’s personal search service.

The bulk of his article focuses on the privacy issues of personalized search, which makes sense since Peter is actually Google’s Global Privacy Counsel. But I choose to focus on this perhaps deceptively obvious gem: that identical words can have totally different meanings depending on the user.

That conundrum lies at the core of the predictive search challenge: how can a search engine possibly hope to know what you’re looking for, if what you’re looking for can’t be decisively defined by words?

Google’s answer is, of course, history and demographics. VortexDNA’s approach is core purpose and values—first understand who you are, then use that as a filter. Are there other solutions out there? And do you have any opinions on what might be the best one?

Forget about keywords—focus on the individual for relevance

Friday, May 25th, 2007

In an article aptly titled The Mind Blowing Evolution of the Social Web, Solomon Rothman of WebProNews had this to say about personalization:

Web 3.0 will see all the social, user generated, and independent content conglomerated, analyzed and spit out in ways that can be quickly and efficiently customized to what’s important to you. Relevancy will no longer be determined at the keyword level, but on the individual level. Smart services will actually understand what you like and will evolve as do your likings and importance. It won’t be artificial intelligence yet, but it will have enough data from enough places to be able to quickly learn about your habits through smart “agents.”

I say that the article is aptly titled because I agree that the power of the web is nothing short of mind-blowing. And Solomon’s take on what constitutes personalization fits hand-in-glove with the VortexDNA view of the world, particularly that one sentence, which I think bears repeating:

Relevancy will no longer be determined at the keyword level, but on the individual level.

Later in the paragraph, he says that the web will have data to learn about your habits. MyWebDNA, of course, is taking a different approach, operating on the premise that we can get as good or better results from focusing on who you are, rather than on your habits. But we’re all going in the same direction here, following our hypotheses towards greater relevance.

This is the natural direction of the web. It can’t be more content—we’ve already got content coming out of our ears. It can’t be more participation—YouTube, Flickr and Wikipedia have effectively ensured that the web is now thoroughly powered by its community.

No, the direction of the web has to be towards understanding. Google’s mission is ‘to organize the world’s information and make it universally accessible and useful’. But we all know that ‘the world’s information’ isn’t necessarily useful to me. If I devote the rest of my life to absorbing information, I’m still only ever going to access a tiny subset of all the information that’s out there. So in order for Google to fulfil its mission of making information universally accessible and useful, it has to make that information individually relevant.

This, I suspect, is the driver behind Google’s personalization push. I’d love to know what you think.

What is a long tail exactly and why should we chase one?

Wednesday, May 23rd, 2007

I’ve been hearing the phrase ‘chasing the long tail’ bandied about. It sounded sleek and dangerous to me, like something that should be happening on safari in Africa. But then Eric Enge at Search Engine Watch wrote a whole piece about the long tail that included this:

At Google’s Universal Search announcement, Udi Manber put up a slide that stated that 20% to 25% of the search queries Google sees every day are search queries it has never seen before. Let that sink in for a moment. To me, that number was startlingly large.

I found that comment so interesting that it made me want to understand the ‘long tail’ concept. Naturally, I Googled it, and found that the phrase has its origins in a book by Wired editor-in-chief Chris Anderson. Essentially, it means that the direction of our economy is AWAY from a few products with millions of consumers each and TOWARDS millions of products with a few consumers each:

[Image: The Long Tail]

Now, the reason I found Eric’s factoid so interesting is that VortexDNA, like so many others, is in the predictive search arena, aiming to improve the relevance of search results with MyWebDNA. One area in which MyWebDNA really excels, though, is in its ability to increase relevance precisely for searches that haven’t been done before, by basing the matches on the core purpose and values of the user rather than the user’s search history.

If 20 to 25% of Google search queries are new ones, MyWebDNA could be extraordinarily useful. I’d be keen to know what you think.