Archive for March, 2007

Relevance results in raised revenues, but history could be history

Thursday, March 29th, 2007

Just three days ago, David Kaplan wrote about the incredible growth in Yahoo’s click-through rates since implementing Panama, its new ad relevance system.

Click-through rates on Yahoo ads rose 5 percent in the week after Panama’s debut on Feb. 5 from the week before, according to ComScore. The rate rose 9 percent the following week.

An increase in click-through rates approaching double digits means big money for Internet search giants these days. According to a Bloomberg survey of analysts referred to in an article by Jonathan Thaw, Yahoo’s looking to post net sales of $5.47 billion this year, and the jump in click-throughs could translate to 20% growth in revenue from searches in the second half of the year.

A similar jump for Google would have even greater impact: the same survey is suggesting the search behemoth may post net sales of $12 billion, more than twice that of Yahoo. My phenomenal powers of intuition tell me this may be a key motivator behind Google’s push towards personalization.

Panama is obviously showing its effectiveness, particularly compared to Yahoo’s old system. But the software, like Google’s personalization technology, still relies on external indicators to gauge relevance. Thaw explains it:

Panama makes ads more relevant to search queries and more likely to be clicked on. It does this by taking into account how many times ads are clicked, a measure of their usefulness, as well as the price companies bid for their spot on the search screen. Previously, Yahoo ranked ads based solely on the price bids.
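To make the difference concrete, here is a minimal sketch of the two ranking rules Thaw describes. The scoring formula (bid multiplied by click-through rate) is my assumption, a common simplification rather than Yahoo’s published method, and the ad names and numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str          # hypothetical advertiser label
    bid: float         # price bid per click, in dollars
    clicks: int        # times the ad was clicked
    impressions: int   # times the ad was shown

    @property
    def click_through_rate(self) -> float:
        # Guard against division by zero for ads that were never shown.
        return self.clicks / self.impressions if self.impressions else 0.0


def rank_by_bid(ads):
    """Old approach as described: order ads by bid price alone."""
    return sorted(ads, key=lambda ad: ad.bid, reverse=True)


def rank_by_bid_and_clicks(ads):
    """Assumed scoring rule: bid multiplied by click-through rate."""
    return sorted(ads, key=lambda ad: ad.bid * ad.click_through_rate, reverse=True)


ads = [
    Ad("high-bid, rarely clicked", bid=2.00, clicks=10, impressions=1000),
    Ad("mid-bid, often clicked", bid=1.00, clicks=50, impressions=1000),
    Ad("low-bid, sometimes clicked", bid=0.50, clicks=30, impressions=1000),
]

bid_only = [ad.name for ad in rank_by_bid(ads)]
bid_and_clicks = [ad.name for ad in rank_by_bid_and_clicks(ads)]
print(bid_only)
print(bid_and_clicks)
```

Under the bid-only rule the highest bidder sits on top no matter how rarely it is clicked. Once clicks count, the ad people actually use moves ahead of it.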

Yahoo’s looking at history and demand. Google’s looking at history, context, and demographic. Neither of them is getting to the heart of the matter—which is to say, neither of them is getting to the heart of the user.

History-based relevance relies on you having visited multiple times already. You have to have searched for beetle 10 times and clicked on the car 10 times for it to understand you don’t want the insect. Given the range of topics you may be searching on, that’s a massive amount of information to have to build in order to make even half of your searches relevant.

History’s effectiveness lies in its familiarity. You arrive someplace you recognize; it feels comfortable.

Imagine, though, if you arrived someplace new, and felt like you were already at home. That’s the effect you get from a technology that understands who you are at your core.

By seeking to understand your inner drivers, your purpose and values, VortexDNA doesn’t rely on the previous sites you’ve visited. Instead, it suggests those sites that are most aligned with the mathematical expression of who you really are.

Right now, MyWebDNA is showing that it can improve the relevance of Google search results. By the figures Thaw cites, even a 2% increase on Google’s projected $12 billion in net sales would come to roughly $240 million for the search giant.

Google, don’t you find that kind of relevance… relevant?

Don’t take it personally, Google

Wednesday, March 28th, 2007

The hot buzz on the online street these days is about Google’s personalized search, announced last month by Sep Kamvar, Google’s Engineering Lead for Personalization:

… over time, as the search engine learns your preferences, you’ll see it. For example, I (Sep) am an avid Miami Dolphins fan (no joke). Searching for [dolphins] gives me info about my favorite football team, while a marine biologist colleague gets more information about her salt-water friends.

Google’s announcement served as a catalyst for unleashing some deep-rooted emotions about personalization. Gord Hotchkiss wrote a terrifically well-balanced article weighing the pros and cons, which provoked some serious comments from his readers:

I already hate it when Google pushes me to German Google due to IP geolocation. I prefer to see English Google.com, as I chose to enter “google.com” in the browser… I don’t want Google to be stuck in my town, in my country, in my past, or in my belief system. Because when I use a search engine, then I want to precisely expand my horizon, not be limited to it. I precisely *want* to learn about when a word is ambiguous in other cultures, to better understand other cultures. I *want* to *accidentally* stumble upon new communities or unknown zones of the web.

We’re a funny bunch, we humans. If I return to a previously visited hotel, and the person behind the desk says, “Welcome back, Ms Colbin,” I feel like royalty. Never mind that an invisible computer screen is flashing, “SHE HAS BEEN HERE BEFORE - SAY ‘WELCOME BACK, MS COLBIN.’” But, like Gord’s reader, I get touchy about ‘automated’ personalization. At the heart of my concern is something that holds true for people as much as machines:

I don’t like assumptions made about me.

I also don’t like to be told what I don’t like. Or like. Particularly if the statement is incorrect, or based on faulty logic. “Oh, you’re from New York, so you must be into fashion.” What?

People also tend to get touchy about the lack of transparency of the personalization, with lots of blog posts pointing out the difficulty of seeing whether you’re signed in to Google. As much as we like to be recognized and understood, we don’t like to be out of control when it comes to our decisions.

In Gord’s post, he describes the three ways search is being personalized: history, context, or demographic. The issue with these three tactics is that they all run a serious risk of faulty logic, the type that shows up all the time if you’ve ever bought a gift on Amazon. That one purchase will haunt your ‘personalized’ recommendations for the rest of your days.

VortexDNA’s approach can be considered a fourth tactic: attempt to understand, without making assumptions, the true factors that drive your behavior. Distill your core purpose and values into a mathematical algorithm that has been proven to translate to more relevant search results.

Gord’s reader will be happier, because the algorithm understands his deep desire to explore new worlds, and doesn’t limit search results based on his geographic location.

MyWebDNA demonstrates the technology by circling the two results on a Google search page most aligned with who you really are. It doesn’t change what shows up on the page—it merely points out the results you’re more likely to care about. So here’s my challenge to Google:

Set up a beta search page that produces two columns of results. On the left side, show results from regular Google search. On the right, show the results from Google search enhanced by VortexDNA. Then let the people decide. If, as Sep says, Google’s goal is to give you exactly the information you want when you want it, this is a great way to get there.

Personally speaking, I think it will be a big step in the right direction.

Web 3.0 searches for meaning

Saturday, March 17th, 2007

When I first started writing about Web 3.0 not so very long ago, I was surprised by the amount of ire people had towards the phrase ‘Web 3.0’. Tired of hype, sick of overblown promises of the Next Greatest Thing, faithful Web 2.0 users patrolled the Net, ridiculing or removing premature references to yet another sequel.

As Thomas Claburn reported in Information Week this past Thursday, Web 3.0 has won its most recent battle:

Up until last month, Web 3.0’s future was in doubt. Wikipedians were divided about the legitimacy of the concept, and those skeptical of the term deleted the Web 3.0 entry from the online encyclopedia five separate times during 2006. After this series of near-death experiences, the article was put under protection last October…

In February, a deletion review for the entry concluded, with the majority of Wikipedia contributors voting to accept the legitimacy of the term.

The watershed moment may well have come last November when New York Times reporter John Markoff legitimized Web 3.0 in an article that described the term as a movement to add meaning and structure to the vast amount of information on the Web.

Some may say that this argument is about semantics, not substance. Substance is created from semantics, though. Pulitzer Prize winner William Safire is certainly skilled and knowledgeable enough to write on just about any topic; he has chosen to spend the past 28 years writing the “On Language” column in the New York Times Sunday Magazine. I suspect that he, better than anyone, understands that the person who defines the words is the one who holds the cards.

I did look it up to see what Merriam-Webster had to say:

se·man·tic (si-’man-tik) adjective of or relating to meaning in language

What could be more substantial than meaning?

In the case of Web 3.0, people are irritated by the sheer cheekiness of it, coupled with the lack of a clear, tangible definition that everyone can grasp. Among the many positive responses to Stephen Baker’s post about Web 3.0 last October were some that were positively vitriolic, such as:

If people don’t stop using this useless marketing term: “Web 2.0”, let alone plugging a “newer, improved” Web 3.0 term, then the world will probably implode.

and

Don’t you think we could finish wrapping our heads around and implementing web 2.0, learning about its standards and methods before you go on jibbering about web 3.0?

What are the other options, though, to refer to web apps that successfully integrate relationships, allowing, as Claburn pointed out, the possibility that

someone querying a Star Wars database that supports semantic protocols could search for “Darth Vader’s son’s sister” and would find documents relating to Princess Leia, despite the absence of that specific phrase in any of the found documents?
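For the curious, Claburn’s example can be sketched in a few lines. This is a toy illustration only: the fact table, the relation names, and the two documents are all invented here, and a real semantic store would use a standard protocol rather than a Python dictionary.

```python
# Facts stored as (subject, relation) -> set of objects. Toy data, invented for illustration.
facts = {
    ("Darth Vader", "son"): {"Luke Skywalker"},
    ("Darth Vader", "daughter"): {"Princess Leia"},
    ("Luke Skywalker", "sister"): {"Princess Leia"},
    ("Princess Leia", "brother"): {"Luke Skywalker"},
}

# Neither document contains the phrase "Darth Vader's son's sister".
documents = {
    "doc1": "Princess Leia leads the Rebel Alliance.",
    "doc2": "Luke Skywalker trains on Dagobah.",
}


def resolve(start, relations):
    """Follow a chain of relations from a starting entity."""
    current = {start}
    for relation in relations:
        reached = set()
        for entity in current:
            reached |= facts.get((entity, relation), set())
        current = reached
    return current


def search(start, relations):
    """Return ids of documents that mention any entity the chain resolves to."""
    entities = resolve(start, relations)
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(entity in text for entity in entities)
    )


entities = resolve("Darth Vader", ["son", "sister"])
hits = search("Darth Vader", ["son", "sister"])
print(entities, hits)
```

The search never matches the words of the query. It walks the relationships first, lands on an entity, and only then looks for documents about that entity.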

The only one that seems to have gained any traction is Tim Berners-Lee’s ‘Semantic Web’, which may take off because it sounds intelligent. The issue with that particular turn of phrase is that some people don’t know what it means, and others—the ones who like to say, “Now you’re just arguing semantics,”—imbue it with a negative connotation.

Won’t you help us, Mr Safire?

Freebase connects, but don’t forget the users

Wednesday, March 14th, 2007

Last week, John Markoff had this to say in The New York Times:

A new company founded by a longtime technologist [Danny Hillis] is setting out to create a vast public database intended to be read by computers rather than people, paving the way for a more automated Internet in which machines will routinely share information…

On the Web, there are few rules governing how information should be organized. But in the Metaweb database, to be named Freebase, information will be structured to make it possible for software programs to discern relationships and even meaning.

For example, an entry for California’s governor, Arnold Schwarzenegger, would be entered as a topic that would include a variety of attributes or “views” describing him as an actor, athlete and politician — listing them in a highly structured way in the database.

A highly interesting concept, indeed. Certainly interesting enough to be picked up by Web guru Tim O’Reilly, who commented:

If Metaweb gets this right, this bottom up approach will build new connections between data, new categories and ways of thinking. It will likely be messy and contradictory for a while, but … they are building new synapses for the global brain.

One of the beauties of the Freebase proposition is its simple complexity. By allowing computers to do all the back-filling, the database grows far faster than it would if every page were edited by hand. I don’t have to go to the New York page and say I’m from there and then to the New Zealand page and say I live there. I go to my page, say I’m from New York and live in New Zealand, and, bada bing bada boom, all three pages get updated.
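Here is a rough sketch of that back-filling idea, under my own assumptions: the page store, the relation names, and the inverse mapping are hypothetical and are not taken from Freebase itself.

```python
from collections import defaultdict

# Each page maps a relation name to the set of pages it points at.
pages = defaultdict(lambda: defaultdict(set))

# Hypothetical inverse relation names, used to back-fill the page on the other end.
INVERSE = {"born_in": "birthplace_of", "lives_in": "home_of"}


def assert_fact(subject, relation, obj):
    """Record a fact once and back-fill its inverse on the target page."""
    pages[subject][relation].add(obj)
    pages[obj][INVERSE[relation]].add(subject)


# Two statements made on one page...
assert_fact("Me", "born_in", "New York")
assert_fact("Me", "lives_in", "New Zealand")

# ...and three pages now carry the information.
for title in sorted(pages):
    print(title, {relation: sorted(targets) for relation, targets in pages[title].items()})
```

The person editing only ever touches their own page. The inverse entries are the part the computer fills in.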

The other, more important, beauty is Freebase’s recognition that relationships are as important as, if not more important than, the things themselves. We as humans can only hope to understand the world through relationships. Think about how you describe, well, ANYTHING: “She’s Joe’s ex-girlfriend.” “The theater is two blocks west of State Street.” “This new database is better than the old ones.”

Relationships, always relationships. Even when we think we’re perceiving things independently, we’re comparing. “That person is skinny.” Relative to whom? “The movie was great.” In whose opinion?

If a tree falls… my friends, in the polarized field of our existence, we can only be in relation to something else.

So, yes, I applaud Freebase and Danny Hillis. As with Wikia’s search effort, though, Hillis remains focused on using the user in order to create the best possible database. The focus is on the database.

But the user is the reason for our work. The user is who benefits from the database. The user is at the center of the semantic web.

The focus is on the database because, as complex as it is to create, it’s still easier than understanding what drives a human being, which is, of course, the unique value proposition of VortexDNA.

I’m sure Danny Hillis and Jimbo Wales and Tim O’Reilly, none of whom is any slouch in the intellectual department, can see the possibilities of adapting search to the user. Between all of us, we’re nibbling at the edges of the semantic Web, nosing at the door. In the warped space-time continuum that is Internet development, it will be mere moments before our visions are reality.

Let’s just make sure we bring the user with us.

Wikia Search and the value of values

Saturday, March 10th, 2007

The beauty of Wikipedia is that it capitalises on a billion small efforts to make a huge, spectacular resource. And now Wikia, the company co-founded by Jimmy Wales (who created Wikipedia), is looking to apply the same principles to search. According to Jonathan Thaw of Bloomberg News:

Wikia Inc., the San Mateo company co-founded by Wikipedia creator Jimmy Wales, plans to challenge Google Inc. and Yahoo Inc. with a search engine that lets users edit and fine-tune its results…

By enlisting programmers and users around the world, Wikia is taking a different approach than Mountain View-based Google and Sunnyvale-based Yahoo, owners of the two most-popular search engines, which keep much of their software code secret…

“We think it’s the sort of thing that shouldn’t be controlled by one company or one group of companies,” [Gil Penchina, chief executive officer of Wikia] said.

Wikia users will collaborate to build an index of Web sites that anyone can edit. They also will be able to fix search results if they don’t give useful information, he said.

After I read that intriguing bit of news, I scurried to the Wikia site to find out more. Jimbo Wales had this to say:

Search is part of the fundamental infrastructure of the Internet. And, it is currently broken… I am looking for… community members who would like to help build people-powered search results…

So what is ‘people-powered’ exactly? I remember when I first heard about Wikipedia: it didn’t make any sense. “You mean anyone can make any change at any time? How come it doesn’t end up as gibberish?” It was patiently explained to me, as I in turn explained to others, that the people who care most about a thing also tend to be the people with the most knowledge, as well as the people with the most passion for making sure the information is correct.

Wales’ vision is a search engine run on the same principles as Wikipedia. I’m certain it will work, too, to an extent. But I see some synergy here between VortexDNA and the Wikia search project. Here’s why: with Wikipedia, there’s an underlying assumption that what people are after is facts, and that there is, after all, only one set of facts that applies to a given situation. Therefore, the more people (and experts) who view a page, and the more tweaks they make to it, the closer to the ‘facts’ that page will become.

Search is different. It has huge barriers in place between the searcher and the ‘facts’. IBM is trying to crack the image-search barrier with its Marvel technology. VortexDNA aims for a bigger barrier: the barrier of popularity. Google’s page-ranking system is essentially a highly complex calculation of popularity. A search engine powered by millions of Wikia members will also produce results based on popularity.

Popularity, though, is nothing more than an illusion. Surely you can think of something wildly popular that just doesn’t suit you (Harry Potter? Cricket? iPods?). Traditional search is a numbers game: you go for the masses and most people will go away happy. VortexDNA is taking a different approach: individuals should get different responses based on who they are, what their core purpose and values are, and what they, in particular, are drawn to.

Right now, the VortexDNA technology is being validated through the MyWebDNA plug-in for Firefox: you answer a short series of questions, a mathematical algorithm calculates a numerical profile, and, the next time you run a Google search, the plug-in circles the two answers most relevant to you.
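The actual algorithm isn’t spelled out here, so the following is only an illustration of the general shape of the idea. It assumes, purely for the sake of the sketch, that the profile and each search result can be expressed as lists of numbers and compared by cosine similarity. Every name and number below is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length lists of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_results_to_circle(profile, results, count=2):
    """Return indexes of the `count` results closest to the profile, in page order."""
    ranked = sorted(
        range(len(results)),
        key=lambda i: cosine(profile, results[i][1]),
        reverse=True,
    )
    return sorted(ranked[:count])


# Hypothetical numerical profile produced from the questionnaire.
profile = [0.9, 0.1, 0.4]

# Search results in the order the search engine returned them, each with made-up scores.
results = [
    ("result A", [0.1, 0.9, 0.2]),
    ("result B", [0.8, 0.2, 0.5]),
    ("result C", [0.2, 0.8, 0.1]),
    ("result D", [0.9, 0.0, 0.3]),
]

circled = pick_results_to_circle(profile, results)
print([results[i][0] for i in circled])
```

Note that the page order is left alone. The sketch only picks which two entries to highlight, which matches how the plug-in is described.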

Imagine the power of combining this type of personalization with the indexing and categorization of a Wikia search project. The Wikia model produces highly robust raw data, while VortexDNA guides you to those sites that you’re really going to care about.

Sounds like a match made in heaven to me.

BBC and IBM open the door to the Semantic Web

Tuesday, March 6th, 2007

Yesterday, the Guardian reported this:

The BBC has struck a partnership deal with IBM to develop “web 3.0” technology, starting with a video search system for CBeebies and CBBC programmes, MediaGuardian.co.uk can reveal.

…The idea is that the system being developed with IBM, called Marvel, will deliver a mass of relevant images and videos when content is searched.

What’s this? Intelligent search of images? We all know that a computer can’t tell the difference between a picture of a dog and a picture of a rabbit.

Not true, claims IBM, which first announced the Marvel technology in late 2004. CNET described how the image search works:

Marvel largely relies on a technology called support vector machines, pioneered by Vladimir Vapnik at AT&T about a decade ago. In this type of artificial intelligence, a computer learns to assign the equivalent of a yes or no value to a piece of data. In other words, if the computer is supposed to distinguish between an indoor or outdoor scene, trees in a shot could well prompt the computer to put the clip in the outdoor bucket.
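To show what “learning a yes or no value” can look like, here is a minimal sketch of a linear support vector machine trained by subgradient descent on the hinge loss. It is not IBM’s implementation. The two features (the fraction of green pixels and the fraction of sky-blue pixels in a frame) and all the numbers are invented for illustration.

```python
def train_linear_svm(samples, labels, epochs=200, learning_rate=0.1, reg=0.001):
    """Fit weights and bias by subgradient descent on the regularized hinge loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            if margin < 1:
                # Inside the margin or misclassified: move toward the sample.
                weights = [w + learning_rate * (y * xi - reg * w)
                           for w, xi in zip(weights, x)]
                bias += learning_rate * y
            else:
                # Correct with room to spare: only apply the regularization shrink.
                weights = [w - learning_rate * reg * w for w in weights]
    return weights, bias


def classify(weights, bias, x):
    """Map the signed score to the yes-or-no answer."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return "outdoor" if score >= 0 else "indoor"


# Hypothetical features per frame: [fraction of green pixels, fraction of sky-blue pixels].
samples = [
    [0.6, 0.3], [0.1, 0.0], [0.5, 0.4], [0.05, 0.1],
    [0.7, 0.2], [0.2, 0.05], [0.4, 0.5], [0.0, 0.0],
]
labels = [1, -1, 1, -1, 1, -1, 1, -1]  # +1 = outdoor, -1 = indoor

weights, bias = train_linear_svm(samples, labels)
leafy_clip = classify(weights, bias, [0.55, 0.35])
studio_clip = classify(weights, bias, [0.05, 0.05])
print(leafy_clip, studio_clip)
```

A clip full of green and blue lands in the outdoor bucket, and a clip with almost none lands indoors. The hard part in practice is choosing features that mean something, which this toy version simply assumes.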

Do we believe them? It’s early days yet, but there are lots of people heading in that direction. Amazon’s Mechanical Turk is using humans to accomplish tasks like distinguishing outside trees from trees in pots.

This is what we’re after, what the Semantic Web is about. Intelligence. Search that understands you. Search that thinks the way you think. After all, the only way the utility of available information can even approximate keeping up with its growth is by making it more and more efficient to access the stuff you really need.

That’s where MyWebDNA comes in, too: same direction, different approach. It overlays a mathematical profile of your core values on a Google search page to circle the two results most relevant to you. Because the technology behind it is a universal measure of relevance, it could be developed to integrate with video search as well. Greater ability to categorize items + personalization = Semantic Web.

That, my friends, is where we’re headed. That’s the future of the Internet: responsive, intelligent, connected. Make no mistake, it will happen, and, when it does, we’ll all wonder how we ever survived without it.