Archive for July, 2007

Defining relevance in search

Tuesday, July 24th, 2007

In yesterday’s post, I discussed Hamlet Batista’s seven reasons why search results aren’t more relevant. Today, I want to explore the heart of the question a bit further, namely:

What is relevance in search?

This is an important question, particularly for us here at VortexDNA, since the purpose of our technology is to improve relevance. If we’re not totally clear on what relevance is, though, how can we do our job?

In addition, although the definition of relevance might seem obvious, there’s some intrigue there if we dig a bit deeper. I commented yesterday that I agreed with Batista’s first and most important point:

Relevance is subjective.

If we’re in agreement on that (and if any of you disagree with the above statement, I’d love to hear from you), then how can we define it? It’s like saying you have a company that can increase beauty in the eyes of the beholder.

If relevance is subjective, then how can you claim that the ideal would be 100% precision (efficiency in returning only relevant results) and 100% recall (inclusion of all relevant results)?

In fact, how could you ever begin to measure recall? It would be crazy. First, you’d have to get all possible results in the world for a particular search, whether or not they’re included in the results generated by a particular search engine. Then you’d have to find out which of those results are relevant—and because relevance is subjective, that answer will be different for every user. Finally, you’d have to check all relevant results for a particular user against the complete set of results returned by a given search engine. That comparison would only give the recall for that query/user combination.
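To make the query/user point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the result names, the two users, and their relevance judgments are invented, and `precision_recall` is a hypothetical helper, not anything a search engine exposes.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query and one user's relevant set."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical results an engine returns for the query "bones".
returned = ["dog-treats", "tv-episode-guide", "anatomy-atlas", "bbq-ribs"]

# Hypothetical judgments: each user decides what is relevant to them,
# including pages the engine never returned.
relevant_by_user = {
    "dog_owner": {"dog-treats", "chew-toys"},
    "tv_fan": {"tv-episode-guide", "cast-interviews", "season-recap", "fan-forum"},
}

scores = {
    user: precision_recall(returned, relevant)
    for user, relevant in relevant_by_user.items()
}

for user, (precision, recall) in scores.items():
    print(f"{user}: precision={precision:.2f} recall={recall:.2f}")
```

The same result list scores differently for each user, and the recall figure depends on knowing every relevant page that exists, which is exactly the part nobody can realistically collect.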

No wonder Batista says users care more about precision than recall.

So what is relevance? A good old Google ‘define:’ search (sorry, Charles) yielded a stack of results, all of which had one thing in common:

Relevance is only relevance in relation to something or someone.

Personally, I believe that you can edit that sentence down further: relevance is only relevance in relation to someone. It always comes back to people. I don’t care that a given answer is relevant to my search string; in fact, it’s entirely likely that my search string isn’t the most appropriate one for the information I’m after. Are the results relevant to me? That’s what I’m after.

A while back I wrote about discovery in search. (Thanks again, David Berkowitz, for prompting the post.) Discovery is essentially that quality of being able to find what you didn’t know you were looking for, and it’s one of the driving forces behind the evolution of the Web.

Think back, and try to remember how much of what you looked at online in the past week you could have anticipated. Was any of your Web activity happenstance? A link sent by a friend to a site you didn’t know you’d visit? A sideways reference from one page to another? A search result you didn’t expect yet thoroughly enjoyed?

Discovery has another impact on increasing relevance. If people don’t know what they’re after, how can search engines know whether or not they’re delivering it?

Now let’s combine these two concepts: people don’t always know what they’re looking for, and what we want is different for each of us. So what is relevant? If we limit relevance to answers directly related to the specific query we’ve entered, we eliminate discovery. Limiting relevance to discovery would make search a free-for-all. I suggest that relevance could be defined as a set of qualities:

Qualities of relevance

  1. The human connection: relevant results connect to the searcher.
  2. The discovery angle: relevance can be expected or unexpected.
  3. The subjective nature: the degree of relevance changes from person to person and moment to moment.
  4. The measurement conundrum: the degree of relevance occurs along a spectrum that makes it impossible to achieve 100%.

What does the above mean for companies who work in the relevance space? It means that any technology that aims to improve relevance must be able to address its various facets: it must be able to deliver results that connect with the individual user, at that very moment the user is looking, and whether the user expects to find the results or not.

It also means that, like so much in life, perfection is unattainable. Our aim is to move incrementally and continually along the spectrum.

What do you think is the most important aspect of relevance? And do you think I’ve missed some of its qualities? I’d be delighted to hear from you, agreement and disagreement alike.

Relevance challenges for search engines

Monday, July 23rd, 2007

Hamlet Batista put up an excellent piece on SEOMoz a few days ago, entitled 7 Reasons Why Search Engines Don’t Return Relevant Results 100% of the Time. In it, he describes, funnily enough, the seven reasons why search engines don’t return relevant results:

  1. Relevance is subjective
  2. Natural language searches
  3. Poor queries
  4. Synonymy
  5. Polysemy
  6. Imperfect performance
  7. Spam

Batista really does a superb job of exploring each of these reasons; I’m just going to additionally touch on a few of them here.

1. Relevance is subjective
Let’s start with the first one, ‘relevance is subjective’. Batista describes it this way:

You can do a search for ‘coffee’ in Canada and find Tim Horton’s website as the most relevant. Makes sense, as that’s the most popular coffee chain in Canada, but for somebody in Seattle, Starbucks might be the most relevant result. You can do a search for the ‘49ers’ and be looking for the football team, but a historian may be looking for research material on California. And you might even do a search today for ‘bones’ trying to find where to buy your dog a treat, but tomorrow you do that same search looking for an episode of the TV series ‘Bones’ that you missed the night before.

…So far the best approaches the search engines have come up with are the use of human quality raters and personalized search. The better the search engines profile the searcher, the higher the chances of producing relevant results. This method obviously raises a lot of privacy concerns.

At VortexDNA, of course, we take the concept of subjective relevance a step further than location or job description; we suggest, and have shown, that it is profoundly affected by the user’s core purpose and values.

He also suggests that personalization inevitably leads to privacy concerns, but that is only true for methods that rely on tracking history and demographics. When values are used to calculate relevance, there’s no need to track search history or clickstream.

2. Natural language searches
Next on Batista’s list is the use of natural language in search queries:

A search engine, on the other hand, receives ‘who has smith as last name in chicago’ or ’smith last name chicago’. The query is in natural language — our language.

Is it, though? When was the last time you spoke with a person and said, ‘Smith last name chicago’? I submit to you that we are far more demanding of our search engines than we are of any human being. Look at Batista’s examples under the previous point about ‘bones’ and ‘coffee’. Would you go to an information desk and ask, ‘Bones?’ When they’re put forth as search examples, though, we don’t question them; it’s in fact a highly plausible scenario for us to throw a word or two at a search engine and then be disappointed when it’s unable to disambiguate our queries.

That’s not natural language; it’s unreasonable expectations. It also leads into Batista’s next point:

3. Poor queries
His description of poor queries includes colloquialisms (like ‘sucker’ for vacuum) and misspellings. As I stated above, I think poor queries also include minimalist terms and odd syntax. We couldn’t expect a human being to know what we were after with those words, but we do hope for a machine to guess our intent.

6. Imperfect performance
As I said, I’m only going to touch on some of the seven, so we’ll skip synonymy and polysemy and go straight to imperfect performance. Batista says that the two criteria that define search performance are precision and recall:

Precision is a measure of how efficient the search engine is in returning only the relevant results for the search. The more irrelevant results, the lower the precision. Recall, on the other hand, measures how good the search engine is in returning all the relevant results. (Of course, this assumes the researcher knows how many relevant results there are.) The more relevant results missing from the search, the lower the recall.

Ideally, a search engine should identify all relevant documents without returning any irrelevant ones (100% precision and 100% recall). In practice, this has been proven to be impossible, as precision and recall are inversely proportional.

It sounds like Heisenberg’s Uncertainty Principle, which states that

…it is impossible to perfectly measure a particle’s position and velocity at the same time. The more accurately you measure a particle’s position, the more inaccurate your measure of its velocity, and vice versa.

It may sound strange that I’m citing quantum physics when we’re talking about search engine performance, but I call parallels where I see ‘em, thank you very much.

The point here is not only that it appears to be impossible to achieve perfect precision and perfect recall simultaneously, but also that the aim should be to find the optimum tension or balance between the two. At what point does declining recall produce diminishing returns for incremental increases in precision, and vice versa?
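The tension can be seen in a toy example. The Python sketch below is only an illustration under stated assumptions: the ranked list and the set of relevant documents are made up, and `precision_recall_at_k` is a hypothetical helper that scores the top k results of one ranking.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall when only the top-k ranked results are shown."""
    top_k = ranked[:k]
    hits = sum(1 for doc in top_k if doc in relevant)
    return hits / k, hits / len(relevant)


# Hypothetical ranking from best to worst; letters are document ids.
ranked = ["a", "b", "x", "c", "y", "z", "d", "w"]
# Hypothetical ground truth for one user and one query.
relevant = {"a", "b", "c", "d"}

curve = [(k, *precision_recall_at_k(ranked, relevant, k)) for k in (1, 2, 4, 8)]

for k, precision, recall in curve:
    print(f"top {k}: precision={precision:.2f} recall={recall:.2f}")
```

In this made-up ranking, showing more results pulls recall up from 0.25 to 1.00 while precision slides from 1.00 to 0.50. Where to stop is the balance question, and the answer depends on how much each extra irrelevant result costs the user.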

I have some additional questions about these measures; namely, how precision and recall can be defined when we already know that relevance is subjective (see point 1). They do, however, serve as valuable parameters for putting search improvement efforts in context.

I really appreciate Batista’s skill in describing these seven challenges, and I believe we’re only scratching the surface here. What do you see as the biggest challenge search engines face in delivering the results you want?

Google’s Norvig sees same future as VortexDNA

Friday, July 20th, 2007

Peter Norvig, Google’s director of research, gave us a glimpse of where Google is focusing its efforts for the future in an interview with Kate Greene from MIT Technology Review this week:

The core of what we do is still search and advertising. A lot of researchers are working on that. They’re working to give better-quality search results and to match ads better. Another area of research is gathering more sources of information, such as text in books, still images, video, and now audio in terms of speech recognition. I think another focus is to understand how people interact with Google and interact with each other on the Web, in general. How do people operate in these social networks? Understanding that question can help us serve them better.

These statements don’t reveal anything new or secret about Google, but they do reinforce what we’ve been saying for some time: the relevance arena, which is where search quality and ad matching live, is a vital and vibrant piece of the search equation, and there’s still a lot of room to grow.

His point about understanding how people interact with Google and each other on the Web is also an important one, and I’m glad to see that he included it in his initial statement. In The first principle of search relevance, I discussed the need to focus on the user in order to deliver true relevance:

What is relevance, if not caring? What is relevance, if not a reflection of the user’s needs, wants, and values? Without caring about the user, our search for relevance would fall dramatically short.

It’s great that someone as senior as Norvig is reinforcing that idea. The purpose of technology is not technology. The ability to do things differently is not an imperative to do so. At the end of the day, the question is and must continue to be: how will this affect people?

One of the things I love about mywebDNA is that it brings those two concepts, relevance and people, together on so many levels. With mywebDNA, the users are the filter that determines relevance.

I recently did one of those team-building exercises where we got split into groups of three and given a bucket, a bit of PVC pipe, and two balls. Our first challenge was to use the pipe to try to whack the balls into the bucket. Our second was to use the bucket and the pipe any way we wanted to get the balls into the bucket. The second was easier, of course, and the message was that these two things made it easier:

  1. Usability, and
  2. Control.

It’s important to note that when I talk about relevance, I’m not suggesting we should only be allowed to see certain things. We should be able to see anything we want! We should have access to a Web that allows us excellent usability and total control over our own experience.

The fact remains that, at the end of the day, we’ll only be able to access the minutest sliver of what the world has to offer us, online or otherwise. In economics, people talk about scarce resources; in life, our attention is the scarcest resource of all. Personally, I like to direct mine towards things I care about.

Google does a brilliant job at indexing, ranking, and (if you’ve got it turned on and aren’t opposed to it on moral grounds) personalization based on geography and demographics. Those calculations spit out a result that they hope will appeal to you at the deepest level. With mywebDNA, though, who you are at the deepest level serves as the filter; relevant Google results are circled because you are who you are.

Later in the piece, Greene asks Norvig what the outstanding problems in search are. He responds:

In general, we think there are two aspects of it. One is understanding users’ needs more. The other is understanding the contents of documents, whether they be Web pages or video. Mostly we look at what the user types in, treat the input as individual words, and count them up on pages and weigh those pages with different kinds of evidence. But we don’t look only at words they type in. We also look at spelling variants, and if a user types in a long query, we break it into pieces. Maybe a user meant some words, but didn’t really mean others.
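The "treat the input as individual words and count them up on pages" idea can be sketched in a few lines of Python. This is a deliberately naive illustration, not Google's algorithm: the pages and the `score` function are invented, and the sketch leaves out the weighting by other evidence, the spelling variants, and the query splitting that Norvig mentions.

```python
from collections import Counter


def score(query, page_text):
    """Count how often each query word appears on the page (exact matches only)."""
    words = Counter(page_text.lower().split())
    return sum(words[term] for term in query.lower().split())


# Hypothetical pages, keyed by a made-up page name.
pages = {
    "smith-family-chicago": "smith family history chicago smith genealogy",
    "chicago-pizza": "best deep dish pizza in chicago",
    "blacksmith-guide": "how to become a blacksmith",
}

query = "smith last name chicago"
ranking = sorted(pages, key=lambda name: score(query, pages[name]), reverse=True)

for name in ranking:
    print(name, score(query, pages[name]))
```

Note what the counting does not know: it has no idea who is asking or why, which is the first of the two outstanding problems Norvig names.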

Here is a simple three-step equation:

  1. Understand users more.
  2. Increase relevance.
  3. Serve them better.

Do you think this is a valid goal?

A bit of shameless self-promotion

Friday, July 20th, 2007

I have had the good fortune recently to come across some editors who were kind enough to allow me to rant a bit on their sites as well as my own. These brave souls have earned my eternal gratitude by permitting me to contribute the pieces below. Make sure to visit their sites to see what else they have to offer:

AltSearchEngines.com—three pieces to date:

  1. A Private Interview with Faroo—check out the fascinating world of P2P search.
  2. It’s Not Easy Being Green—search engines take environmental responsibility seriously.
  3. This Post Is Rated ‘R’—a look at how alternate search engines handle adult content.

I also wrote a piece for Email Insider called Want Passive Revenue? Give Your Product Away.

The blogosphere is one of the most collaborative environments I’ve ever worked in, and it’s really a delight. People contribute articles to each other, share links and comments, advise, encourage, and generally participate. There’s no one-way delivery of information here; it’s a community, and I love being a part of it.

Semantic Web Part II: The Interface

Wednesday, July 18th, 2007

Last week, I wrote a post describing the data behind the Semantic Web. The basic premise was that the Semantic Web breaks all of our information down into little bits that we can manipulate as much as we want. As I’ve said before, though, having that much data can be messy. In order for the Semantic Web to be able to serve us rather than the other way around, we need an effective interface that allows us to navigate intuitively through a sea of infinite information.

The choice paradox
We tend to think choice is good, and the more choice the better. Our experiences, however, tell us otherwise. In the book ‘Blink’, Malcolm Gladwell tells the story of an experiment conducted by Sheena Iyengar:

She once conducted another experiment in which she set up a tasting booth with a variety of exotic gourmet jams at the upscale grocery store Draeger’s in Menlo Park, California. Sometimes the booth had six different jams, and sometimes Iyengar had twenty-four different jams on display. She wanted to see whether the number of jam choices made any difference in the number of jams sold. Conventional economic wisdom, of course, says that the more choices consumers have, the more likely they are to buy, because it is easier for consumers to find the jam that perfectly fits their needs. But Iyengar found the opposite to be true. Thirty percent of those who stopped by the six-choice booth ended up buying some jam, while only 3 percent of those who stopped by the bigger booth bought anything. Why is that? Because buying jam is a snap decision. You say to yourself, instinctively, I want that one. And if you are given too many choices, if you are forced to consider much more than your unconscious is comfortable with, you get paralyzed. Snap judgments can be made in a snap because they are frugal, and if we want to protect our snap judgments, we have to take steps to protect that frugality.

Think of a situation in which the amount of consideration that had to go into a purchase seemed overwhelming—car-buying is a good one for many people. I remember the very first car I bought. I agonized over it for weeks. In the case of cars, the paralysis is for a different financial reason than it is for jam: you’re spending a good amount of money, and so you want to be sure to make the right decision. But the level of expertise required to make a ‘right’ decision in such a vast realm of choices leaves most of us feeling uninformed and uncertain about our ultimate selection. This, I suspect, is where ‘buyer’s remorse’ comes from: the inability to be sure that one option among millions is correct.

Not to have choice is anathema to most of us. Too much choice can paralyze us. Clearly, there’s a bell curve here, a point at which users and consumers have an optimum number of choices: enough that they can feel independent, not so much that they become despondent.

The paradox of choice will become increasingly important in the context of the Semantic Web. If all of the information on the Web is available as infinitely manipulatable data, how can we find the optimum point on the choice bell curve? I believe the answer lies in the interface.

Visible choices
A user interface is the means by which people can access the choices available to them. It essentially offers the user that part of the iceberg that is above water. By selectively revealing options that reveal more and more as the user progresses, an interface can deliver infinite choice in appropriate-size chunks for users to process.

Wizards (like mail merge wizards) are a great example of how a user interface can be used to make many choices seem manageable. I’ve been managing the development of a web tool that allows users to create promotional pdfs based on pre-set templates. The users have 36 possible templates to choose from, but our tool doesn’t show them 36 templates, or 18, or six. It shows them three. Then it asks if they want black & white or color. Then it asks which of three themes they want. Then it asks if they want a bold or minimalist design. Each of these questions is a ‘jam choice’, a snap decision from a small pool of options. Taken together, though, they lead the user comfortably down a path towards a confident decision.
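The arithmetic behind that wizard is worth a quick look. The Python sketch below is an assumption-laden illustration rather than the real tool: the step names and option labels are made up, and I am assuming the four questions are independent, in which case 3 x 2 x 3 x 2 multiplies out to the 36 templates.

```python
from itertools import product

# Hypothetical wizard steps: (step name, options shown at that step).
STEPS = [
    ("layout", ["layout A", "layout B", "layout C"]),
    ("color", ["black & white", "color"]),
    ("theme", ["theme 1", "theme 2", "theme 3"]),
    ("style", ["bold", "minimalist"]),
]

# Every template the wizard can reach, versus the most options on any one screen.
all_templates = list(product(*(options for _, options in STEPS)))
largest_single_choice = max(len(options) for _, options in STEPS)


def choose(answers):
    """Resolve one answer per step into a single template selection."""
    selection = {}
    for name, options in STEPS:
        answer = answers[name]
        if answer not in options:
            raise ValueError(f"{answer!r} is not an option for {name}")
        selection[name] = answer
    return selection


print(len(all_templates), "templates, never more than",
      largest_single_choice, "options on screen")
print(choose({"layout": "layout B", "color": "color",
              "theme": "theme 3", "style": "bold"}))
```

Four small decisions reach every one of the 36 outcomes, yet the user never faces more than three options at once.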

If all of us who create the Web (every site owner, developer, widget maker, and blogger) adapt our material to the standards-compliant RDF framework, the fertile entrepreneurial ground will lie in creating interfaces that allow users to access the richness that is the Semantic Web without losing their minds. Sites like Facebook encounter the challenge with thousands of personalization options; Google, even though they’ve got News and Blogs and Images along the top, is still, at its core, a single page with a simple text box. The lack of choices on the page allows users to feel comfortable with the infinite choices available in the query.

I had intended to touch on VortexDNA’s contribution to the interface in this post, but I suspect it’s getting rather long-winded, so I shall park it for another day. For now, how much choice do you want in your personal Web experience, and what do you think is the best way for providers to address varying preferences?

Google cookies die, rise from the ashes

Tuesday, July 17th, 2007

Peter Fleischer, Google’s Global Privacy Counsel, announced a new expiration policy for Google cookies on the company’s official blog today:

In the coming months, Google will start issuing our users cookies that will be set to auto-expire after 2 years, while auto-renewing the cookies of active users during this time period. In other words, users who do not return to Google will have their cookies auto-expire after 2 years. Regular Google users will have their cookies auto-renew, so that their preferences are not lost. And, as always, all users will still be able to control their cookies at any time via their browsers.

The announcement comes a month after the G-Monster capitulated to pressure from the European Article 29 Working Party and agreed to anonymize its server log files after 18 months, rather than the previously vague 18-24 months.

Well done, Google! As Fleischer points out, the purpose of cookies is to remember your preferences, like language and number of results per page. If you continually had to reset them, you’d get pretty frustrated pretty quickly. To my way of thinking, the search kings have struck a balance between serving the customer need to avoid unnecessary re-entry of basic info and serving the customer need for privacy. If you haven’t been back to Google for two years, your preferences have probably changed anyway.
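The policy Fleischer describes boils down to a sliding expiration window. Here is a minimal Python sketch of that idea. It is a simplified model and not Google's implementation: the cookie is an invented dictionary, "2 years" is approximated as 730 days, and the hypothetical `visit` function stands in for what the browser and server do between them.

```python
from datetime import datetime, timedelta

COOKIE_LIFETIME = timedelta(days=2 * 365)  # "2 years", approximated


def issue_cookie(now, prefs=None):
    """Create a cookie (modelled as a dict) that expires two years from now."""
    return {"prefs": dict(prefs or {}), "expires": now + COOKIE_LIFETIME}


def visit(cookie, now):
    """Return the cookie in effect after a visit at `now`."""
    if now >= cookie["expires"]:
        # Lapsed user: the old cookie is gone, so preferences start from scratch.
        return issue_cookie(now)
    # Active user: same preferences, expiration pushed out another two years.
    return issue_cookie(now, cookie["prefs"])


start = datetime(2007, 7, 17)
cookie = issue_cookie(start, {"language": "en", "results_per_page": 20})

regular = visit(cookie, start + timedelta(days=365))   # back within two years
lapsed = visit(cookie, start + timedelta(days=1000))   # away for longer

print(regular["prefs"], regular["expires"].date())
print(lapsed["prefs"], lapsed["expires"].date())
```

The regular visitor keeps their settings indefinitely because every visit resets the clock; only the visitor who stays away for the full window loses them.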

There is an inherent tension between making the experience smoother and protecting the privacy of the users, and the more options that people have to control the customization of their experience, the more they’ll be able to find their own happy mediums (media?). I remember visiting cnn.com several times over a period of a few months, and every time I’d get that stupid pop-up that asked me if I wanted the U.S. or international version. I gladly allow cookies that stop me from getting the same question over and over.

Fleischer makes another point in his post when he says that people can always control their cookies through their browser. I accept this as valid, but there is another side to it. Essentially, the message is that you can control your Internet experience only if you’re savvy enough to do so, which raises the question: whose responsibility is it to ensure your privacy?

It reminds me of the scene in The Hitchhiker’s Guide to the Galaxy in which Arthur Dent complains to Mr. Prosser that he wasn’t told about the plans to demolish his house to make way for a bypass:

“But Mr. Dent, the plans have been available in the local planning office for the last nine months.”

“Oh yes, well, as soon as I heard I went straight round to see them, yesterday afternoon. You hadn’t exactly gone out of your way to call attention to them, had you? I mean, like actually telling anybody or anything.”

“But the plans were on display…”

“On display? I eventually had to go down to the cellar to find them.”

“That’s the display department.”

“With a flashlight.”

“Ah, well, the lights had probably gone.”

“So had the stairs.”

“But look, you found the notice, didn’t you?”

“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.’”

I’m having a bit of fun at Google’s expense here, but my question is sincere. For me, and most likely for you, changing our cookie preferences is child’s play. For my mom? Or your Auntie Dolores? We might as well ask them to whip us up a working prototype of the space shuttle.

And, then again, I’m not sure that this is an issue. To be honest, even if you put ‘cookie settings’ right smack in the middle of the home page, there are lots of people who still wouldn’t begin to know what to do with them.

What do you think? Do you think that expiration after a two year lag in activity is enough to protect the interests of those who couldn’t find their cookie settings with both hands? Or should the default be privacy overprotection, with people being given the control and authority to reveal more as they see fit?