Archive for the ‘Personalized search’ Category

MySpace’s Personalized Ads: 80%=$5 billion

Thursday, September 20th, 2007

The New York Times yesterday published a piece by Brad Stone entitled, MySpace to Discuss Effort to Customize Ads. In it, Brad unpicks the vast potential of using profile information to personalize ad service.

…MySpace, the Web’s largest social network and one of the most trafficked sites on the Internet, says that after experimenting with technology over the last six months it can tailor ads to the personal information that its 110 million active users leave on their profile pages.

Executives at Fox Interactive Media, the News Corporation unit that owns MySpace, will begin speaking about the results of that program this week. They say the tailoring technology has improved the likelihood that members will click on an ad by 80 percent on average.

“We are blessed with a phenomenal amount of information about the likes, dislikes and life’s passions of our users,” said Peter Levinsohn, president of Fox Interactive Media, who will talk about the program at an address to investors and analysts at a Merrill Lynch conference in Los Angeles on Tuesday. “We have an opportunity to provide advertisers with a completely new paradigm.”

That’s rather a long quote, so I’m going to repeat the bit that jumped out at me:

…the tailoring technology has improved the likelihood that members will click on an ad by 80 percent on average.

80 percent!

Remember how much everybody freaked out when Panama was shown to improve click-throughs on Yahoo! ads by 10%?

Back then, Jonathan Thaw was saying a 10% increase in click rate could translate into a 5% increase in revenue growth. So, ummm, if 10% equals 5%, then (let me just check my math with Miss Teen South Carolina here), such as, 80% could equal 40%? Which, such as, works out to more than $2 billion at Yahoo! and nearly $5 billion at Google.

I don’t need Don Dodge to tell me that a) this is an overly simplistic translation, b) Google and Yahoo would have to have access to the same extensive bank of personal information that MySpace does for each searcher in order to make it work, and c) I should call him for my math questions and just appreciate Miss Teen South Carolina for her beautiful heart and shiny white smile. No matter which way you look at it, these are big numbers we’re dealing with here.
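For anyone who wants to check the back-of-the-envelope math above, here is a minimal sketch in Python. It assumes the relationship between click rate and revenue growth is perfectly linear, which is exactly the oversimplification already admitted to. The function name and the revenue figure are placeholders invented for illustration; they are not Yahoo!'s or Google's actual numbers.

```python
def projected_revenue_lift(click_lift, reference_click_lift=0.10,
                           reference_revenue_lift=0.05):
    """Linearly scale a click-rate lift into a revenue-growth lift.

    The defaults encode the quoted rule of thumb: a 10% increase in
    click rate could translate into a 5% increase in revenue growth.
    """
    return click_lift * (reference_revenue_lift / reference_click_lift)


lift = projected_revenue_lift(0.80)

# Hypothetical annual revenue, purely for illustration.
example_revenue = 10_000_000_000
extra = example_revenue * lift

print(f"{lift:.0%} lift on ${example_revenue:,} = ${extra:,.0f}")
```

Swap in whatever revenue base you like; the only thing the sketch claims is that, under the linear assumption, an 80% click lift maps to a 40% revenue lift.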

Brad gets into the privacy issues on page 2:

MySpace also plans to give its advertisers information about what kind of people its ads have attracted. “We want them to leave knowing more about their audience than when they came into the door,” said Arnie Gullov-Singh, vice president in the advertising technology group at Fox Interactive.

That is precisely the goal that worries some privacy advocates. They argue that users of social networks like MySpace and Facebook are not aware they are being monitored and that current ad-targeting is only the first step in what has become a huge arms race to collect revealing data on Internet users.

“People should be able to congregate online with their friends without thinking that big brother, whether it is Rupert Murdoch or Mark Zuckerberg, are stealthily peering in,” said Jeff Chester, executive director at the Center for Digital Democracy in Washington.

His organization will ask the Federal Trade Commission, during a planned hearing on Internet privacy in November, to investigate social networks for unfair and deceptive practices, he said.

This is definitely sensitive territory that MySpace is playing in, and they need to be careful. As I wrote yesterday, trust is worth more than gold on the Internet, and an 80% increase in clickthrough will mean nothing if there’s nobody there to see the personalized ads.

The reason this is so tricky is that MySpace members gave up the information in a context that had nothing to do with advertising.

In the NYT article, MySpace representatives were dismissive of the issue:

MySpace and Facebook executives argue that they are harming no one. They say that they are using information their members make publicly available, and contrast their ad targeting with efforts by Yahoo, America Online and Microsoft, whose advertising technologies follow people around the Web and try to deduce what they are interested in based on what sites they are looking at.

I think that’s a dangerous attitude to take, though. Given the potential reward, MySpace would be foolish to back off of a personalization program, but they need to get clear that retaining their audience trumps an increase in clickthroughs. A later comment indicates that perhaps they realize this:

Fox executives also say they are planning on letting users opt-out of the ad-targeting program on MySpace, though it means those members will see fewer relevant ads.

Smart move, but I think they’re doing themselves an injustice if they limit it to a simple ‘On-Off’ equation. We love to buy, but we don’t like to be sold to, and the primary difference is in how much control we have over the process. The more control MySpace gives its users over their ad targeting, the happier the users will be. Imagine a dashboard that offers me the ability to allow ads served based on the groups I belong to but not based on my individual conversations. Or that lets me indicate if I’m in ‘shopping mode’. (My boyfriend will tell you that I am permanently in shopping mode, but that’s just not true.)

VortexDNA’s aim is to facilitate highly relevant personalization with complete control and total privacy. That is the triad—those three things are all equally important. It’s clear from the above article that MySpace understands the importance of personalization. I hope they manage to balance the triangle as well.

What do you think about their personalization efforts? Are you a MySpace user? Would you like to see targeted ads or would you feel they were intrusive? I’m eager to get your thoughts.

Personalization is where it’s at for e-commerce

Tuesday, August 21st, 2007

Meanwhile, over on E-Commerce News… Joe Lichtman is making some very valid points about personalization and e-commerce. He points to a report from Gartner that says brick-and-mortar merchandisers are personalizing inventory for their locations, ‘down to sizing and color choices’, and asks a rather reasonable question:

What struck me most about this report was that merchandisers in the offline world are personalizing their strategies in spite of the serious constraints working against them: supply chain complexity, marketing costs, shelf-space limitations and the like. Yet retailers are doing it. So, why do online retailers — who face none of these limitations — still struggle to present a truly personalized, dynamic shopping experience for each and every shopper?

Why indeed? He provides one answer to his query pretty much immediately:

Freed from supply chains, printing costs and shelf space limitations, online retailers’ product catalogs have ballooned. With widely expanded catalogs comes the challenge of presenting the right products and merchandising messages at the right time to each shopper.

What he could have said is, ‘Freed from supply chains, printing costs and shelf space limitations, e-commerce retailers have tried to become all things to all people.’

The thing is, the Internet is pretty much the only place where a company can get away with trying to please everybody. Joe turns to Amazon as an example of a company that’s doing the right thing to tailor the customer experience:

Just as offline merchandisers are thinking in a customer-centric mindset, Amazon has created a complete customer-centric experience by building — in a sense — a micro-store for each and every customer… Everything about the Amazon experience is dynamic — not static — and becomes more personalized the more you shop.

Joe’s lesson is this: the only way you can please everybody is to please each person individually. A catalog of ten million items that you’re forced to wade through is not fun. A catalog of ten million items that pulls up the three items you’re likely to care about—now, that’s impressive.

What do you think? Should e-tailers be focused on delivering a dynamic, personalized experience? Or should it be up to each of us to find what we’re after?

Google’s making it personal, but not too personal

Wednesday, August 1st, 2007

Meanwhile, over on Yahoo! News… Eric Auchard is writing about Google’s unwillingness to tie together profiles across their massive collection of services for the purpose of serving up targeted ads.

What they are willing to do is use information that can be gleaned from within a given search session.

A user who types “Italy vacation” into a Google search box might see ads about Tuscany or cheap flights to Europe. Were the same user to subsequently search for “weather,” Google will assume there is a link between “Italy vacation” and “weather” and deliver ads tied to local weather conditions in Italy.

Given recent concerns about Google’s ever-growing collection of personal information, this is a pretty wise move. It essentially suggests to users that their unique data will be used to enhance services, while reassuring them that it won’t be given out to advertisers. Auchard comments on the tension between relevance and privacy:

In seeking patterns, Google’s plans involve tracking the various words typed in a given search session, as opposed to building a deeper user profile over time. The latter is known broadly as behavioral targeting, which has long been seen by many as the Holy Grail of the online ad business, but inevitably raises issues about personal privacy.
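To make the distinction concrete, here is a rough sketch of session-based targeting as opposed to a long-lived profile. This is not Google's implementation, which the article doesn't describe in detail; the `SearchSession` class and its methods are hypothetical names made up for this example. The point is only that the queries live for the length of one session and are then thrown away.

```python
class SearchSession:
    """Holds the queries for one search session; nothing persists afterwards."""

    def __init__(self):
        self.queries = []

    def search(self, query):
        """Record the query and return the keywords an ad system could match on."""
        self.queries.append(query)
        return self.ad_keywords()

    def ad_keywords(self):
        # Combine the words from every query in this session, in first-seen order.
        seen = []
        for query in self.queries:
            for word in query.lower().split():
                if word not in seen:
                    seen.append(word)
        return seen

    def end(self):
        # Session-based targeting forgets everything when the session ends.
        self.queries.clear()


session = SearchSession()
session.search("Italy vacation")
print(session.search("weather"))  # ['italy', 'vacation', 'weather']

session.end()
print(session.search("weather"))  # ['weather']
```

The second ‘weather’ search has no Italy attached to it, because the earlier session is gone. A behavioral-targeting system would keep that history around instead.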

I support Google’s stance. Many may see behavioral targeting as the Holy Grail of the online ad business, but if people aren’t willing to participate, how useful will it be? Users have infinite choice these days, whether it’s blocking pop-ups or switching search engines. What’s the point of a behavioral system that serves to alienate the audience?

I think the Holy Grail is a system that delivers information uniquely appropriate to the individual without compromising their privacy concerns. What would that look like?

  1. It would be transparent.
    Just because people have privacy concerns doesn’t mean they don’t want personalized content. It means people want full knowledge of what data is being stored, how it is being used, and who has access to it. Without transparency, users have no ability to take informed action.

  2. It would keep the users in control.
    Hand-in-hand with transparency is control: the ability to turn personalization off, access it more fully, or delete data. Choice is paramount in order for users to feel comfortable providing information. Also along the lines of this item is opting-in: users should be able to decide they want something and get it rather than being forced to turn something off they didn’t want to begin with.

  3. It would be useful.
    If a service isn’t useful, why would people want it? If its only use is to the advertiser and not the consumer, why should the consumer care? Combining this with the previous point, if I have to waste time turning off services that aren’t useful because they’re given to me automatically, it won’t really matter that they’re transparent and give me control; I’m still going to be annoyed.

VortexDNA’s relevance services fit those three criteria—was that why I chose them? No… it’s a chicken-and-egg, I think. We developed VortexDNA the way we did because we believe privacy to be paramount.

What have I missed here? What would you add to create the Holy Grail of personalized search?

Relevance challenges for search engines

Monday, July 23rd, 2007

Hamlet Batista put up an excellent piece on SEOMoz a few days ago, entitled 7 Reasons Why Search Engines Don’t Return Relevant Results 100% of the Time. In it, he describes, funnily enough, the seven reasons why search engines don’t return relevant results:

  1. Relevance is subjective
  2. Natural language searches
  3. Poor queries
  4. Synonymy
  5. Polysemy
  6. Imperfect performance
  7. Spam

Batista really does a superb job of exploring each of these reasons; I’m just going to additionally touch on a few of them here.

1. Relevance is subjective
Let’s start with the first one, ‘relevance is subjective’. Batista describes it this way:

You can do a search for ‘coffee’ in Canada and find Tim Horton’s website as the most relevant. Makes sense, as that’s the most popular coffee chain in Canada, but for somebody in Seattle, Starbucks might be the most relevant result. You can do a search for the ‘49ers’ and be looking for the football team, but a historian may be looking for research material on California. And you might even do a search today for ‘bones’ trying to find where to buy your dog a treat, but tomorrow you do that same search looking for an episode of the TV series ‘Bones’ that you missed the night before.

…So far the best approaches the search engines have come up with are the use of human quality raters and personalized search. The better the search engines profile the searcher, the higher the chances of producing relevant results. This method obviously raises a lot of privacy concerns.

At VortexDNA, of course, we take the concept of subjective relevance a step further than location or job description; we suggest, and have shown, that it is profoundly affected by the user’s core purpose and values.

He also suggests that personalization inevitably leads to privacy concerns—only true for methods that rely on tracking history and demographics. When values are used to calculate relevance, there’s no need to track search history or clickstream.

2. Natural language searches
Next on Batista’s list is the use of natural language in search queries:

A search engine, on the other hand, receives ‘who has smith as last name in chicago’ or ‘smith last name chicago’. The query is in natural language — our language.

Is it, though? When was the last time you spoke with a person and said, ‘Smith last name chicago’? I submit to you that we are far more demanding of our search engines than we are of any human being. Look at Batista’s examples under the previous point about ‘bones’ and ‘coffee’. Would you go to an information desk and ask, ‘Bones?’ When they’re put forth as search examples, though, we don’t question them; it’s in fact a highly plausible scenario for us to throw a word or two at a search engine and then be disappointed when it’s unable to disambiguate our queries.

That’s not natural language; it’s unreasonable expectations. It also leads into Batista’s next point:

3. Poor queries
His description of poor queries includes colloquialisms (like ‘sucker’ for vacuum) and misspellings. As I stated above, I think poor queries also include minimalist terms and odd syntax. We couldn’t expect a human being to know what we were after with those words, but we do hope for a machine to guess our intent.

6. Imperfect performance
As I said, I’m only going to touch on some of the seven, so we’ll skip synonymy and polysemy and go straight to imperfect performance. Batista says that the two criteria that define search performance are precision and recall:

Precision is a measure of how efficient the search engine is in returning only the relevant results for the search. The more irrelevant results, the lower the precision. Recall, on the other hand, measures how good the search engine is in returning all the relevant results. (Of course, this assumes the researcher knows how many relevant results there are.) The more relevant results missing from the search, the lower the recall.

Ideally, a search engine should identify all relevant documents without returning any irrelevant ones (100% precision and 100% recall). In practice, this has been proven to be impossible, as precision and recall are inversely proportional.

It sounds like Heisenberg’s Uncertainty Principle, which states that

…it is impossible to perfectly measure a particle’s position and velocity at the same time. The more accurately you measure a particle’s position, the more inaccurate your measure of its velocity, and vice versa.

It may sound strange that I’m citing quantum physics when we’re talking about search engine performance, but I call parallels where I see ‘em, thank you very much.

The point here is not only that it appears to be impossible to achieve perfect precision and perfect recall simultaneously, but also that the aim should be to find the optimum tension or balance between the two. At what point does declining recall produce diminishing returns for incremental increases in precision, and vice versa?

I have some additional questions about these measures; namely, how precision and recall can be defined when we already know that relevance is subjective (see point 1). They do, however, serve as valuable parameters for putting search improvement efforts in context.
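For readers who haven’t met these two measures before, here is a minimal sketch of how they are usually computed. The document IDs and the `relevant` set are invented for the example. Notice that somebody has to decide what goes in `relevant`, which is exactly where the subjectivity question comes in.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant results that exist
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgment: these four documents are the relevant ones.
relevant = {"a", "b", "c", "d"}

narrow = ["a", "b"]                                # cautious engine
broad = ["a", "b", "c", "d", "x", "y", "z", "w"]   # generous engine

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 1.0)
```

The cautious engine returns nothing irrelevant but misses half of what’s out there; the generous one finds everything but buries it in noise. That is the tension in miniature.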

I really appreciate Batista’s skill in describing these seven challenges, and I believe we’re only scratching the surface here. What do you see as the biggest challenge search engines face in delivering the results you want?

Making personalized search more transparent

Friday, July 13th, 2007

At SMX last month, Michael Gray made the following plea to Google:

Be clearer on the SERPs when a result is there because of personalized search and not a normal result. If they’re that much better then why not highlight them?

Good question! That’s what we do, after all. If you’ve just stumbled upon this blog, you may not know that mywebDNA is a Firefox plug-in that circles the two Google results most relevant to the user. Here’s what it looks like:

[Image: Google results]

There are, however, instances when someone might be logged into our system and we don’t circle relevant results. I certainly can’t speak for Google—I have no idea why they don’t highlight theirs—but I can tell you why we mightn’t.

Our technology was based on a hypothesis. The hypothesis is that our purpose and values can help us find more relevant search results.

This hypothesis came out of the understanding that who we are at our core impacts every other aspect of our lives.

When I say ‘who we are at our core’, I’m referring to purpose and values. As I described in an earlier post, my purpose in life is to be an uplifting presence. Yours might be to take care of animals, or to be financially comfortable, or to leave a legacy.

Because it was based on a hundred years of science, we had a pretty good idea that our hypothesis would hold up when people started to use the technology. But, of course, it would have been pretty shoddy of us to put up a plug-in and just say, ‘Trust us! This will work!’ First of all, we had taken what we had learned to another level, and we had to make sure that our hypothesis would hold up. Second, we had to make sure that our algorithms and technology would hold up—after all, it could have been possible for the hypothesis to be accurate and not the technology.

So we went through a validation phase. During this phase, we measured how likely people were to click on a mywebDNA-recommended link versus one that the technology didn’t recommend. This validation allowed us to quantify how much mywebDNA could increase the relevance of search results.

The thing is, though, that if our mywebDNA plug-in had just gone along, merrily circling results, the chances are people would have been more likely to click on the circled links. I mean, they just look more special. In order for our validation to hold water, we had to be able to show that we weren’t creating a self-fulfilling prophecy.

So some of the time we didn’t circle the link, and we measured whether users were more likely to click on a link that mywebDNA thought was relevant even if they didn’t know mywebDNA thought it was relevant.

We found that they were, which meant that mywebDNA was accurately able to predict whether a search result would be relevant to a particular user.
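In case the logic of that test is easier to follow in code, here is a small sketch of the comparison. It is not our actual validation code, and the log entries are made up purely for illustration; they are not our results. Each entry records whether the circle was displayed, whether the link was one mywebDNA recommended, and whether the user clicked.

```python
from collections import defaultdict


def click_rates(impressions):
    """Compute click rate per (circle_shown, recommended) group.

    impressions: iterable of (circle_shown, recommended, clicked) tuples.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for circle_shown, recommended, was_clicked in impressions:
        key = (circle_shown, recommended)
        shown[key] += 1
        clicked[key] += int(was_clicked)
    return {key: clicked[key] / shown[key] for key in shown}


# Made-up log covering only the "circle hidden" condition.
log = [
    (False, True, True), (False, True, True),
    (False, True, False), (False, True, False),
    (False, False, True), (False, False, False),
    (False, False, False), (False, False, False),
]

rates = click_rates(log)
hidden_recommended = rates[(False, True)]
hidden_other = rates[(False, False)]

# If recommended links still win when the circle is hidden,
# the circle itself can't be what is driving the clicks.
print(hidden_recommended > hidden_other)
```

The comparison that matters is the one inside the hidden-circle group, because that is the only place the ‘it just looks more special’ effect can’t creep in.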

I agree with Michael Gray that it makes sense for personalized search to be more transparent. Wouldn’t you want to know that this result is the one that you personally are likely to be interested in? Especially if you’ve found the technology to be accurate at serving up relevant content?

Would you rather know if you’re seeing personalized results? Or do you think it doesn’t much matter one way or the other?

A new definition of relevance

Monday, July 9th, 2007

I just left a comment on Danny Sullivan’s article about Facebook’s claim to be the most used people search engine on the web. My comment was about the difficulty of defining a new technology when people have no frame of reference for it, a challenge that we face regularly at VortexDNA.

The reason that we stumble with the definition is not necessarily because what we do is so impossible to understand. Like I said, it’s just that it’s outside the frame of reference. So all of the usual terms we would use instantly evoke meanings that are not appropriate to us.

Let’s start with the basics: VortexDNA is a recommendation engine. We are not a search engine.

Our technology can be integrated into search engines and ecommerce sites like www.NetFlix.com to deliver more personally relevant search results. So what do you think people think of when they hear that? According to Alex Iskold, they think of personalized recommendation (based on past behavior), social recommendation (similar users), item recommendation (items that go hand-in-hand), or any combination thereof.

Well, we are personalized. But we don’t base our recommendations on past behavior. There is a social recommendation aspect to what we do, but it’s not as straightforward as Alex paints it. Nor do we work on a direct item-to-item link.

We need a new frame of reference. I’m famous for being able to explain complicated things, but I don’t think I’ve done my job properly in this case. I’ll give it one more shot.

The VortexDNA relevance model
Imagine if every person, service, website, and object in the world was given a number, from 1 to 100.

You’ve got number 6.

In this imaginary world, people and things with the same number are relevant to each other. So if you do a search, everything will come up 6, because that’s what’s relevant to you. It doesn’t have to track your history though; it just needs to know that you’ve got number 6. That’s the personal recommendation side.

And how do the numbers get given out? Well, people who are 6 people just are, in the core of their being—it’s the most fundamental expression of what drives them. But it’s not as easy for things. So what happens is that things start to pick up the numbers of the people who are interested in them. If lots of number 6 people buy a book, it becomes a number 6 book. That’s the social recommendation side.

Now, if I were Amazon, and I didn’t know that you had number 6, but you bought three number 6 books, I might recommend you other number 6 books. It wouldn’t matter whether anyone had ever bought that particular combination of books before. That’s the item recommendation side.
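Here is the imaginary numbers world as a toy Python sketch. To be clear, this illustrates the analogy only and is not the VortexDNA algorithm. The ‘most common number wins’ rule is an assumption I’ve made to keep it simple, and the book titles and buyers are invented.

```python
from collections import Counter


def item_number(buyer_numbers):
    """A thing picks up the most common number among the people interested in it."""
    return Counter(buyer_numbers).most_common(1)[0][0]


def infer_person_number(purchased_item_numbers):
    """Guess a person's number from the numbers of the things they bought."""
    return Counter(purchased_item_numbers).most_common(1)[0][0]


def recommend(person_number, catalog, exclude=()):
    """Everything with the same number is relevant; skip what they already own."""
    return [title for title, number in catalog.items()
            if number == person_number and title not in exclude]


# Invented purchase history: book title -> numbers of the people who bought it.
purchases = {
    "Book A": [6, 6, 6, 12],
    "Book B": [6, 6, 40],
    "Book C": [12, 12, 6],
    "Book D": [6, 6],
}

# Social side: things take on the numbers of their buyers.
catalog = {title: item_number(buyers) for title, buyers in purchases.items()}

# Item side: we don't know your number, but you bought Book A and Book B.
bought = ["Book A", "Book B"]
guess = infer_person_number([catalog[title] for title in bought])

recs = recommend(guess, catalog, exclude=bought)
print(guess, recs)  # 6 ['Book D']
```

Nobody’s history gets tracked in order to make the match; all the sketch needs to know is the number.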

This explanation is kind of a childish-sounding attempt, and here is my request: tell me if I’m not making sense. Tell me if you have a better, simpler, more graspable way of conveying what it is our nascent technology does. Most importantly, tell me what you think it could do for you, or not do for you, in your capacity as consumer, SEO, web marketer, ecommerce site or search engine owner. I will be mighty grateful for any input you wish to offer.