Archive for the ‘Semantic Web’ Category

Web 3.0 searches for meaning

Saturday, March 17th, 2007

When I first started writing about Web 3.0 not so very long ago, I was surprised by the amount of ire people had towards the phrase ‘Web 3.0’. Tired of hype, sick of overblown promises of the Next Greatest Thing, faithful Web 2.0 users patrolled the Net, ridiculing or removing premature references to yet another sequel.

As Thomas Claburn reported in Information Week this past Thursday, Web 3.0 has won its most recent battle:

Up until last month, Web 3.0’s future was in doubt. Wikipedians were divided about the legitimacy of the concept, and those skeptical of the term deleted the Web 3.0 entry from the online encyclopedia five separate times during 2006. After this series of near-death experiences, the article was put under protection last October…

In February, a deletion review for the entry concluded, with the majority of Wikipedia contributors voting to accept the legitimacy of the term.

The watershed moment may well have come last November when New York Times reporter John Markoff legitimized Web 3.0 in an article that described the term as a movement to add meaning and structure to the vast amount of information on the Web.

Some may say that this argument is about semantics, not substance. Substance is created from semantics, though. Pulitzer Prize winner William Safire is certainly skilled and knowledgeable enough to write on just about any topic; he has chosen to spend the past 28 years writing the “On Language” column in the New York Times Sunday Magazine. I suspect that he, better than anyone, understands that the person who defines the words is the one who holds the cards.

I did look it up to see what Merriam-Webster had to say:

se·man·tic (si-’man-tik) adjective of or relating to meaning in language

What could be more substantial than meaning?

In the case of Web 3.0, people are irritated by the sheer cheekiness of it, coupled with the lack of a clear, tangible definition that everyone can grasp. Among the many positive responses to Stephen Baker’s post about Web 3.0 last October were some that were positively vitriolic, such as:

If people don’t stop using this useless marketing term: “Web 2.0”, let alone plugging a “newer, improved” Web 3.0 term, then the world will probably implode.

and

Don’t you think we could finish wrapping our heads around and implementing web 2.0, learning about its standards and methods before you go on jibbering about web 3.0?

What are the other options, though, to refer to web apps that successfully integrate relationships, allowing, as Claburn pointed out, the possibility that

someone querying a Star Wars database that supports semantic protocols could search for “Darth Vader’s son’s sister” and would find documents relating to Princess Leia, despite the absence of that specific phrase in any of the found documents?

The only one that seems to have gained any traction is Tim Berners-Lee’s ‘Semantic Web’, which may take off because it sounds intelligent. The issue with that particular turn of phrase is that some people don’t know what it means, and others—the ones who like to say, “Now you’re just arguing semantics,”—imbue it with a negative connotation.
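The Star Wars query Claburn describes comes down to following named relationships from one thing to the next rather than matching a phrase. Here is a minimal sketch of that idea in Python. The `TripleStore` class, the relation names, and the sample facts are my own illustration, not the API of any real semantic database.

```python
class TripleStore:
    """Toy store of (subject, relation, object) facts. Illustrative only."""

    def __init__(self):
        # Maps (subject, relation) to the set of objects linked that way.
        self._index = {}

    def add(self, subject, relation, obj):
        self._index.setdefault((subject, relation), set()).add(obj)

    def follow(self, start, *relations):
        """Walk a chain of relations outward from one starting entity."""
        current = {start}
        for relation in relations:
            current = {
                obj
                for subject in current
                for obj in self._index.get((subject, relation), set())
            }
        return current


store = TripleStore()
store.add("Darth Vader", "son", "Luke Skywalker")
store.add("Luke Skywalker", "sister", "Princess Leia")

# "Darth Vader's son's sister" becomes two hops through the facts.
result = store.follow("Darth Vader", "son", "sister")
print(result)
```

No stored fact contains the phrase “Darth Vader’s son’s sister”, yet the answer falls out of the relationships. Documents tagged with the resulting entity could then be returned to the searcher.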

Won’t you help us, Mr Safire?

Freebase connects, but don’t forget the users

Wednesday, March 14th, 2007

Last week, John Markoff had this to say in The New York Times:

A new company founded by a longtime technologist [Danny Hillis] is setting out to create a vast public database intended to be read by computers rather than people, paving the way for a more automated Internet in which machines will routinely share information…

On the Web, there are few rules governing how information should be organized. But in the Metaweb database, to be named Freebase, information will be structured to make it possible for software programs to discern relationships and even meaning.

For example, an entry for California’s governor, Arnold Schwarzenegger, would be entered as a topic that would include a variety of attributes or “views” describing him as an actor, athlete and politician — listing them in a highly structured way in the database.

A highly interesting concept, indeed. Certainly interesting enough to be picked up by Web guru Tim O’Reilly, who commented:

If Metaweb gets this right, this bottom up approach will build new connections between data, new categories and ways of thinking. It will likely be messy and contradictory for a while, but … they are building new synapses for the global brain.

One of the beauties of the Freebase proposition is its simple complexity. Because computers do all the back-filling, the database grows much faster than it would if every link had to be entered by hand. I don’t have to go to the New York page and say I’m from there and then to the New Zealand page and say I live there. I go to my page, say I’m from New York and live in New Zealand, and, bada bing bada boom, all three pages get updated.
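To make the back-filling concrete, here is a rough sketch of how one statement on my page could update the other two. I am assuming each relation has a named inverse; the `TopicGraph` class and the relation names are hypothetical and are not Freebase’s actual schema or API.

```python
# Hypothetical relation names, each paired with the inverse to back-fill.
INVERSE = {
    "from": "birthplace_of",
    "lives_in": "home_of",
}


class TopicGraph:
    """Toy topic pages where one assertion also updates the linked page."""

    def __init__(self):
        # page name -> relation -> set of linked page names
        self.pages = {}

    def assert_fact(self, subject, relation, obj):
        self._link(subject, relation, obj)
        inverse = INVERSE.get(relation)
        if inverse is not None:
            # The back-fill: the other page learns about the subject too.
            self._link(obj, inverse, subject)

    def _link(self, page, relation, target):
        self.pages.setdefault(page, {}).setdefault(relation, set()).add(target)


graph = TopicGraph()
graph.assert_fact("Me", "from", "New York")
graph.assert_fact("Me", "lives_in", "New Zealand")

for name, facts in graph.pages.items():
    print(name, facts)
```

Two statements made in one place leave three pages carrying the information, which is the whole appeal.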

The other, more important, beauty is Freebase’s recognition that relationships are as important as, if not more important than, the things themselves. We as humans can only hope to understand the world through relationships. Think about how you describe, well, ANYTHING: “She’s Joe’s ex-girlfriend.” “The theater is two blocks west of State Street.” “This new database is better than the old ones.”

Relationships, always relationships. Even when we think we’re perceiving things independently, we’re comparing. “That person is skinny.” Relative to whom? “The movie was great.” In whose opinion?

If a tree falls… my friends, in the polarized field of our existence, we can only be in relation to something else.

So, yes, I applaud Freebase and Danny Hillis. As with Wikia’s search effort, though, Hillis remains focused on using the user in order to create the best possible database. The focus is on the database.

But the user is the reason for our work. The user is who benefits from the database. The user is at the center of the semantic web.

The focus is on the database because, as complex as a database is to create, building one is still easier than understanding what drives a human being. That understanding is, of course, the unique value proposition of VortexDNA.

I’m sure Danny Hillis and Jimbo Wales and Tim O’Reilly, none of whom is any slouch in the intellectual department, can see the possibilities of adapting search to the user. Between all of us, we’re nibbling at the edges of the semantic Web, nosing at the door. In the warped space-time continuum that is Internet development, it will be mere moments before our visions are reality.

Let’s just make sure we bring the user with us.

BBC and IBM open the door to the Semantic Web

Tuesday, March 6th, 2007

Yesterday, the Guardian reported this:

The BBC has struck a partnership deal with IBM to develop “web 3.0” technology, starting with a video search system for CBeebies and CBBC programmes, MediaGuardian.co.uk can reveal.

…The idea is that the system being developed with IBM, called Marvel, will deliver a mass of relevant images and videos when content is searched.

What’s this? Intelligent search of images? We all know that a computer can’t tell the difference between a picture of a dog and a picture of a rabbit.

Not true, claims IBM, which first announced the Marvel technology in late 2004. CNET described how the image search works:

Marvel largely relies on a technology called support vector machines, pioneered by Vladimir Vapnik at AT&T about a decade ago. In this type of artificial intelligence, a computer learns to assign the equivalent of a yes or no value to a piece of data. In other words, if the computer is supposed to distinguish between an indoor or outdoor scene, trees in a shot could well prompt the computer to put the clip in the outdoor bucket.
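For the curious, the yes-or-no idea in that description can be sketched in a few lines of Python. This is a simplified linear support vector machine trained by subgradient descent on the hinge loss, not IBM’s Marvel code, and real systems typically use kernels and far richer features. The two features (the fraction of green pixels and the fraction of sky-blue pixels in a frame) and all the numbers are invented for illustration.

```python
def train_linear_svm(samples, labels, epochs=2000, lr=0.01, reg=0.01):
    """Fit weights w and bias b by subgradient descent on the hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            # L2 regularization shrinks the weights slightly on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Only samples inside the margin (y * score < 1) move the boundary.
            if y * score < 1.0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (outdoor) or -1 (indoor) for one feature vector."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Invented training data: (green fraction, sky-blue fraction) per frame.
outdoor = [(0.6, 0.3), (0.5, 0.4), (0.7, 0.2), (0.4, 0.5)]
indoor = [(0.05, 0.0), (0.1, 0.05), (0.0, 0.1), (0.15, 0.0)]
samples = outdoor + indoor
labels = [1] * len(outdoor) + [-1] * len(indoor)

w, b = train_linear_svm(samples, labels)

leafy_frame = predict(w, b, (0.65, 0.3))   # lots of green and sky
office_frame = predict(w, b, (0.02, 0.03)) # almost none of either
print(leafy_frame, office_frame)
```

The trees-mean-outdoors intuition shows up as positive weights on the green and sky features: the more of either a frame has, the further it lands on the outdoor side of the boundary.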

Do we believe them? It’s early days yet, but there are lots of people heading in that direction. Amazon’s Mechanical Turk is using humans to accomplish tasks like distinguishing outside trees from trees in pots.

This is what we’re after, what the Semantic Web is about. Intelligence. Search that understands you. Search that thinks the way you think. After all, the only way the utility of available information can even approximate keeping up with its growth is by making it more and more efficient to access the stuff you really need.

That’s where MyWebDNA comes in, too: same direction, different approach. It overlays a mathematical profile of your core values on a Google search page to circle the two results most relevant to you. Because the technology behind it is a universal measure of relevance, it could be developed to integrate with video search as well. Greater ability to categorize items + personalization = Semantic Web.

That, my friends, is where we’re headed. That’s the future of the Internet: responsive, intelligent, connected. Make no mistake, it will happen, and, when it does, we’ll all wonder how we ever survived without it.