Archive for the ‘Internet’ Category

freebase - let the community build a global ontology

Friday, June 8th, 2007

With the help of Thomas I got invited to the freebase alpha program. In one sentence: freebase is the semantic extension of Wikipedia. That means freebase does not only have “objects” like Wikipedia; it additionally has a hierarchy for these objects, relations between the objects, and common object attributes.

For example, freebase “knows”: Austria is a country, Vienna is a city. A city is contained in a country - in this case, Vienna is contained in Austria. In Austria the main language is German. Guess what language is spoken in Vienna…
freebase knows it! Got the point?
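
A minimal sketch of that kind of inference, assuming a toy in-memory model. The dictionaries and the function name are hypothetical illustrations, not freebase's real schema or API:

```python
# Hypothetical toy data; NOT freebase's real schema or API.
TYPES = {"Austria": "country", "Vienna": "city"}
CONTAINED_IN = {"Vienna": "Austria"}
MAIN_LANGUAGE = {"Austria": "German"}


def main_language(place):
    """Walk up the containment chain until a place with a known language is found."""
    seen = set()  # guards against accidental containment cycles
    while place is not None and place not in seen:
        if place in MAIN_LANGUAGE:
            return MAIN_LANGUAGE[place]
        seen.add(place)
        place = CONTAINED_IN.get(place)
    return None


print(main_language("Vienna"))  # German
```

Nobody stored a language for Vienna; the answer falls out of the "contained in" relation plus the attribute on Austria.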

So, setting up the “Ontology” on freebase is a little bit like object-oriented analysis. And it is a huge job to do for all the world’s knowledge!

As Tim O’Reilly wrote:

But hopefully, this narrative will give you a sense of what Metaweb is reaching for: a wikipedia like system for building the semantic web. But unlike the W3C approach to the semantic web, which starts with controlled ontologies, Metaweb adopts a folksonomy approach, in which people can add new categories (much like tags), in a messy sprawl of potentially overlapping assertions.

This messy sprawl of assertions is possibly freebase’s advantage - and with the sprawl comes the support of all the editors, the “collective intelligence”. The other (MIT) approach is very academic (surprise, surprise) and hard to follow (RDF, GRDDL, SPARQL, ITL, Microformats…). A lot of work goes into the representation of an ontology - but it seems to be time to just start and define some interweavings. Who is actually using Dublin Core now?

Once the data is collected and all the relations are set, how can we use freebase?

freebase offers an open and free-of-charge API (MQL, a JSON-based query language, via HTTP) to use the data under a Creative Commons license wherever you want.
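
To give a feeling for what such a query looks like, here is a rough sketch. MQL works by example: you send a JSON object with the known values filled in and the unknown ones left empty. The property name `containedby`, the `{"query": ...}` envelope and the endpoint URL below are assumptions for illustration only; check the MQL documentation for the real names.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_mql_request(base_url, query):
    """Wrap a query-by-example dict in an envelope and URL-encode it.

    The envelope shape and parameter name are assumptions, not a verified API.
    """
    envelope = {"query": query}
    return base_url + "?" + urlencode({"query": json.dumps(envelope)})


# Known values are filled in; the empty list asks the service to fill it in.
query = {"name": "Vienna", "containedby": []}  # property name is hypothetical

# Placeholder endpoint, not the real service address.
url = build_mql_request("http://example.invalid/mqlread", query)

# Decode again to show what would travel over HTTP.
sent = json.loads(parse_qs(url.split("?", 1)[1])["query"][0])
print(sent)
```

The sketch only builds the request; it deliberately does not call any server.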

Questions:

  • freebase is driven by the company Metaweb; what happens to the data when something happens to Metaweb, e.g. getting bought by Google?
  • when lots of pages rely on the freebase service, how will they pay the traffic costs?

Yahoo! at a loose end

Thursday, June 7th, 2007

About the new Yahoo! Panama API.

When I read the interview with Yahoo!’s Dan Broberg, Managing Director of Sales Technology, on Alan’s Blog, I got the feeling that Yahoo! is running after Google.

For “advanced” support, we are not charging all that much, $2000 per month. This level provides our partners with more support, more dedicated Yahoo engineering resources should they need support. At the advanced and elite levels, we’ll make specific commitments regarding uptime, and provide our partners with in our product roadmap process.

They sell support for using their API - that’s not a 2.0 approach! That’s old IT business. A 2.0 approach would be to have a FAQ + a forum + a wiki + a developer blog - and all of that for free.

APRK: Google has taken a different approach. They don’t charge for support, but rather charge a nominal fee for each API action, 25c per thousand API tokens.

DB: We’re interested in best serving our advertisers, and we differentiate from our competition when it makes sense.

Is this eloquent marketing speak? Well, sure, it makes sense for the selling company to charge a monthly fee instead of charging per action.
But to be fair, using the API without support is free of charge :-)

What will this change in the next few months? I can’t see any effects now, especially since Google bought YouTube, DoubleClick and FeedBurner lately. It takes more to change the commercial search market than a “free API” (however rock-solid it may be). Is it a step in the AdWords direction? But AdWords is free as well, and I am not sure I would earn more from Yahoo! ad clicks than I do from Google’s.

Do I?
Is Yahoo! on the loose?

By the way, what are they doing with their directory?

Google Gears makes me wonder

Thursday, May 31st, 2007

Today I read a lot about the new “Google Gears” project. And it makes me wonder what Google is doing here. Google Gears is a mix of a runtime environment and an API for developing applications that run online as well as offline.

At first impression it reminds me of old battles like those with “Lotus Notes”, where one could synchronize online and offline data. Simple for e-mails, but horrible for e.g. documents and complicated database content.

So why is Google joining this business? My only explanation is that they want to support their online application market (like Google Docs and Picasa) by adding strong offline and synchronization capabilities. Making web applications run offline in a browser is one more step away from the single-user desktop PC - and from the OS it needs.

With their runtime environment, Google adds something like an Apache and a MySQL server to your PC. To synchronize the data on your local PC with your host system, Google suggests online synchronization via Ajax techniques (see here).

I will examine this in detail later because it is interesting. But for now, I cannot see how Google will handle the more complicated problems of synchronization (like missed merges, concurrency problems, huge amounts of data …).
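
To make the concurrency problem concrete, here is a minimal sketch of one common way to at least detect a conflict: every record carries a version number, and an offline edit is only accepted if it was based on the version the server still has. This is a generic technique with hypothetical names, not the Google Gears API and not a claim about how Google does it.

```python
def sync_record(server_value, server_version, local_value, base_version):
    """Try to push an offline edit; return (value, version, conflict).

    base_version is the server version the offline copy was made from.
    """
    if base_version == server_version:
        # Nobody else changed the record in the meantime: accept the edit.
        return local_value, server_version + 1, False
    # The server moved on while we were offline: keep the server copy
    # and report a conflict so a human (or a merge step) can decide.
    return server_value, server_version, True


# Offline edit based on version 3, server still at version 3: accepted.
print(sync_record("draft A", 3, "draft A, edited offline", 3))

# Offline edit based on version 3, server already at version 5: conflict.
print(sync_record("draft B", 5, "draft A, edited offline", 3))
```

Detecting the conflict is the easy part; merging two diverged documents or database rows is where it gets horrible.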

Google - the knowledge and the power

Saturday, May 12th, 2007

Google is extending their power in a very new way. They redefine searching the internet by valuing pages depending on how many people visit the page, where they come from and how long they stay.

How do they do this? The key is their Analytics knowledge. Because Google Analytics is the very best website analysis tool - best in content, best in drilling down, best in installation effort, best in cost, best in speed - everybody uses it. And because every webmaster is using Google Analytics, Google knows where web users go.

By using simple statistics Google can rely on this knowledge with, let’s say, at least 95% confidence. They know how many websites exist, and they know how many websites are using Analytics. The rest is simple statistics.
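
What “simple statistics” could mean here, as a rough sketch: treat the tracked visits as a sample and compute a 95% margin of error for a site's share of visits. The numbers are made up, and the big assumption is that the tracked visits are a random sample of all visits, which sites using Analytics certainly are not.

```python
import math


def margin_of_error_95(share, sample_size):
    """95% margin of error for a proportion (normal approximation).

    Assumes a simple random sample, which is a strong simplification.
    """
    return 1.96 * math.sqrt(share * (1.0 - share) / sample_size)


# Hypothetical numbers: 2,000 of 10,000 tracked visits go to one site.
share = 2000 / 10000
margin = margin_of_error_95(share, 10000)
print(f"share of visits: {share:.3f} +/- {margin:.5f}")
```

The larger the tracked sample, the smaller the margin, which is why knowing how many sites run Analytics matters.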

So, while all the SEOs are staring at PageRank, Google is redesigning their search algorithm. Important is what attracts people, valuable is what keeps them staying a long time, sustainable is what makes them come back. It will be a little bit like the yellow press.

In the end, I think content that is valuable for the masses will win the race to the Google Top Ten. What is still necessary to make it perfect for Google is some semantic knowledge. I am eager to see Google’s next “free-of-charge” product.

In any case the losers will be MSN, Yahoo, AltaVista… they don’t have the Analytics, so their search results will stay miserable.

Google counts redirected links

Tuesday, May 1st, 2007

After registering for Google’s Sitemaps program, I was surprised to find indirect links to my pages in the external links index. As you know, the old “link:www….” syntax has been disabled by Google (it does not show all the database secrets anymore) because of too much abuse by “who-knows-who”.

So, if you want to know which sites are really linking in, you have to subscribe to the Sitemaps program. I have lots of indirect links (links that work via an HTTP redirect, for example for click counters) to my open source projects in software catalogs. I never thought they would count for anything. But surprise, they are in the Google database.
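
A small sketch of what “counting a redirected link” could look like: follow the redirect chain of a click-counter URL and credit the link to the final target. The URLs and the redirect table are made up, and this is only an illustration, not a description of how Google actually handles it.

```python
def resolve(url, redirects, max_hops=10):
    """Follow a table of HTTP redirects and return the final target URL."""
    hops = 0
    # max_hops keeps a redirect loop from running forever.
    while url in redirects and hops < max_hops:
        url = redirects[url]
        hops += 1
    return url


# Hypothetical click counter in a software catalog.
redirects = {
    "http://catalog.example/click?id=42": "http://myproject.example/",
}

print(resolve("http://catalog.example/click?id=42", redirects))
```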

Hot question: Do they count in the PR calculation?