Thursday, 25 February 2010

Semantic Search Engine: Inbeta... as in "not Alpha"?

Through a discussion on LinkedIn about real-life examples of semantic search, I was pointed to the existence of Inbeta, a company with a curious name, because it suggests that the company has "beta" status. I can't imagine what that says about their offerings.
But now for the offerings of the company. On their product page they have many products listed.
Of course the first one caught my eye, because a "semantic search engine" is something that everybody dreams of. Imagine a search engine that gives you insight and context regarding the user's query in relation to the information at hand, and maybe also to external resources, by using the semantic relations between pieces of information...

But wait... Before you think I found the holy grail of search, the sentence
"Natural Language: user will not need to search for keywords anymore, our Semantic Search understands the aim of every search query and suggests results that are relevant, thus increasing cross-selling and saving customer care costs"

brought me back down to earth.

This proposition of using natural language as query input and giving back relevant results based on the combination of words that most likely exist in the available search index is something that has been around for years. Autonomy has marketed that concept under the name Meaning Based Computing. It all revolves around the weights of terms within documents, relative to the words in the entire index (corpus), and matching the queried words against these calculated figures.
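The idea of term weights within a document, relative to the whole corpus, can be illustrated with a plain TF-IDF ranking. This is only a minimal sketch of the general technique, not Autonomy's actual algorithm; the function names, the whitespace tokenizer and the three sample documents are assumptions made up for the example.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()

def build_index(docs):
    # Term counts per document, plus the number of documents each term occurs in.
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    return term_counts, doc_freq

def rank(query, term_counts, doc_freq):
    # Score each document by the summed TF-IDF weight of the query words it contains.
    n_docs = len(term_counts)
    scores = {}
    for doc_id, counts in term_counts.items():
        total_terms = sum(counts.values())
        score = 0.0
        for word in tokenize(query):
            if word in counts:
                tf = counts[word] / total_terms          # weight within the document
                idf = math.log(n_docs / doc_freq[word])  # rarity within the corpus
                score += tf * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical mini corpus.
docs = {
    "d1": "snow fell on the mountain all night",
    "d2": "the search engine indexed the document",
    "d3": "skiing on fresh snow is the best",
}
term_counts, doc_freq = build_index(docs)
ranking = rank("where can I find fresh snow", term_counts, doc_freq)
print(ranking)
```

The "natural language" query is simply broken into words. Words that are absent from the index contribute nothing, and rare words such as "fresh" weigh more than common ones, which is why d3 ranks above d1 here.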

For a serious search engine I regard this technique as almost a must-have.

But, back to the semantic side of this... Where is it?

If you want a good example of what semantics can do within a search application, take a look at http://www.freebase.com/view/en/barack_obama.

It has everything to do with the context of the concepts that can be derived from a query. People have roles and jobs, names can be linked to artists, historical data etc.

Tuesday, 23 February 2010

Query-time JOIN operator

Everyone who is active in the information access business knows that it is sometimes necessary to combine the data from two result sets into one.

Example:
The main search focuses on finding information within a document. A document can have relations with many other data types, such as geographical data or authors. Authors can have metadata themselves, like age, hobbies etc.

Now let's say you want to find documents that contain the keyword "snow" and that are written by authors who have the hobby "skydiving", or you want to show the hobby of the author of a book in the result list.

To run this search in a search engine that cannot combine the two types of information, you have two options:
1. Flatten the data. This means that all the information that can be related to a document must be indexed along with that document. The fact that author X has the hobby skydiving must then be stored with every document whose author is X, even though we already know it. This can lead to a dramatically expanding index.
2. If you want to show information from another record set, make one extra query for each result in your main result set (documents and authors) to find the hobby of an author.
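Both workarounds can be sketched with small in-memory collections. This is a hedged illustration only: the `documents` and `authors` data, the field names and the `find_author` helper are hypothetical and do not represent any vendor's API.

```python
# Hypothetical source data: two separate record sets.
authors = {
    "X": {"hobby": "skydiving"},
    "Y": {"hobby": "chess"},
}
documents = [
    {"id": 1, "text": "snow in the alps", "author": "X"},
    {"id": 2, "text": "snow and ice", "author": "X"},
    {"id": 3, "text": "snow chains", "author": "Y"},
]

# Option 1: flatten at index time, copying the author's hobby into every document.
flat_index = [
    dict(doc, author_hobby=authors[doc["author"]]["hobby"]) for doc in documents
]
flat_hits = [
    d["id"] for d in flat_index
    if "snow" in d["text"].split() and d["author_hobby"] == "skydiving"
]
# The hobby of author X is now stored once per document written by X.
copies_of_x_hobby = sum(1 for d in flat_index if d["author"] == "X")

# Option 2: keep the record sets separate and do one extra lookup per result.
lookup_log = []

def find_author(name):
    # Stands in for a separate query against the author index.
    lookup_log.append(name)
    return authors[name]

main_results = [d for d in documents if "snow" in d["text"].split()]
enriched = [(d["id"], find_author(d["author"])["hobby"]) for d in main_results]

print(flat_hits, copies_of_x_hobby)
print(enriched, len(lookup_log))
```

With three hits the second option already costs three extra queries, and author X is looked up twice. The first option avoids those queries, but only by storing the same hobby in every document by that author.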

In this day and age we are trying to make information more accessible and useful by showing the relations that search results have with other types of information, so that users get more insight.
Especially within BI applications this functionality is needed, because the systems that have to be connected are very diverse and the data cannot always be flattened, either because of that diversity or simply because it would mean that lots of information has to be duplicated...

The query-time JOIN function is a very powerful way to make this possible.
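Conceptually, a query-time JOIN keeps the record sets separate and combines them on a shared key while the query runs. The sketch below shows that idea in a few lines. It is an assumption about the general principle, not the way Attivio or Exalead implement it, and the data and the `query_time_join` function are hypothetical.

```python
# Hypothetical record sets, indexed separately and without duplication.
authors = {
    "X": {"hobby": "skydiving"},
    "Y": {"hobby": "chess"},
}
documents = [
    {"id": 1, "text": "snow in the alps", "author": "X"},
    {"id": 2, "text": "snow and ice", "author": "X"},
    {"id": 3, "text": "rain in spring", "author": "X"},
    {"id": 4, "text": "snow chains", "author": "Y"},
]

def query_time_join(documents, authors, keyword, hobby):
    # Step 1: query the author record set once and keep the matching join keys.
    matching_authors = {
        name: meta for name, meta in authors.items() if meta["hobby"] == hobby
    }
    # Step 2: query the documents and keep hits whose author key matched in step 1.
    results = []
    for doc in documents:
        if keyword in doc["text"].split() and doc["author"] in matching_authors:
            results.append({
                "id": doc["id"],
                "author": doc["author"],
                # Joined in at query time, never stored with the document.
                "author_hobby": matching_authors[doc["author"]]["hobby"],
            })
    return results

joined = query_time_join(documents, authors, keyword="snow", hobby="skydiving")
print(joined)
```

Each record set is queried once, the hobby lives in one place only, and the result list can still show the author's hobby next to every document.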

Not many search vendors have this function in their products. I know that Attivio and Exalead are capable of doing this.

Friday, 12 February 2010

Misconception on Google's revenues

In the article "Is Google moving too far (from search) too fast?" there is a misconception:

That expansion has some analysts wondering whether Google is in danger of losing focus on what made it such a profitable company, even as those same analysts say it can't rely on search as its only avenue for making money. Right now, Google relies on search for 95% of its revenue, according to Karsten Weide, an analyst with IDC.


Google does not rely on search for 95% of its revenue. They rely on advertisements (AdWords / AdSense) around their search product on the internet. They do not sell their search service; they sell space for advertisements.
Of course they are trying to get into the enterprise, where they will be making money with their search product, the Google Search Appliance (GSA). But for now, those sales are just a fraction of the rest.

Wednesday, 10 February 2010

Autonomy acquires MicroLink

Yesterday the guys from New Idea Engineering reported a rumour about a "large search vendor" buying a "partner" that does big business within the intelligence industry...

Today the Guardian reports that Autonomy has acquired MicroLink.

MicroLink is a highly valued Autonomy partner that only two years ago was "partner of the year" because of its large revenues from selling IDOL licenses. MicroLink is the partner that implements Autonomy (IDOL) software within American intelligence organizations.
Big business for Autonomy and strategically important.

The question about the acquisition is: "Why, why, why?".

This move is not in line with the acquisitions of Interwoven or Zantaz. Those companies and their technologies added a complete suite of functionality that let Autonomy penetrate markets other than enterprise search.

Could it be that Autonomy wants to have more influence on the implementations at those important clients? Or could it be that MicroLink has some brilliant technological invention?

Tuesday, 9 February 2010

Lucid Imagination launched new website

A year after Lucid Imagination launched its activities, the company is ready for a face-lift.

The company has made a name for itself in marketing, implementing and supporting open source software based on Solr/Lucene.

For what it's worth... the new site is less "techie". The former homepage had a lot of content derived from blogs and Twitter accounts. They cut most of those feeds. A good choice in my opinion, because all the discussions and "RE:" messages on the homepage didn't add any value to it.

The company is clearly positioning itself as a separate entity with its own face and stories, apart from the Lucene and Solr community. They are now focusing more on the business value of search-based solutions and have added new content to the site that proves it.

Rumours from NIE

Today I found a blog post in my feeds from New Idea Engineering (one of my favorite search consulting companies) stating that they had heard of a search vendor that wants to acquire a consulting firm for its marketing channel.
This is an example of the way search vendors (like Autonomy) are broadening their activities beyond making and selling dedicated software. The vendors are starting to realize that the search proposition alone is not going to get them more market share.
They need to start upselling and incorporating more knowledge of their customers and the added value that partners provide.