Thursday, 11 March 2010

Excellent clarification on semantic search

Today I received an update on one of the discussions on semantic search on LinkedIn. Charlie Hull put up an excellent example of how semantic search works. This has to do with the capabilities of the search technology used in a specific situation, but also with the fact that a search application has to engage in a dialog with the user to assess his meaning or context. That dialog matters because most users use just 1 to 3 words to formulate a query, and there is not much you can do with such a query on the first try. But the search application has to pick up on those keywords and try to make something of them.
The next step is to ask the user what he means.

Semantic search technology - does it actually exist?

Started by Charlie Hull

At Expert System we have been building semantic search systems for 20 years. Here is what we learned in serving hundreds of corporate customers. A semantic search system must establish and store the CONTEXT of content. Then you need an interface to choose the CONTEXT you would like so a match can be made.

Establishing CONTEXT means the following processes must be followed: 1) word morphology (e.g. stems), 2) word roles (e.g. nouns, verbs, etc.), 3) word logic (e.g. subject-verb-object reduction), and 4) sense disambiguation (e.g. assigning a definition to each word based on the best fit from the available alternatives and in the context of the rest of the sentence(s)). All 4 of these methods require the use of a semantic network that is both broad (it covers the majority of the language to be used) and deep (it has many ways in which words relate to one another).
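To make step 4 a bit more concrete, here is a minimal sketch of sense disambiguation. It is not Expert System's method: it swaps in a simplified Lesk-style word overlap, and the `SENSES` table is a hypothetical, hand-made stand-in for a real semantic network.

```python
# Hypothetical toy sense inventory; a real semantic network is far
# broader and deeper than this hand-made table.
SENSES = {
    "bank": {
        "financial institution": {"money", "loan", "account", "deposit"},
        "river edge": {"river", "water", "shore", "fishing"},
    },
}

def disambiguate(word, sentence):
    """Pick the sense whose cue words overlap most with the sentence."""
    context = set(sentence.lower().split()) - {word}
    candidates = SENSES.get(word, {})
    if not candidates:
        return None  # word is not in the toy inventory
    # Simplified Lesk-style scoring: count shared words per sense.
    return max(candidates, key=lambda sense: len(candidates[sense] & context))

print(disambiguate("bank", "she opened an account at the bank"))
print(disambiguate("bank", "we went fishing on the river bank"))
```

The first call picks the financial sense and the second the river sense, purely from the other words in the sentence. Note that this overlap counting is itself a heuristic, so it only illustrates the idea of "best fit from available alternatives".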

With the above approach you will reach precision ("accuracy") and recall ("completeness") in search beyond the 80% mark. With further customization a 90% mark is easily achieved. Systems that rely on statistical / heuristic methods typically fall far short of these benchmarks, since statistical / heuristic methods cannot fully establish logic and disambiguation.
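For readers who want the two measures spelled out, here is a small sketch using the standard definitions: precision is the share of returned documents that are relevant, recall is the share of relevant documents that were returned. The document IDs are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 5 results returned, 5 documents truly relevant,
# 4 of them in common.
retrieved = ["d1", "d2", "d3", "d4", "d5"]
relevant = ["d1", "d2", "d3", "d4", "d6"]
print(precision_recall(retrieved, relevant))  # (0.8, 0.8)
```

So "beyond the 80% mark" on both measures means that at least 4 out of 5 results are relevant and at least 4 out of 5 relevant documents are found.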

Finally, the interface must be constructed in a way that allows the user to tell the system what CONTEXT the query is in. Full natural language questions using the above methods can do this automatically. But the reality is we live in a world of 1-3 word queries. So allowing the user to select the word sense of one or more of the query words gives the system much more "to chew on" and is not generally an intrusion for the user. Similar user interface interactions include showing clickable category, domain, people, place and organization results from a search, showing lists of semantic triples (subject-predicate-object) from which to choose, etc. All of these are at most 1-2 more clicks than a normal keyword-based search but improve the experience immensely.
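A rough sketch of the "select the word sense" interaction could look like this. Everything here is assumed for illustration: the `DOCS` index, its sense labels, and the `sense_options` and `search` helpers are hypothetical, and a real system would fill in the sense labels through disambiguation at indexing time.

```python
# Hypothetical toy index: each document records which sense of an
# ambiguous word it uses.
DOCS = [
    {"id": 1, "title": "Jaguar habitat in the Amazon", "senses": {"jaguar": "animal"}},
    {"id": 2, "title": "Jaguar XF road test", "senses": {"jaguar": "car"}},
    {"id": 3, "title": "Big cats of South America", "senses": {"jaguar": "animal"}},
]

def sense_options(word):
    """Senses the interface can offer the user for a query word."""
    return sorted({d["senses"][word] for d in DOCS if word in d["senses"]})

def search(word, sense=None):
    """Return matching document ids, optionally limited to one sense."""
    return [
        d["id"]
        for d in DOCS
        if word in d["senses"] and (sense is None or d["senses"][word] == sense)
    ]

print(sense_options("jaguar"))     # ['animal', 'car']
print(search("jaguar"))            # [1, 2, 3]
print(search("jaguar", "animal"))  # [1, 3]
```

The one-word query returns everything, the interface shows the two senses it knows about, and a single extra click narrows the list to the meaning the user had in mind.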

Such interfaces also allow what we call a 3-step walk-through search, where step 1 is about precision (a shorter list), step 2 is an expansion of concepts (to include related things that you did not know about), and step 3 is another step of precision. This "ratcheting" effect therefore begins to bring into the Enterprise Search function other important aspects of corporate work like discovery, exploration and analysis.

http://www.expertsystem.net
By Brooke Aker
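The 3-step walk-through described in the comment can be sketched roughly as follows. This is only my reading of it under stated assumptions: the `DOCS` concept tags, the `RELATED` table and the helper names are all hypothetical, not anything taken from Expert System's product.

```python
# Hypothetical data: concept tags per document and a tiny table of
# related concepts standing in for a semantic network.
DOCS = {
    "d1": {"merger", "bank"},
    "d2": {"acquisition", "bank"},
    "d3": {"merger", "airline"},
    "d4": {"takeover", "airline"},
    "d5": {"bank", "loan"},
}
RELATED = {"merger": {"acquisition", "takeover"}}

def narrow(doc_ids, concept):
    """Precision step: keep only documents tagged with the concept."""
    return {d for d in doc_ids if concept in DOCS[d]}

def expand(concept):
    """Expansion step: add concepts related to the chosen one."""
    return {concept} | RELATED.get(concept, set())

def matching(concepts):
    """All documents tagged with at least one of the concepts."""
    return {d for d, tags in DOCS.items() if tags & concepts}

step1 = narrow(set(DOCS), "merger")  # precision: a short list
step2 = matching(expand("merger"))   # expansion: related concepts too
step3 = narrow(step2, "bank")        # precision again
print(sorted(step1))  # ['d1', 'd3']
print(sorted(step2))  # ['d1', 'd2', 'd3', 'd4']
print(sorted(step3))  # ['d1', 'd2']
```

Document d2 is the interesting one: it never mentions "merger", so it only shows up because of the expansion in step 2, and it survives the second precision step. That is the "ratcheting" towards discovery in miniature.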
