
Query Detection and Extraction - QDEX - System

QDEX is a new way to analyze Web pages and store their content as knowledge bits. It replaces the inverted index method most commonly used today. The need for such a replacement emerges when semantically rich data must be handled at high speed for Web search. hakia invented the QDEX system to meet this challenge.



As depicted above, the QDEX system analyzes the entire content of a Web page (including HTML). The QDEX algorithm then extracts all possible queries that can be asked of this content, in various lengths and forms. During retrieval, these queries (sequences) become gateways to the originating documents, paragraphs and sentences. Note that this extraction is done off-line, before any actual query is received from a user.
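The off-line step described above can be sketched roughly as follows. This is only an illustration, not hakia's algorithm: it assumes that candidate queries are contiguous word sequences with a small stop-word list removed, and the names `candidate_sequences`, `build_table`, `lookup` and `STOPWORDS` are hypothetical.

```python
import re
from collections import defaultdict

# Illustrative stop-word list; an assumption of this sketch.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "is", "and"}


def split_sentences(paragraph):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def significant_words(text):
    """Lower-case the text and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def candidate_sequences(sentence, max_len=3):
    """Yield contiguous sequences of 1..max_len significant words."""
    words = significant_words(sentence)
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def build_table(documents, max_len=3):
    """Off-line pass: map each sequence to (doc, paragraph, sentence) locations."""
    table = defaultdict(set)
    for doc_id, text in documents.items():
        paragraphs = [p for p in text.split("\n\n") if p.strip()]
        for p_idx, paragraph in enumerate(paragraphs):
            for s_idx, sentence in enumerate(split_sentences(paragraph)):
                for seq in candidate_sequences(sentence, max_len):
                    table[seq].add((doc_id, p_idx, s_idx))
    return table


def lookup(table, query):
    """Retrieval: the normalized query is the key to its originating locations."""
    key = " ".join(significant_words(query))
    return sorted(table.get(key, ()))


documents = {
    "d1": "Aspirin reduces fever. It also relieves pain.\n\nDoctors recommend rest.",
    "d2": "Fever is a symptom of the flu.",
}
table = build_table(documents)
print(lookup(table, "aspirin reduces fever"))
```

At query time only the entry for the matching sequence is touched, which is the sense in which each sequence acts as a gateway to its documents, paragraphs and sentences.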

The advantage of this approach is that decomposing content in this way gives a search engine platform great flexibility to use semantically rich data and to process equivalent queries in multiple threads. Without such decomposition, deep semantic analysis over a vast amount of textual data is virtually impossible.

An inverted index, for example, keeps a huge "active" data set on stand-by before any query arrives from the user. Enriching this data set with semantic equivalences (concept relations) would therefore increase the operational burden exponentially. QDEX, on the other hand, has a tiny active set for each query (the silver ball in the mosaic picture), so semantic associations can easily be handled on-the-fly.
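One way to picture on-the-fly handling of semantic equivalences is to expand the incoming query into equivalent queries and look each one up in its own thread. The sketch below assumes a hand-written equivalence map and a tiny pre-built sequence table; `EQUIVALENTS`, `TABLE`, `equivalent_queries` and `search` are hypothetical names, and the page does not describe how hakia actually does this.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical concept relations: each word maps to its equivalent phrasings.
EQUIVALENTS = {
    "reduces": ["reduces", "lowers"],
    "fever": ["fever", "high temperature"],
}

# Hypothetical pre-built table: sequence -> (doc, paragraph, sentence) locations.
TABLE = {
    "aspirin lowers fever": [("d1", 0, 0)],
    "aspirin reduces high temperature": [("d3", 2, 1)],
}


def equivalent_queries(query):
    """Expand a query into every combination of equivalent phrasings."""
    options = [EQUIVALENTS.get(word, [word]) for word in query.lower().split()]
    return [" ".join(choice) for choice in product(*options)]


def search(query, table=TABLE):
    """Look up all equivalent queries concurrently and merge their locations."""
    variants = equivalent_queries(query)
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(lambda v: table.get(v, []), variants))
    hits = set()
    for locations in results:
        hits.update(locations)
    return sorted(hits)


print(search("aspirin reduces fever"))
```

Each variant touches only its own small entry, so the expansion adds a few extra lookups per query rather than enlarging the stored data set.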

The critical point in the QDEX system is the ability to decompose sentences into a handful of meaningful sequences without getting lost in the combinatorial explosion. For example, a sentence with 8 significant words can generate over a billion sequences (of 1, 2, 3, 4, 5, and 6 words), of which only a few dozen make sense on human inspection. Thus, the challenge is how to reduce a billion possibilities to the few dozen that make sense. hakia uses OntoSem technology to meet this challenge.
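The page does not say how the billion figure is counted, so the following is only a rough illustration of how quickly the space grows. It counts ordered selections of distinct words and multiplies by a hypothetical number of variant forms per word (inflections, synonyms and the like); `forms_per_word` is an assumption of this sketch, not a hakia parameter.

```python
from math import perm  # Python 3.8+


def count_sequences(n_words, max_len, forms_per_word=1):
    """Count ordered sequences of 1..max_len distinct words.

    Each chosen word may appear in `forms_per_word` variant forms,
    which is an assumption made for illustration only.
    """
    return sum(
        perm(n_words, k) * forms_per_word ** k
        for k in range(1, max_len + 1)
    )


# Word order alone, for 8 words and sequences of 1 to 6 words.
print(count_sequences(8, 6))

# With a handful of variant forms per word the count passes one billion.
print(count_sequences(8, 6, forms_per_word=7))
```

Under these assumptions the longest sequences dominate the total, which is why pruning to the meaningful few has to happen before the sequences are enumerated rather than after.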



Copyright © 2007, 2008 hakia, Inc.