Thursday, December 8, 2011

Effective Navigation of Query Results Based on Concept Hierarchies. (Domain: Knowledge & Data Engineering).

Abstract:

Search queries on biomedical databases, such as PubMed, often return a large number of results, only a small subset of which is relevant to the user. Ranking and categorization, which can also be combined, have been proposed to alleviate this information overload problem. Results categorization for biomedical databases is the focus of this work. A natural way to organize biomedical citations is according to their MeSH annotations. MeSH is a comprehensive concept hierarchy used by PubMed. In this paper, we present the BioNav system, a novel search interface that enables the user to navigate a large number of query results by organizing them using the MeSH concept hierarchy. First, the query results are organized into a navigation tree. At each node expansion step, BioNav reveals only a small subset of the concept nodes, selected such that the expected user navigation cost is minimized. In contrast, previous works expand the hierarchy in a predefined static manner, without navigation cost modeling. We show that the problem of selecting the best concepts to reveal at each node expansion is NP-complete and propose an efficient heuristic as well as a feasible optimal algorithm for relatively small trees. We show experimentally that BioNav outperforms state-of-the-art categorization systems with respect to the user navigation cost. We have implemented BioNav for the MEDLINE database at http://db.cse.buffalo.edu/bionav.
                                                                                               
Existing System

Information overload is a major problem when searching biomedical databases such as PubMed, where a query typically returns a large number of citations, of which only a small subset is relevant to the user.

Disadvantages of Existing System:
  • A large number of results is produced.
  • Most of the results are irrelevant to the user query.

Proposed System

The proposed system is BioNav. Related systems dynamically categorize SQL query results by inferring a hierarchy based on the characteristics of the result tuples. Their domain is the tuple attributes, and their problem is how to organize these attributes hierarchically in order to minimize the navigation cost. They also decide the value ranges for each attribute, both categorical and numerical, and how to rank them. One of these systems takes the user's preferences into consideration during the inference for a more personalized experience. Once the hierarchy is inferred, they follow a static navigation method. BioNav is distinct in that it offers dynamic navigation on a predefined hierarchy, such as the MeSH concept hierarchy. Hence, BioNav is complementary to these systems: it can be used to optimize the navigation after they construct the navigation tree.

Advantages of Proposed System:

  • The user reaches relevant results without scanning the whole result list.
  • The interface and navigation method make the user comfortable.
  • Both offline and online query search are supported.

Modules:

  • Query Search process module (or) Biomedical Search Systems module
  • Dynamic navigation tree module
  • Hierarchy navigation web (interface) search module
  • Query Workload online operation module


Module Descriptions:

1. Query Search process module (or) Biomedical Search Systems module

Users currently search PubMed through a keyword search interface. In an exploratory scenario, where the user tries to find citations that are relevant to her line of research and hence not known a priori, she submits an initially broad keyword-based query that typically returns a large number of results. Subsequently, the user iteratively refines the query, if she has an idea of how to, by adding more keywords and re-submitting it until a relatively small number of results is returned. This refinement process is problematic because, after a number of iterations, the user cannot tell whether she has over-specified the query, in which case relevant citations might be excluded from the final query result.

BioNav categorizes PubMed query results using the static MeSH concept hierarchy, thus utilizing the initiative of the US National Library of Medicine (NLM) to build and maintain such a comprehensive structure. Each citation in MEDLINE is associated with several MeSH concepts in two ways: (i) by being explicitly annotated with them, and (ii) by mentioning them in its text. Since these associations are provided by PubMed, a relatively straightforward interface to navigate the query result would first attach the citations to the corresponding MeSH concept nodes and then let the user navigate the navigation tree.
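The attachment step described above can be pictured as inverting a citation-to-concepts mapping. The following is a minimal sketch, not BioNav's implementation: the function name attach_citations, the citation IDs, and the concept assignments are all made up for illustration.

```python
from collections import defaultdict


def attach_citations(citation_concepts):
    """Invert a citation -> concepts mapping into concept -> set of citation IDs."""
    attached = defaultdict(set)
    for pmid, concepts in citation_concepts.items():
        for concept in concepts:
            attached[concept].add(pmid)
    return dict(attached)


# Hypothetical sample data: IDs and concept assignments are invented.
sample = {
    101: ["Apoptosis", "Thymosin"],
    102: ["Thymosin"],
    103: ["Apoptosis", "Histones"],
}
attached = attach_citations(sample)
```

Using sets keeps each citation distinct per concept, which matters because one citation is usually associated with several concepts.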

2. Dynamic navigation tree module

Navigation tree. The figure displays a snapshot of such an interface, where the count of distinct citations in the subtree rooted at each node is shown next to the node label. A typical navigation starts by revealing the children of the root ranked by their citation count, and continues with the user expanding one or more of them, revealing their ranked children and so on, until she clicks on a concept and inspects the citations attached to it. A similar interface and navigation method is used by e-commerce sites, such as Amazon and eBay. For this example, we assume that the user will navigate to the three indicated concepts corresponding to three independent lines of research related to prothymosin.
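The static navigation just described (distinct citation counts per subtree, children ranked by count) could be sketched as follows. The toy hierarchy, the node names, and the citation IDs are assumptions for illustration and do not reflect the real MeSH structure.

```python
def subtree_citations(node, children, attached):
    """Return the set of distinct citation IDs attached anywhere in node's subtree."""
    result = set(attached.get(node, ()))
    for child in children.get(node, ()):
        result |= subtree_citations(child, children, attached)
    return result


def expand_static(node, children, attached):
    """Reveal all children of node, ranked by distinct subtree count (descending)."""
    counted = [
        (child, len(subtree_citations(child, children, attached)))
        for child in children.get(node, ())
    ]
    # Ties are broken by label so the order is deterministic.
    return sorted(counted, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy hierarchy and attachments.
children = {"root": ["c1", "c2"], "c1": ["c1a", "c1b"]}
attached = {"c1": {1}, "c1a": {1, 2}, "c1b": {2, 3}, "c2": {3, 4, 5, 6}}

top_level = expand_static("root", children, attached)
below_c1 = expand_static("c1", children, attached)
```

Because the counts are taken over sets, a citation attached to several concepts in the same subtree is counted once, matching the "distinct citations" wording above.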



BioNav introduces a dynamic navigation method that depends on the particular query result at hand and is demonstrated in the figure. The query results are attached to the corresponding MeSH concept nodes as before, but then the navigation proceeds differently. The key action on the interface is the expansion of a node, which selectively reveals a ranked list of descendant (not necessarily children) concepts instead of simply showing all its children.
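To illustrate how an expansion can reveal descendants rather than children, here is a rough sketch. It does not implement BioNav's cost-based selection; it swaps in a simple threshold rule (the max_count parameter is an assumption) that walks down each path and reveals the first concept whose subtree is small enough. The hierarchy and citation IDs are again made up.

```python
def subtree_citations(node, children, attached):
    """Return the set of distinct citation IDs attached anywhere in node's subtree."""
    result = set(attached.get(node, ()))
    for child in children.get(node, ()):
        result |= subtree_citations(child, children, attached)
    return result


def expand_dynamic(node, children, attached, max_count):
    """Reveal a ranked frontier of descendants of node.

    Stand-in rule (an assumption, not BioNav's cost model): on each downward
    path, reveal the first concept holding at most max_count distinct
    citations, or a leaf if no such concept exists. Empty concepts are skipped.
    """
    revealed = []
    stack = list(children.get(node, ()))
    while stack:
        concept = stack.pop()
        count = len(subtree_citations(concept, children, attached))
        if count == 0:
            continue
        kids = children.get(concept, ())
        if count <= max_count or not kids:
            revealed.append((concept, count))
        else:
            stack.extend(kids)  # too large: look further down this branch
    return sorted(revealed, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy hierarchy and attachments.
children = {"root": ["c1", "c2"], "c1": ["c1a", "c1b"], "c1b": ["c1b1", "c1b2"]}
attached = {"c1a": {1, 2}, "c1b1": {3, 4, 5}, "c1b2": {5, 6, 7, 8}, "c2": {9}}

narrow = expand_dynamic("root", children, attached, max_count=4)
broad = expand_dynamic("root", children, attached, max_count=10)
```

With the small threshold, the expansion of the root skips c1 and c1b and reveals their descendants directly; with the large threshold it degenerates to showing the children. Citations attached directly to a skipped concept are not surfaced by this simplified rule.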


3. Hierarchy navigation web (interface) search module




BioNav belongs primarily to the categorization class, which is ideal for this domain given the rich concept hierarchies (e.g., MeSH) available for biomedical data. We augment our categorization techniques with simple ranking techniques. BioNav organizes the query results into a dynamic hierarchy, the navigation tree. Each concept (node) of the hierarchy has a descriptive label. The user then navigates this tree structure in a top-down fashion, exploring the concepts of interest while ignoring the rest.

4. Query Workload online operation module

On-Line Operation. Upon receiving a keyword query from the user, BioNav executes the same query against the MEDLINE database and retrieves only the IDs (PubMed Identifiers, or PMIDs) of the citations in the query result. This is done using the ESearch utility of the Entrez Programming Utilities (eUtils). eUtils are a collection of web interfaces to PubMed for issuing a query and downloading the results with various levels of detail and in a variety of formats. Next, the navigation tree is constructed by retrieving the MeSH concepts associated with each citation in the query result from the BioNav database. This is possible since MeSH concepts have tree identifiers encoding their location in the MeSH hierarchy, which are also retrieved from the BioNav database. This process is done once for each user query.
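The ID retrieval step might look roughly like the sketch below, which builds an ESearch request and extracts the ID list from a JSON response. The helper names, the retmax default, and the sample response are assumptions; the lookup of MeSH concepts in the BioNav database is not shown. The last helper shows how a dot-separated tree identifier encodes a concept's ancestors; the tree number used with it is only there to show the format.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(keywords, retmax=100):
    """Build an ESearch URL that asks PubMed for matching IDs as JSON."""
    params = {"db": "pubmed", "term": keywords, "retmax": retmax, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


def parse_pmids(response_text):
    """Extract the list of PubMed IDs from an ESearch JSON response."""
    return json.loads(response_text)["esearchresult"]["idlist"]


def search_pmids(keywords, retmax=100):
    """Run the query over the network and return the matching IDs."""
    with urlopen(build_esearch_url(keywords, retmax)) as response:
        return parse_pmids(response.read().decode("utf-8"))


def tree_number_ancestors(tree_number):
    """List the ancestor tree numbers encoded in a dot-separated tree number."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


url = build_esearch_url("prothymosin", retmax=5)
# Made-up response body in the ESearch JSON layout; the IDs are not real results.
pmids = parse_pmids('{"esearchresult": {"idlist": ["111", "222"]}}')
ancestors = tree_number_ancestors("C04.557.337")
```

Calling search_pmids("prothymosin") would issue the actual request; it is left uncalled here so the sketch runs without network access.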

HARDWARE REQUIREMENTS:

Processor      -  Pentium III
Speed          -  1.1 GHz
RAM            -  256 MB (min)
Hard Disk      -  20 GB
Floppy Drive   -  1.44 MB
Keyboard       -  Standard Windows Keyboard
Mouse          -  Two or Three Button Mouse
Monitor        -  SVGA

 

SOFTWARE REQUIREMENTS:


  • Operating System       :  Windows 95/98/2000/XP
  • Application Server     :  Tomcat 5.0/6.x
  • Front End              :  J2EE (HTML, Java, JSP, Servlet)
  • Scripts                :  JavaScript
  • Development Tool       :  NetBeans 6.0.1
  • Build Tool             :  Ant
  • Server-side Script     :  JavaServer Pages
  • Database               :  MS Access
  • Database Connectivity  :  JDBC

REFERENCE:

Abhijith Kashyap, Vagelis Hristidis, Michalis Petropoulos, and Sotiria Tavoulari, "Effective Navigation of Query Results Based on Concept Hierarchies", IEEE Transactions on Knowledge and Data Engineering, Vol. 23, No. 4, April 2011.

