66. Efficient Multi-dimensional Fuzzy Search for Personal Information Management Systems (Domain: Knowledge & Data Engineering)
ABSTRACT:
With the explosion in the amount of semi-structured data users access and store in personal information management systems, there is a critical need for powerful search tools to retrieve often very heterogeneous data in a simple and efficient way. Existing tools typically support some IR-style ranking on the textual part of the query, but only consider structure (e.g., file directory) and metadata (e.g., date, file type) as filtering conditions. We propose a novel multi-dimensional search approach that allows users to perform fuzzy searches for structure and metadata conditions in addition to keyword conditions. Our techniques individually score each dimension and integrate the three dimension scores into a meaningful unified score. We also design indexes and algorithms to efficiently identify the most relevant files that match multi-dimensional queries. We perform a thorough experimental evaluation of our approach and show that our relaxation and scoring framework for fuzzy query conditions in non-content dimensions can significantly improve ranking accuracy. We also show that our query processing strategies perform and scale well, making our fuzzy search approach practical for everyday usage.
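As a rough illustration of integrating three per-dimension scores into one ranking score, the sketch below uses a plain weighted average. This is an assumption made for illustration only: the abstract does not give the paper's actual scoring formula, and the names DimensionScores, unified_score and rank are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DimensionScores:
    """Per-file scores, each assumed to be normalized to the range [0, 1]."""
    content: float    # how well the file text matches the keywords
    metadata: float   # how close the file is to the metadata condition (date, type)
    structure: float  # how close the file path is to the directory condition


def unified_score(scores, weights=(1.0, 1.0, 1.0)):
    """Combine the three dimension scores with a weighted average.

    Assumption: a weighted average stands in for the unified score
    mentioned in the abstract; it is not the paper's formula.
    """
    w_content, w_metadata, w_structure = weights
    total = w_content + w_metadata + w_structure
    return (
        w_content * scores.content
        + w_metadata * scores.metadata
        + w_structure * scores.structure
    ) / total


def rank(files, weights=(1.0, 1.0, 1.0)):
    """Return file names ordered from most to least relevant."""
    return sorted(
        files,
        key=lambda name: unified_score(files[name], weights),
        reverse=True,
    )
```

With equal weights, a file that partially matches all three conditions can outrank a file that matches only the keywords, which is the behaviour a filter-only tool cannot provide.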
Existing System:
File mapping, or file lookup, is critical in decentralizing metadata management within a group of metadata servers. The following approaches are used in the existing system:
1. Table-Based Mapping : Fails to balance the load.
2. Hashing-Based Mapping : Has slow directory operations, such as listing the directory contents and renaming directories.
3. Static Tree Partitioning : Cannot balance the load and has a medium lookup time.
4. Dynamic Tree Partitioning : Has a small memory overhead but incurs a large migration overhead.
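To see why directory renames are costly under hashing-based mapping, consider the minimal sketch below. It assumes the server is chosen by hashing the full path name, which is one common variant and not necessarily the exact scheme meant above; server_for and files_to_remap_on_rename are hypothetical names.

```python
import hashlib


def server_for(path, num_servers):
    """Map a full path to a metadata server index by hashing the path."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def files_to_remap_on_rename(paths, old_dir, new_dir, num_servers):
    """List every file whose path, and therefore hash, changes on a rename.

    Each tuple is (old_path, old_server, new_path, new_server).
    """
    old_prefix = old_dir.rstrip("/") + "/"
    new_prefix = new_dir.rstrip("/") + "/"
    remapped = []
    for path in paths:
        if path.startswith(old_prefix):
            new_path = new_prefix + path[len(old_prefix):]
            remapped.append(
                (path, server_for(path, num_servers),
                 new_path, server_for(new_path, num_servers))
            )
    return remapped
```

Every file under the renamed directory gets a new path and must be rehashed, and its metadata may have to move to a different server. The cost therefore grows with the size of the subtree rather than staying constant.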
Proposed System:
Here we use a new approach called HIERARCHICAL BLOOM FILTER ARRAYS (HBA) to efficiently route metadata requests within a group of metadata servers. Two arrays are used. The first array reduces memory overhead because it captures only the destination metadata server information of frequently accessed files, which keeps management efficiency high. The second array maintains the destination metadata server information of all files. Both arrays are used mainly for fast local lookup.
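The two-array lookup described above can be sketched as follows. This is a simplified illustration, not the HBA design itself: the filter sizes, the number of hash functions, the rule that a lookup succeeds only when exactly one server's filter reports a hit, and the names BloomFilter and TwoLevelLookup are all assumptions.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


class TwoLevelLookup:
    """One small 'hot' filter and one larger 'full' filter per server."""

    def __init__(self, num_servers, hot_bits=256, full_bits=4096, num_hashes=3):
        self.hot = [BloomFilter(hot_bits, num_hashes) for _ in range(num_servers)]
        self.full = [BloomFilter(full_bits, num_hashes) for _ in range(num_servers)]

    def record(self, path, server, hot=False):
        """Register a file on its server; hot files also enter the first array."""
        self.full[server].add(path)
        if hot:
            self.hot[server].add(path)

    def locate(self, path):
        """Return the server index, or None if the lookup is ambiguous or misses."""
        for level in (self.hot, self.full):
            hits = [i for i, bloom in enumerate(level) if path in bloom]
            if len(hits) == 1:
                return hits[0]
        return None  # the caller would fall back to asking the servers directly
```

The first array stays small because it holds only frequently accessed files, so most lookups finish there. The second array covers every file and is consulted only when the first gives no unique answer.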
Hardware Requirements
• SYSTEM : Pentium IV 2.4 GHz
• HARD DISK : 40 GB
• FLOPPY DRIVE : 1.44 MB
• MONITOR : 15 VGA colour
• MOUSE : Logitech.
• RAM : 256 MB
• KEYBOARD : 110 keys enhanced.
Software Requirements
• Operating system :- Windows XP Professional
• Front End :- Microsoft Visual Studio .Net 2005
• Coding Language :- C# 2.0
• Database :- SQL SERVER 2000
Modules
- Login
- Finding Network Computers
- Meta Data Creation
- Searching Files
Module Description
Login
The Login module presents users with a form containing username and password fields. If the user enters a valid username/password combination, they are granted access to additional resources on the website. Which additional resources they can access is configured separately.
Finding Network Computers
This module finds the computers that are available on the network. Some folders are shared on some of these computers, and the module identifies the computers that have a shared folder. In this way we obtain all the information about the files and build the metadata.
Meta Data Creation
This module creates metadata for all the files on the system. It saves all file names in a database and, in addition, saves some content from text files. This mechanism avoids the long-running process of the existing system.
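A minimal sketch of this metadata creation step is shown below. It is not the project's code: SQLite stands in for the SQL Server database listed in the requirements, and the table name file_meta, the function build_metadata, and the rule of reading only .txt files are assumptions.

```python
import os
import sqlite3


def build_metadata(root, conn, snippet_chars=200):
    """Walk a folder tree and store each file's name plus a short text snippet.

    Returns the number of files recorded.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_meta ("
        "path TEXT PRIMARY KEY, name TEXT NOT NULL, snippet TEXT)"
    )
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            snippet = None
            # Assumption: only .txt files contribute searchable text.
            if name.lower().endswith(".txt"):
                try:
                    with open(path, "r", encoding="utf-8", errors="replace") as fh:
                        snippet = fh.read(snippet_chars)
                except OSError:
                    snippet = None  # unreadable file: keep the name only
            conn.execute(
                "INSERT OR REPLACE INTO file_meta (path, name, snippet) "
                "VALUES (?, ?, ?)",
                (path, name, snippet),
            )
            count += 1
    conn.commit()
    return count
```

Because the names and snippets are stored once, later searches can query the database instead of scanning the shared folders again.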
Searching Files
In this module the user enters the text used to search for the required file. The search mechanism differs from that of the existing system. Whenever the user submits search text, the module searches the database. The first search is based on the file name. After that, it includes related file names. Then it runs another search over the stored file text. Finally, it returns the search results related to the user's text.
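The search stages described above can be sketched as follows. The records are held in memory as (path, name, snippet) tuples to keep the example self-contained; using difflib for the "related file name" stage, the 0.6 cutoff, and the function name search_files are assumptions, not details from the project.

```python
import difflib


def search_files(records, query, cutoff=0.6):
    """Search in three stages: exact name match, related names, stored text.

    records is a list of (path, name, snippet) tuples; snippet may be None.
    Returns (path, stage) pairs in the order they were found.
    """
    q = query.lower()
    results = []
    seen = set()

    def add(path, stage):
        if path not in seen:
            seen.add(path)
            results.append((path, stage))

    # Stage 1: the file name contains the search text.
    for path, name, _snippet in records:
        if q in name.lower():
            add(path, "name")

    # Stage 2: file names that are similar to the search text.
    names = [name.lower() for _path, name, _snippet in records]
    close = set(
        difflib.get_close_matches(q, names, n=len(names) or 1, cutoff=cutoff)
    )
    for path, name, _snippet in records:
        if name.lower() in close:
            add(path, "related-name")

    # Stage 3: the stored file text contains the search text.
    for path, _name, snippet in records:
        if snippet and q in snippet.lower():
            add(path, "text")

    return results
```

Results from earlier stages come first, so exact name matches rank above similar names, which in turn rank above files matched only by their text.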
REFERENCE:
Wei Wang, Christopher Peery, Amelie Marian, Thu D. Nguyen, “Efficient Multi-dimensional Fuzzy Search for Personal Information Management Systems”, IEEE Transactions on Knowledge and Data Engineering, 2011.