EFFICIENT PROCESSING OF TOP-K DOMINATING QUERIES ON INCOMPLETE DATA
DOI: https://doi.org/10.5281/ijcset.v3i4.49
Abstract
Data mining is a powerful way to discover knowledge within large amounts of data. Incomplete data is common, and finding and querying such data has recently become important. A top-k dominating (TKD) query returns the k objects that dominate the maximum number of objects in a given dataset. It combines the advantages of skyline and top-k queries. Traditional query processing techniques that focus on the strict soundness of answer tuples often ignore tuples with critical missing attributes, even if they wind up being relevant to the user query. Ideally, the mediator is expected to retrieve such relevant uncertain answers and gauge their relevance by assessing their likelihood of being relevant answers to the query. The autonomous nature of the databases poses several challenges in realizing this idea. Such challenges include restricted access privileges, limited query patterns, and the cost sensitivity of database and network resource consumption in a web environment. This thesis presents QPIAD, a framework for query processing over incomplete autonomous databases. QPIAD is able to retrieve relevant uncertain answers with high precision, high recall, and manageable cost. Data integration over multiple autonomous data sources is an important task performed by a mediator. Extensive experimental evaluation using both real and synthetic datasets shows the effectiveness of the developed pruning rules and confirms the performance of the algorithms.
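To make the TKD definition above concrete, the following is a minimal brute-force sketch, not the pruning-based algorithms evaluated in this work. It rests on assumptions the abstract does not state: missing values are marked with `None`, smaller values are preferred, and one object dominates another if it is no worse on every dimension observed in both objects and strictly better on at least one of them. The names `dominates` and `tkd_naive` are hypothetical and chosen for illustration only.

```python
from __future__ import annotations

from typing import Optional, Sequence

# None marks a missing attribute value (assumed encoding).
Point = Sequence[Optional[float]]


def dominates(a: Point, b: Point) -> bool:
    """Assumed dominance test: compare only dimensions observed in both objects."""
    strictly_better = False
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip dimensions missing in either object
        if x > y:
            return False  # a is worse on a shared dimension (smaller is better)
        if x < y:
            strictly_better = True
    return strictly_better


def tkd_naive(data: Sequence[Point], k: int) -> list[tuple[int, int]]:
    """Return (index, score) pairs for the k objects that dominate the most others."""
    scores = [
        (i, sum(dominates(a, b) for j, b in enumerate(data) if j != i))
        for i, a in enumerate(data)
    ]
    # Highest score first; ties are broken by index so the result is deterministic.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:k]


data = [(1, None, 2), (2, 3, None), (None, 1, 3), (3, 4, 4)]
top2 = tkd_naive(data, 2)
```

This sketch compares every pair of objects, so its cost grows quadratically with the dataset size. That cost is the kind of overhead that pruning rules are meant to reduce.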