eScholarship Repository, California Digital Library

Lawrence Berkeley National Laboratory
University of California

Improving Estimation Accuracy of Aggregate Queries on Data Cubes
Elaheh Pourabbas, LBNL
Arie Shoshani, LBNL

Download the Paper (286 K, PDF file) - September 25, 2008

ABSTRACT:

In this paper, we investigate the problem of estimating a target database from summary databases derived from a base data cube. We show that such estimates can be derived by choosing a primary database and using a proxy database to estimate the results. This technique is common in statistics, but the important issue we address is the accuracy of these estimates. Specifically, given multiple primary and multiple proxy databases that share the same summary measure, the problem is how to select the primary and proxy databases that will generate the most accurate estimate of the target database. We propose an algorithmic approach, based on the principles of information entropy, for determining the steps to select or compute the source databases from multiple summary databases. We show that the source databases with the largest number of cells in common provide the most accurate estimates, and we prove that this is consistent with maximizing the entropy. We provide experimental results on the accuracy of the target database estimation to verify our results.
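
As a rough illustration of what estimating a target database from a primary database through a proxy database can look like, the following sketch uses simple proportional allocation, a common statistical technique. It is an assumption made for illustration and is not taken from the paper: the function name `estimate_target`, the state and county keys, and all numbers are hypothetical, and the sketch does not cover the entropy-based selection of source databases that the abstract describes.

```python
from collections import defaultdict


def estimate_target(primary, proxy):
    """Spread each primary total over finer cells in proportion to a proxy.

    primary: {coarse_key: total of the summary measure}   (hypothetical layout)
    proxy:   {(coarse_key, fine_key): proxy measure}       (hypothetical layout)
    Returns  {(coarse_key, fine_key): estimated measure}.
    """
    # Sum the proxy measure within each coarse cell.
    proxy_totals = defaultdict(float)
    for (coarse, _fine), weight in proxy.items():
        proxy_totals[coarse] += weight

    target = {}
    for (coarse, fine), weight in proxy.items():
        # Only cells present in both sources can be estimated.
        if coarse not in primary or proxy_totals[coarse] == 0:
            continue
        target[(coarse, fine)] = primary[coarse] * weight / proxy_totals[coarse]
    return target


# Hypothetical data: a measure known by state (primary) and a
# related measure known by state and county (proxy).
primary = {"CA": 900.0, "NV": 100.0}
proxy = {
    ("CA", "Alameda"): 20.0,
    ("CA", "Marin"): 10.0,
    ("NV", "Clark"): 5.0,
}

estimate = estimate_target(primary, proxy)
for cell, value in sorted(estimate.items()):
    print(cell, value)
```

In this sketch the estimates within each state sum back to the state's primary total, so the quality of the result depends entirely on how well the proxy tracks the true distribution. Choosing which primary and proxy databases to use is the question the paper itself addresses.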

SUGGESTED CITATION:
Elaheh Pourabbas and Arie Shoshani, "Improving Estimation Accuracy of Aggregate Queries on Data Cubes" (September 25, 2008). Lawrence Berkeley National Laboratory. Paper LBNL-904E.
http://repositories.cdlib.org/lbnl/LBNL-904E

 
eScholarship is a service of the California Digital Library.