Pradip Raj Poudel (149-44), Kashiram Pokharel (149-40)
“QUERY OPTIMIZATION IN DISTRIBUTED DATABASE”
ME_CE III NCIT, Lalitpur
A Review Article
By: Yasmeen Rm Umar, Amit R Welekar
05/01/23
Outline:
First Part
Abstract
Introduction
Query Optimization
Optimization Challenges
Steps In Query Processing
Second Part
S. Chaudhuri Review
Fan/Xifeng Review
Chen/Yu Review
Kossmann/Stocker Review
Xue Lin Review
Conclusion
Abstract:
Data in distributed environments is growing day by day, so better distributed DBMSs are required.
Data is spread in parts across multiple sites, so query optimization is a challenge in a distributed database.
Query optimization finds the best execution plan among the available options.
Introduction
Database: a collection of files/tables.
DBMS: manages the database, whether centralized (CD) or distributed (DD).
Centralized Database
All data is placed at one central computer location, so it is easy to access and extract.
A database query is easily transformed into relational algebra (RA) operations.
No overhead.
Distributed Database
Data resides on multiple sites but is centrally administered.
Provides flexibility/customization.
Ex. Location A can access data from location B.
Location transparency.
Data is distributed, so query transformation is complex.
Query Optimization:
Data is distributed over different sites in a distributed database.
When a query is given, the response to that query may require data from several sites (a DBMS function).
Objective:
The major task is to "process a query with location transparency and find the best sensible execution plan".
Optimization Challenges:
1st: Break the query down in the distributed database environment.
2nd: Determine which site holds less data/records. Less data means less communication, and vice versa.
Then transfer that data to another site. More sites = more complexity in processing the query.
Compute the cost using an effective cost model.
As data is distributed across different sites, computing an efficient query plan becomes more challenging.
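The "less data, less communication" rule above can be sketched as a small cost comparison. This is only an illustration under assumptions: the `Fragment` class, the linear cost formula (a fixed setup charge plus a per-byte charge), and all the numbers are hypothetical and do not come from the reviewed papers.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Fragment:
    """A relation fragment stored at one site (hypothetical catalog entry)."""
    name: str
    site: str
    rows: int
    row_bytes: int

    @property
    def size_bytes(self) -> int:
        return self.rows * self.row_bytes


def transfer_cost(fragment: Fragment, setup_cost: float = 1000.0,
                  cost_per_byte: float = 1.0) -> float:
    """Assumed linear cost model: fixed message setup plus a per-byte charge."""
    return setup_cost + cost_per_byte * fragment.size_bytes


def choose_execution_site(a: Fragment, b: Fragment) -> tuple[str, float]:
    """Ship the cheaper fragment to the other fragment's site and join there."""
    if a.site == b.site:
        return a.site, 0.0  # both fragments are local, nothing to ship
    cost_a, cost_b = transfer_cost(a), transfer_cost(b)
    if cost_a <= cost_b:
        return b.site, cost_a  # move a to b's site
    return a.site, cost_b      # move b to a's site


employees = Fragment("EMP", site="A", rows=10_000, row_bytes=100)
departments = Fragment("DEPT", site="B", rows=50, row_bytes=40)
site, cost = choose_execution_site(employees, departments)
print(site, cost)  # A 3000.0
```

Here the small DEPT fragment is shipped to site A, because moving the large EMP fragment would cost far more under the assumed model.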
Basic Steps In Query Processing Plan
a). Query Decomposition: Decompose the query into a simpler form of RA.
b). Data Localization: Data is referenced to only one location (one site).
c). Global Optimization: Optimization of RA/decision making. Ex. Which site is efficient to move data to, and where the query will execute.
d). Local Optimization: When the query is fragmented to sites, each site treats its part locally and executes the query.
OPTIMIZER COMPONENTS:
a). Query Engine
b). Query Optimizer
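The data localization step can be illustrated with a minimal sketch. It assumes a horizontally fragmented relation whose fragments are described in a catalog. The `HorizontalFragment` class, the `CATALOG` contents, and the `localize` helper are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HorizontalFragment:
    """One horizontal fragment of a global relation (hypothetical catalog entry)."""
    relation: str
    site: str
    low: int    # inclusive lower bound of the fragmentation key
    high: int   # exclusive upper bound of the fragmentation key


# Assumed fragmentation schema: CUSTOMER is split by customer id over three sites.
CATALOG = [
    HorizontalFragment("CUSTOMER", "site1", 0, 1000),
    HorizontalFragment("CUSTOMER", "site2", 1000, 2000),
    HorizontalFragment("CUSTOMER", "site3", 2000, 3000),
]


def localize(relation, key_low, key_high):
    """Return only the fragments whose key range overlaps [key_low, key_high)."""
    return [
        f for f in CATALOG
        if f.relation == relation and f.low < key_high and key_low < f.high
    ]


# Global query: SELECT * FROM CUSTOMER WHERE id >= 1500 AND id < 2200
for frag in localize("CUSTOMER", 1500, 2200):
    print(f"run on {frag.site}: id in [{max(frag.low, 1500)}, {min(frag.high, 2200)})")
```

Each reference to the global relation is resolved to fragments that each live at exactly one site, and fragments that cannot contribute rows (site1 here) are dropped before global optimization.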
Optimizer Components:
Query Engine:
a). Produces output from input by performing operations with physical operators (join, sort, loop).
b). Constructs a parse tree that shows the flow of data from one operation to another.
Query Optimizer:
a). Receives the parse tree as input from the query engine and produces the best possible execution plan, based on least resource consumption.
b). Generating an efficient query plan is not an easy task.
1. Surajit Chaudhuri: Review
Chaudhuri discussed basic query optimization, the search space, and cost estimation techniques.
The operator tree with the least resource consumption is the best one.
To select the best plan, statistical information and execution cost are analyzed.
Statistics: number of rows, memory, joins, pages, etc.
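To make "least resource consumption" concrete, the sketch below compares two operator trees for the same join using page counts as the only statistic. It assumes the textbook page-oriented nested loops cost (outer pages plus outer pages times inner pages). The table names and page counts are hypothetical, and this is not the cost model of any particular optimizer.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    """Leaf operator: read a stored table."""
    table: str
    pages: int

    def cost(self) -> int:
        return self.pages  # one I/O per page


@dataclass
class NestedLoopJoin:
    """Page-oriented nested loops join over two scans."""
    outer: Scan
    inner: Scan

    def cost(self) -> int:
        # Read the outer once, and rescan the inner once per outer page.
        return self.outer.pages + self.outer.pages * self.inner.pages


orders = Scan("ORDERS", pages=1000)
customers = Scan("CUSTOMERS", pages=50)

# Two candidate operator trees that produce the same result.
candidates = {
    "ORDERS outer": NestedLoopJoin(orders, customers),
    "CUSTOMERS outer": NestedLoopJoin(customers, orders),
}
best = min(candidates, key=lambda name: candidates[name].cost())
for name, plan in candidates.items():
    print(name, plan.cost())   # 51000 and 50050
print("chosen:", best)         # chosen: CUSTOMERS outer
```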
2. Fan/Xifeng: Review
DD: multiple computers connected by a network. The GDBMS, LDBMS, and CM are the elements of the DB.
The distributed database manager is both global and local.
Proposed an algorithm that improves semi-connected sub-query optimization to reduce network cost, but it is less efficient for select queries.
3. Chen/Yu: Review
Focused mainly on communication cost, with a detailed study of join/semijoin queries.
The combination of join and semijoin results in a large reduction of communication cost.
Determines the effect of each join operation and finds the best combination of joins that reduces communication cost.
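Why a semijoin can cut communication cost is easiest to see on a toy example. The sketch below is not the Chen/Yu algorithm. It only counts the bytes shipped between two sites for a plain join versus a semijoin-reduced join. The data, the `KEY_BYTES` constant, and the byte-counting rule are assumptions made for illustration.

```python
KEY_BYTES = 4  # assumed size of one join-key value on the wire

# Hypothetical data: R is stored at site 1, S at site 2; rows are (key, payload).
R = [(1, "x"), (2, "y")]
S = [(k, "data" * 10) for k in range(100)]


def row_bytes(row):
    """Assumed wire size of a row: the key plus one byte per payload character."""
    return KEY_BYTES + len(row[1])


def join(r, s):
    """Plain equi-join on the key column."""
    return sorted((rk, rp, sp) for rk, rp in r for sk, sp in s if rk == sk)


def plain_join_at_site1():
    """Ship all of S to site 1, then join there."""
    shipped = sum(row_bytes(row) for row in S)
    return join(R, S), shipped


def semijoin_then_join_at_site1():
    """Ship R's keys to site 2, reduce S there, ship only matching S rows back."""
    keys = {rk for rk, _ in R}                        # projection of R on the key
    reduced_s = [row for row in S if row[0] in keys]  # S semijoin R, at site 2
    shipped = len(keys) * KEY_BYTES + sum(row_bytes(row) for row in reduced_s)
    return join(R, reduced_s), shipped


full_result, full_bytes = plain_join_at_site1()
semi_result, semi_bytes = semijoin_then_join_at_site1()
print(full_bytes, semi_bytes, full_result == semi_result)  # 4400 96 True
```

The semijoin pays off here because only 2 of the 100 rows of S match. When most rows match, the extra key shipment can make it the more expensive option, which is why the choice of join/semijoin combination matters.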
4. Kossmann/Stocker: Review
Proposed an algorithm based on IDP (Iterative Dynamic Programming).
Plain dynamic programming gives good plans but is difficult to apply in the case of complex queries.
Thus, IDP combines a greedy algorithm with the DP concept to find the best query plans.
Memory requirements are not considered.
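As background for the DP building block, here is a minimal sketch of plain dynamic programming over left-deep join orders. It is not IDP itself. The cardinalities, selectivities, and the cost measure (sum of estimated intermediate result sizes) are assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical statistics: base-table cardinalities and join selectivities.
CARD = {"A": 1000, "B": 100, "C": 10, "D": 500}
SEL = {frozenset("AB"): 0.01, frozenset("BC"): 0.1, frozenset("CD"): 0.02}


def result_size(rels):
    """Estimated rows after joining all relations in `rels`."""
    size = 1.0
    for r in rels:
        size *= CARD[r]
    for pair, selectivity in SEL.items():
        if pair <= rels:  # both sides of the predicate are present
            size *= selectivity
    return size


def dp_join_order(relations):
    """Best left-deep order; cost = sum of estimated intermediate result sizes."""
    best = {frozenset([r]): (0.0, (r,)) for r in relations}
    for k in range(2, len(relations) + 1):
        for subset in combinations(relations, k):
            s = frozenset(subset)
            size = result_size(s)
            # Try every relation as the last one joined; reuse the stored
            # best plan for the remaining k-1 relations.
            best[s] = min(
                (best[s - {last}][0] + size, best[s - {last}][1] + (last,))
                for last in subset
            )
    return best[frozenset(relations)]


cost, order = dp_join_order(["A", "B", "C", "D"])
print(order, cost)
```

The `best` table holds one entry per subset of relations, so it grows exponentially with the number of relations. That growth is what makes plain DP hard to apply to complex queries and motivates limiting DP to small blocks and committing greedily between them.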
5. Xue Lin: Review
User Module: analyzes the user query.
Syntax Analysis Module: performed on the global query.
Query Tree Conversion Module.
Optimization Module: receives the query tree, optimizes it, creates physical operator trees, and calculates the cost of each physical operator tree.
Order Processing Module: distributes the query to the servers and returns the result to the user.
A local data dictionary is used, but table/CPU time/memory usage increases.
Conclusion:
Dynamic programming/greedy approaches: large space complexity.
Thus, a new approach based on the ant colony algorithm is used, where each relation is considered as a domain value.
Better execution time has been achieved.
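The slides give no details of the ant colony approach, so the following is only a generic ant colony sketch applied to join ordering, not the reviewed algorithm. The pheromone layout (one value per position/relation pair), the parameters, the statistics, and the cost measure are all assumptions.

```python
import random

# Hypothetical statistics: base-table cardinalities and join selectivities.
CARD = {"A": 1000, "B": 100, "C": 10, "D": 500}
SEL = {frozenset("AB"): 0.01, frozenset("BC"): 0.1, frozenset("CD"): 0.02}


def result_size(rels):
    """Estimated rows after joining all relations in `rels`."""
    size = 1.0
    for r in rels:
        size *= CARD[r]
    for pair, selectivity in SEL.items():
        if pair <= rels:
            size *= selectivity
    return size


def plan_cost(order):
    """Sum of estimated intermediate result sizes of a left-deep join order."""
    return sum(result_size(frozenset(order[:i])) for i in range(2, len(order) + 1))


def aco_join_order(relations, ants=10, iterations=30, evaporation=0.5, seed=0):
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    # Pheromone on "relation r is joined at position pos".
    tau = {(pos, r): 1.0 for pos in range(len(relations)) for r in relations}
    best_order, best_cost = None, float("inf")
    for _ in range(iterations):
        tours = []
        for _ in range(ants):
            remaining, order = list(relations), []
            for pos in range(len(relations)):
                # Pick the next relation with probability proportional to pheromone.
                weights = [tau[(pos, r)] for r in remaining]
                choice = rng.choices(remaining, weights=weights)[0]
                order.append(choice)
                remaining.remove(choice)
            cost = plan_cost(order)
            tours.append((cost, order))
            if cost < best_cost:
                best_order, best_cost = order, cost
        for key in tau:                 # evaporation
            tau[key] *= 1.0 - evaporation
        for cost, order in tours:       # cheaper orders deposit more pheromone
            for pos, r in enumerate(order):
                tau[(pos, r)] += 1.0 / cost
    return best_order, best_cost


order, cost = aco_join_order(["A", "B", "C", "D"])
print(order, cost)
```

Unlike a dynamic programming table over all subsets of relations, the pheromone table here has one entry per position/relation pair, so its size grows quadratically with the number of relations. The trade-off is that the result is a heuristic and is not guaranteed to be optimal.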