TECHNIQUES FOR OPTIMIZING THE QUERY PERFORMANCE OF DISTRIBUTED XML DATABASE - NAHID NEGAR

  • Slide 1
  • TECHNIQUES FOR OPTIMIZING THE QUERY PERFORMANCE OF DISTRIBUTED XML DATABASE - NAHID NEGAR
  • Slide 2
  • PROBLEM STATEMENT
    EXPLORING THE RESEARCH SCOPE FOR IMPROVING THE PERFORMANCE OF DISTRIBUTED QUERY PROCESSING FOR XML DATABASES.
    THE RESEARCH PAPER:
    DESCRIBES THE ISSUES AND CONSIDERATIONS FOR DISTRIBUTED XML QUERY PROCESSING.
    EXPLORES CLASSICAL QUERY OPTIMIZATION TECHNIQUES.
    PRESENTS SIMILAR RESEARCH WORK DONE BY OTHERS.
    ANALYZES THE RESEARCH SCOPE AND DIRECTIONS.
  • Slide 3
  • DISTRIBUTED XML DATABASE
    XML FILES ARE IDEAL FOR DESCRIBING SEMI-STRUCTURED DATA.
    AS THE AMOUNT OF DATA INCREASES, XML DATABASES EXPAND [1].
    THEY STORE A LARGE NUMBER OF XML FILES WHILE PRESERVING THE HIERARCHICAL FORMAT.
    DATA IS DISTRIBUTED OR FRAGMENTED ACROSS DIFFERENT LOCATIONS, WHICH CAN EVEN BE DIFFERENT GEOGRAPHIC LOCATIONS.
    DATA INTEGRATION IS NEEDED WHEN PROCESSING A QUERY ON A DISTRIBUTED DATABASE [2].
  • Slide 4
  • WHY A DISTRIBUTED XML DATABASE IS NEEDED [6]
    LOWER COSTS
    INCREASED SCALABILITY
    INCREASED AVAILABILITY
    DISTRIBUTION OF SOFTWARE MODULES
    NEW APPLICATIONS BASED ON DISTRIBUTION
    MARKET FORCES
  • Slide 5
  • XML DATABASE AND QUERY PROCESSING
    XML DDL: DTD, XML SCHEMA (XSD)
    XML DML: XML QUERY LANGUAGES (FOR EXAMPLE, XQUERY)
    ATTRIBUTES OF XML DATABASE: MULTIPLE LEVELS OF VALIDITY, ENTITIES AND URI TRANSFORMATIONS
  • Slide 6
  • DISTRIBUTED XML QUERY PROCESSING CONSIDERATIONS [7]
    ARCHITECTURE OF DISTRIBUTED QUERY PROCESSING SYSTEMS
    CENTRALIZED VS. DISTRIBUTED PROCESSING OF DISTRIBUTED QUERIES
    STATIC VS. DYNAMIC QUERY PROCESSING
    DATA VS. QUERY SHIPPING
  • Slide 7
  • DISTRIBUTED XML QUERY PROCESSING ISSUES [7]
    DIFFERENT QUERY PROCESSING CAPABILITIES OF THE DATA SOURCES
    UNAVAILABILITY OF STATISTICAL INFORMATION ON THE DATA SOURCES
    UNRELIABLE RESPONSE TIMES
    DATA REDUNDANCY
    TIME TO LAST VS. TIME TO FIRST ELEMENT
  • Slide 8
  • POPULAR PERFORMANCE IMPROVEMENT TECHNIQUES FOR DISTRIBUTED XML QUERIES [6]
    SELECTIVITY: GIVE THE QUERY PLANNER THE ABILITY TO ESTIMATE SELECTIVITY.
    SELECTION PUSHDOWN: PERFORM SELECTIONS AS EARLY AS POSSIBLE IN THE QUERY TREE.
    INCREMENTAL UPDATES: THE MATERIALIZED VIEW IS UPDATED TO REFLECT THE CHANGES.
    VIEW QUERYING: QUERIES CAN BENEFIT FROM EXPLOITING EXISTING MATERIALIZED VIEWS.
    QUERY CONTAINMENT: FIND THE COMMON SUB-QUERIES AND EXECUTE THEM JUST ONCE.
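
Selection pushdown, as listed on this slide, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two XML fragments, the `year` attribute, the predicate, and the function names are hypothetical and do not come from the slides or from [6]. The sketch compares filtering at the coordinator with filtering at each site before shipping, and counts how many elements have to travel in each case.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments, one per site; the data is made up for illustration.
SITE_A = ("<books><book year='1999'><title>A1</title></book>"
          "<book year='2012'><title>A2</title></book></books>")
SITE_B = ("<books><book year='2015'><title>B1</title></book>"
          "<book year='2001'><title>B2</title></book></books>")


def is_recent(book):
    """Selection predicate: books published after 2010."""
    return int(book.get("year")) > 2010


def ship_then_select(fragments):
    """No pushdown: every <book> travels to the coordinator, which filters."""
    shipped = [b for xml in fragments for b in ET.fromstring(xml).iter("book")]
    titles = [b.findtext("title") for b in shipped if is_recent(b)]
    return titles, len(shipped)


def select_then_ship(fragments):
    """Pushdown: each site filters locally and ships only matching elements."""
    shipped = []
    for xml in fragments:
        shipped.extend(b for b in ET.fromstring(xml).iter("book") if is_recent(b))
    titles = [b.findtext("title") for b in shipped]
    return titles, len(shipped)
```

Both functions return the same titles; the second element of the returned pair is the number of elements shipped, which is where the two plans differ.
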
  • Slide 9
  • APPROACHES TAKEN BY OTHERS
    AN OPTIMIZING QUERY PROCESSING WITH AN EFFECTIVE CACHING MECHANISM FOR DISTRIBUTED DATABASE [5]
    EFFICIENTLY PROCESSING XML QUERIES OVER FRAGMENTED REPOSITORIES WITH PARTIX [8]
    A METHODOLOGY FOR QUERY PROCESSING OVER DISTRIBUTED XML DATABASES [4]
    SCALABLE AND DISTRIBUTED PROCESSING OF SCIENTIFIC XML DATA [3]
  • Slide 10
  • AN OPTIMIZING QUERY PROCESSING WITH AN EFFECTIVE CACHING MECHANISM FOR DISTRIBUTED DATABASE [5]
    A DATABASE OPTIMIZATION FRAMEWORK IS DESCRIBED.
    THE SQL STATEMENT CONTAINS ELEMENTS THAT ARE ACCEPTED BY AN XML-ORIENTED COMMON DATA.
    A HISTORICAL DATABASE AND QUERY-BASED CACHE REPLACEMENT HAVE BEEN USED.
    AN XML DATABASE SYSTEM IS SUITABLE FOR IMPLEMENTING DATA ANALYSIS APPLICATIONS.
    A COMMON OPTIMIZATION QUERY PROCESSING MODEL IS ALSO USED.
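
The idea of caching query results with a replacement policy can be sketched as follows. The slide does not say which replacement rule [5] uses, so this sketch substitutes plain least-recently-used (LRU) eviction keyed by the query text; the `QueryCache` class, its parameters, and the `execute` callback are hypothetical names introduced only for illustration.

```python
from collections import OrderedDict


class QueryCache:
    """Result cache keyed by query text, with LRU eviction (an assumed policy)."""

    def __init__(self, capacity, execute):
        self.capacity = capacity      # maximum number of cached results
        self.execute = execute        # callback that runs a query remotely
        self.entries = OrderedDict()  # query -> result, oldest first
        self.hits = 0
        self.misses = 0

    def run(self, query):
        if query in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(query)
            self.hits += 1
            return self.entries[query]
        # Cache miss: execute the query and remember its result.
        self.misses += 1
        result = self.execute(query)
        self.entries[query] = result
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return result
```

A caller would wrap whatever function actually sends the query to the remote sites, for example `QueryCache(100, send_to_sites)`, where `send_to_sites` is again a placeholder.
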
  • Slide 11
  • EFFICIENTLY PROCESSING XML QUERIES OVER FRAGMENTED REPOSITORIES WITH PARTIX [8]
    THE DATA VOLUME OF XML REPOSITORIES AND THE RESPONSE TIME OF QUERY PROCESSING HAVE BECOME CRITICAL ISSUES.
    THE TRADITIONAL FRAGMENTATION DEFINITIONS DO NOT DIRECTLY APPLY TO XML DOCUMENTS.
    THE FOCUS IS ON HIGH PERFORMANCE OF XML DATA SERVERS.
    PARTIX IS USED FOR THE EXPERIMENTS.
  • Slide 12
  • A METHODOLOGY FOR QUERY PROCESSING OVER DISTRIBUTED XML DATABASES [4]
    A METHODOLOGY FOR XQUERY QUERY PROCESSING OVER DISTRIBUTED XML DATABASES IS PRESENTED.
    THE TECHNIQUE CAN BE USED WITH XML DATABASES THAT ALLOW FRAGMENTATION AND WITH HOMOGENEOUS XML DATABASES.
    A MEDIATOR-BASED ARCHITECTURE WITH ADAPTERS ATTACHED TO REMOTE DATABASES IS PROPOSED.
    THREE TYPES OF FRAGMENTATION (HORIZONTAL, VERTICAL AND HYBRID) WERE USED IN SEVERAL EXPERIMENTS.
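
The three fragmentation types named on this slide can be illustrated with a small sketch. It is not the formal fragment definition used in [4]; the sample document, the `region` attribute, and all function names are hypothetical. Horizontal fragmentation groups whole elements by a predicate, vertical fragmentation keeps one child per element plus an id for rejoining, and a hybrid fragment applies the vertical split to one horizontal group.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the data is made up for illustration.
DOC = ("<store>"
       "<item id='1' region='east'><name>pen</name><price>2</price></item>"
       "<item id='2' region='west'><name>ink</name><price>5</price></item>"
       "</store>")


def horizontal_fragments(items, attr):
    """Group whole <item> elements by an attribute value (selection-based)."""
    fragments = {}
    for item in items:
        fragments.setdefault(item.get(attr), []).append(item)
    return fragments


def vertical_fragment(items, child_tag):
    """Keep one child per <item>, keyed by id for rejoining (projection-based)."""
    return {item.get("id"): item.findtext(child_tag) for item in items}


def rejoin_vertical(names, prices):
    """Rebuild (id, name, price) rows by joining two vertical fragments on id."""
    return [(i, names[i], prices[i]) for i in sorted(names)]


items = ET.fromstring(DOC).findall("item")
by_region = horizontal_fragments(items, "region")          # horizontal
names = vertical_fragment(items, "name")                   # vertical
prices = vertical_fragment(items, "price")                 # vertical
east_names = vertical_fragment(by_region["east"], "name")  # hybrid
```

Answering a query over vertical fragments requires a join on the shared id, as in `rejoin_vertical(names, prices)`, while a query over horizontal fragments only needs the union of the per-fragment answers.
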
  • Slide 13
  • SCALABLE AND DISTRIBUTED PROCESSING OF SCIENTIFIC XML DATA [3]
    BIG DATA TECHNIQUES ARE APPLIED TO XML METADATA INDEXING FOR DISTRIBUTED XML DATABASES.
    MAPREDUCE PROCESSING IS INCORPORATED.
    DATASET PROCESSING IS CRITICAL TO ENSURE EFFECTIVE USE.
    AN AUTOMATED PROCESS CAN BE HELPFUL.
    THE PAPER REPORTS PERFORMANCE RESULTS FOR TWO MAPREDUCE IMPLEMENTATIONS, APACHE HADOOP AND LEMO-MR.
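
The MapReduce pattern applied to XML indexing can be sketched in plain Python. This single-process sketch stands in for a real framework: it does not use the Apache Hadoop or LEMO-MR APIs, and the index layout (element tag mapped to the documents containing it) is an assumption chosen for illustration, not the index built in [3].

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def map_phase(doc_id, xml_text):
    """Map: emit one (tag, doc_id) pair per element in the document."""
    for elem in ET.fromstring(xml_text).iter():
        yield elem.tag, doc_id


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(tag, doc_ids):
    """Reduce: collapse duplicates into a sorted posting list for the tag."""
    return tag, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle and reduce over a dict of doc_id -> XML text."""
    pairs = (pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text))
    return dict(reduce_phase(tag, ids) for tag, ids in shuffle(pairs).items())
```

In a real deployment the map and reduce calls would run in parallel on different nodes; here they run sequentially so that the data flow is easy to follow.
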
  • Slide 14
  • RESEARCH SCOPE IN DISTRIBUTED XML QUERY PROCESSING PERFORMANCE
    STRUCTURED-NESS: HOW TO DETERMINE THE STRUCTURE AND THE INDEXES.
    SCHEMA HETEROGENEITY: HOW TO INTEGRATE HETEROGENEOUS SCHEMAS.
    RELATION DEFINITION: HOW TO DEFINE RELATIONS AND COMPARISONS BETWEEN XML ELEMENTS.
    DATA SOURCE PROCESSING POWER: HOW TO PLAN DISTRIBUTED QUERY PROCESSING.
    ANSWER QUALITY: HOW TO PRODUCE AND VERIFY THE BEST RESULT.
    ANSWERING SPEED: HOW TO KEEP DB STATISTICS AND IMPROVE OPERATIONS.
    DATA SOURCE AND USER QUANTITY: PARALLEL QUERY PROCESSING ALGORITHMS.
  • Slide 15
  • CONCLUSION
    XML IS A WIDELY ACCEPTED AND WIDELY USED FORMAT FOR STORING DATA.
    WITH THE LARGE AMOUNT OF DATA PRODUCED AT DIFFERENT LOCATIONS, A DISTRIBUTED XML DATABASE IS OFTEN USED.
    IT IS IMPORTANT TO MAINTAIN REASONABLE QUERY PROCESSING PERFORMANCE IN A DISTRIBUTED DATABASE.
    THE GOAL OF THE PAPER IS TO IDENTIFY THE RESEARCH SCOPE FOR IMPROVING DISTRIBUTED XML QUERY PROCESSING PERFORMANCE.
  • Slide 16
  • REFERENCES
    1. G. FIGUEIREDO, V. BRAGANHOLO, M. MATTOSO, "PROCESSING QUERIES OVER DISTRIBUTED XML DATABASES," JOURNAL OF INFORMATION AND DATA MANAGEMENT, 1(3):455-470, OCTOBER 2010.
    2. A. M. KULKARNI, J. THIRUNAVUKKARASU, P. S. PILLAI, S. S. SULEGAI, S. RAO, "INSERTION AND QUERYING MECHANISM FOR A DISTRIBUTED XML DATABASE SYSTEM," IN: PROCEEDINGS OF THE 5TH ACM COMPUTE
    3. E. DEDE, Z. FADIKA, C. GUPTA, M. GOVINDARAJU, "SCALABLE AND DISTRIBUTED PROCESSING OF SCIENTIFIC XML DATA," 2011 12TH IEEE/ACM INTERNATIONAL CONFERENCE ON GRID COMPUTING (GRID).
    4. G. FIGUEIREDO, V. BRAGANHOLO, M. MATTOSO, "A METHODOLOGY FOR QUERY PROCESSING OVER DISTRIBUTED XML DATABASES," PROGRAMA DE ENGENHARIA DE SISTEMAS E COMPUTAR IM/UFRJ, BRAZIL.
    5. S. PRABHA, A. KANNAN, P. A. KUMAR, "AN OPTIMIZING QUERY PROCESSING WITH AN EFFECTIVE CACHING MECHANISM FOR DISTRIBUTED DATABASE."
    6. D. KOSSMANN, "THE STATE OF THE ART IN DISTRIBUTED QUERY PROCESSING," ACM COMPUTING SURVEYS, VOL. 32, NO. 4, 2000, PP. 422-469.
    7. M. SMILJANIĆ, H. BLANKEN, M. VAN KEULEN, W. JONKER, "DISTRIBUTED XML DATABASE SYSTEMS."
    8. R. ANDRADE, G. RUBERG, A. BAIAO, V. BRAGANHOLO, M. MATTOSO, "PARTIX: PROCESSING XQUERY QUERIES OVER FRAGMENTED XML REPOSITORIES," TECHNICAL REPORT ES-691, DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING - COPPE/FEDERAL UNIVERSITY OF RIO DE JANEIRO, BRAZIL, DEPARTMENT OF APPLIED INFORMATICS - UNIRIO, BRAZIL, DEC. 2005.
    9. J. SMITH, P. WATSON, "FAULT-TOLERANCE IN DISTRIBUTED QUERY PROCESSING," IN: 9TH INTERNATIONAL DATABASE ENGINEERING AND APPLICATION SYMPOSIUM (IDEAS 2005), PAGES 329-338, JULY 2005.