Active-Active ecosystem at Janardh Bantupalli Senior Staff Database Engineer Sai Sundar Director, Database Engineering

Active-Active ecosystem at - nocoug.orgnocoug.org/.../NoCOUG_201805_Bantupalli_Sundar_Active_Active_Ecosystem.pdf · Active-Active ecosystem at Janardh Bantupalli Senior Staff Database

Active-Active ecosystem at

Janardh Bantupalli Senior Staff Database Engineer Sai Sundar Director, Database Engineering

❏ Oracle Active-Active Replication

❏ Incremental Capture and ETL: Data and Schema

❏ Reliability Infrastructure: Metrics and Monitoring

❏ Data Integrity Validation: Realtime Data Audit

❏ Q & A

Active-Active @ LinkedIn

Active-Active: Design

❏ Active-active configuration with DDL replication

❏ Columns: gg_modi_ts and gg_status; gg_priority in a few cases

❏ Before triggers to populate timestamp values

❏ Default LWW (last-write-wins) conflict resolution; priority resolution in a few cases
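
The conflict-resolution rules above can be modeled in a few lines. The following is a minimal Python sketch, not the OGG configuration used in the talk: the `RowVersion` type, the "local row wins on an exact tie" rule, and the "higher `gg_priority` wins" rule are assumptions made for illustration. Only the column names `gg_modi_ts` and `gg_priority` come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RowVersion:
    """One version of a row as seen by the apply process (hypothetical type)."""
    payload: dict
    gg_modi_ts: datetime   # last-modified timestamp, set by the before trigger
    gg_priority: int = 0   # only meaningful on tables that use priority resolution


def resolve_lww(local: RowVersion, incoming: RowVersion) -> RowVersion:
    """Last-write-wins: keep the version with the newer gg_modi_ts.

    On an exact timestamp tie the local row is kept (an assumed tie-break).
    """
    return incoming if incoming.gg_modi_ts > local.gg_modi_ts else local


def resolve_priority(local: RowVersion, incoming: RowVersion) -> RowVersion:
    """Priority resolution: the higher gg_priority wins (assumed direction).

    Equal priorities fall back to last-write-wins.
    """
    if incoming.gg_priority != local.gg_priority:
        return incoming if incoming.gg_priority > local.gg_priority else local
    return resolve_lww(local, incoming)
```

Because both data centers apply the same deterministic rule to the same pair of versions, they converge on the same winner regardless of the order in which the changes arrive.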

Active-Active: Features

❏ Non-overlapping primary keys across DCs

❏ Hard-delete to soft-delete conversion

❏ Row re-birth prevention, cleanup of soft deletes

❏ Update-to-insert conversion; full row capture

❏ Scaling using parallelism; deadlock mitigation
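
Two of the features above lend themselves to a short illustration. The sketch below is one plausible reading, not the authors' implementation: the interleaved key scheme, the `"D"` status value, and the helper names `dc_key_generator`, `soft_delete`, and `apply_insert` are all hypothetical. It shows keys that cannot collide across data centers, and a tombstone row that blocks a late insert from re-creating a deleted row.

```python
from itertools import count


def dc_key_generator(dc_index: int, dc_count: int, start: int = 1):
    """Yield start+dc_index, start+dc_index+dc_count, ... for one data center.

    Each DC uses the same stride (dc_count) with a different offset,
    so the generated key ranges never overlap.
    """
    if not 0 <= dc_index < dc_count:
        raise ValueError("dc_index must be in [0, dc_count)")
    return count(start + dc_index, dc_count)


def soft_delete(row: dict, now) -> dict:
    """Turn a hard delete into an update that marks the row as deleted."""
    return {**row, "gg_status": "D", "gg_modi_ts": now}  # "D" is an assumed marker


def apply_insert(table: dict, key, row: dict) -> bool:
    """Apply an insert unless a tombstone for the key is still present.

    Returning False models re-birth prevention: a late-arriving change
    must not resurrect a row that was already soft-deleted.
    """
    existing = table.get(key)
    if existing is not None and existing.get("gg_status") == "D":
        return False
    table[key] = row
    return True
```

Under this reading, tombstones can only be purged once no older change for the same key can still arrive, which is presumably why cleanup of soft deletes is listed as a separate concern.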

Active-Active: Features (continued)

❏ OGG process offloading

❏ Data encryption and compression

❏ Parallel apply and eventual consistency

❏ Foreign key, unique key, novalidate constraint handling

Incremental Data Pipeline

❏ OGG Big Data (Kafka) Adapter

❏ Pluggable Kafka client

❏ Table-level granularity for source data

❏ Kafka topic per schema; partitioning for parallelism
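
The "topic per schema, partitioning for parallelism" layout can be sketched as a routing function. This is an illustrative Python model only: the `ogg.<schema>` topic naming, the `route` helper, and the choice of CRC32 over table name plus primary key are assumptions, not details from the talk or from the OGG adapter.

```python
import zlib
from typing import Tuple


def route(schema: str, table: str, primary_key: str, num_partitions: int) -> Tuple[str, int]:
    """Map one change record to a (topic, partition) pair.

    One topic per source schema; the partition is derived from a stable
    hash of table name and primary key, so every change to a given row
    lands in the same partition.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    topic = f"ogg.{schema.lower()}"  # hypothetical naming convention
    key = f"{table}:{primary_key}".encode("utf-8")
    # zlib.crc32 is stable across processes, unlike Python's built-in hash() for str.
    return topic, zlib.crc32(key) % num_partitions
```

Kafka guarantees ordering only within a partition, so keying on the row identity keeps per-row change order intact while different rows are consumed in parallel.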

Data and Schema Propagation

❏ Generic payload schema; data and schema to different topics

❏ Independent framework to propagate schema changes (DDLs, metadata)

❏ Integration with release framework; API for on-demand invocation
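
A generic payload with separate data and schema topics might look like the sketch below. The field names, the `"I"`/`"U"`/`"D"` operation codes, the JSON encoding, and the `<schema>.data` / `<schema>.schema` topic names are assumptions chosen for illustration; the slide does not specify the actual payload format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class ChangeEvent:
    """A row-level change; the same envelope is used for every table."""
    schema: str
    table: str
    op: str                   # assumed codes: "I", "U" or "D"
    columns: Dict[str, Any]   # full row image as column name -> value
    kind: str = field(default="data", init=False)


@dataclass
class SchemaEvent:
    """A schema change (DDL or metadata), published separately from row data."""
    schema: str
    table: str
    ddl: str
    kind: str = field(default="schema", init=False)


def to_message(event) -> Tuple[str, bytes]:
    """Serialize an event and pick its topic from the event kind."""
    topic = f"{event.schema.lower()}.{event.kind}"  # hypothetical topic names
    body = json.dumps(asdict(event), sort_keys=True).encode("utf-8")
    return topic, body
```

Keeping schema events on their own topic lets downstream ETL consumers apply a DDL change on their own schedule without having to parse it out of the row-data stream.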

Big Picture: Data Pipeline

Reliability Infrastructure

❏ Robust tooling within active-active, data pipeline components

❏ Custom: Realtime Audit & APIs, OGG Monitor, Schema API

❏ LinkedIn Stack: CFEngine, inGraphs, Auto Alerts, Iris

❏ Redundancy (HA) in data audit and ETL pipelines; integration with schema deployment framework

❏ Proactive monitoring and detection of problematic load patterns; auto-recovery in most cases

Realtime Data Audit

❏ Generic framework for data quality in distributed data environments

❏ Near-realtime data validation and discrepancy detection across data centers (data sources)

❏ Incremental data input from sources, as a full or partial data set; audit based on last-modified timestamp

❏ Provides refined discrepancy analysis if the full data set is available
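
A timestamp-based audit across two data centers can be sketched as follows. This is a simplified Python model under stated assumptions: each source is a dict from primary key to a `(gg_modi_ts, row_hash)` pair, timestamps are plain integers, and the `settled_before` cutoff (rows modified at or after it are skipped as possibly still in flight) is a hypothetical parameter, not something the slides define.

```python
def audit(source_a: dict, source_b: dict, settled_before: int) -> dict:
    """Compare two data sources row by row and report discrepancies.

    source_a / source_b map primary key -> (gg_modi_ts, row_hash).
    Rows whose newest modification is at or after settled_before are
    skipped, since replication may not have delivered them yet.
    """
    report = {"missing_in_a": [], "missing_in_b": [], "mismatch": []}
    for key in sorted(source_a.keys() | source_b.keys()):
        a, b = source_a.get(key), source_b.get(key)
        newest = max(v[0] for v in (a, b) if v is not None)
        if newest >= settled_before:
            continue  # within the replication-lag window; check on a later pass
        if a is None:
            report["missing_in_a"].append(key)
        elif b is None:
            report["missing_in_b"].append(key)
        elif a != b:
            report["mismatch"].append(key)
    return report
```

With only a partial data set, a key that is absent on one side may simply not have been part of the input, which is one reason a full data set allows a more refined analysis.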

Data Processor

Audit Processor

Audit Framework: Features

➔ Deduplication and checkpointing; recovery from failures and replay

➔ Data bucketing window customization to allow more deduplication

➔ Synchronization with replication and source data watermarks
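
The interplay of bucketing, deduplication, checkpointing, and watermarks can be sketched as below. This is an illustrative Python model, not the framework's code: the `BucketedDeduper` class, integer timestamps in seconds, and the rule that a bucket is flushed once the watermark passes its end are assumptions.

```python
class BucketedDeduper:
    """Group events into fixed time buckets and keep the latest event per key."""

    def __init__(self, window_seconds: int):
        self.window = window_seconds
        self.buckets = {}        # bucket start -> {key: (ts, event)}
        self.checkpoint = None   # start of the newest bucket already flushed

    def add(self, key, ts: int, event) -> None:
        bucket = ts - ts % self.window
        if self.checkpoint is not None and bucket <= self.checkpoint:
            return  # replayed event from a bucket that was already flushed
        slot = self.buckets.setdefault(bucket, {})
        current = slot.get(key)
        if current is None or ts >= current[0]:
            slot[key] = (ts, event)  # newer change to the same key replaces the older one

    def flush(self, watermark: int) -> list:
        """Emit every bucket that ends at or before the watermark, oldest first."""
        ready = sorted(b for b in self.buckets if b + self.window <= watermark)
        out = []
        for b in ready:
            rows = self.buckets.pop(b)
            out.extend((k, ts, ev) for k, (ts, ev) in sorted(rows.items()))
            self.checkpoint = b  # after a restart, replay resumes past this bucket
        return out
```

A wider window collapses more changes to the same key into a single audited event, at the cost of detecting discrepancies later; that trade-off is what a customizable bucketing window exposes.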

Active-Active Ecosystem

OGG-Oracle for multi-datacenter active-active

OGG Big Data Adapter for Incremental Data and ETL

Realtime Data Audit Framework for Data Quality

Schema and Metadata Propagation for ETL

Thank You!

https://www.linkedin.com/in/janaonline

https://www.linkedin.com/in/ssaisundar