21
·Click to add text © 2011 IBM Corporation Integrated Data Management Managing Large Data Sets for Smart Meters Jacques Roy, IBM [email protected]

Ugif 12 2011-smart meters-11102011

  • Upload
    ugif

  • View
    566

  • Download
    3

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: Ugif 12 2011-smart meters-11102011

·Click to add text

© 2011 IBM Corporation

Integrated Data Management

Managing Large Data Sets for Smart Meters

Jacques Roy, IBM [email protected]

Page 2: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

2

Overview

• How much data do smart meters generate?

• Introducing Informix TimeSeries benefits

• Some technical details

• Proof points

• Solution partners

• Demo

Page 3: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

3

Utility-Scale Smart Meter Deployments, Plans &

Proposals* September 2010

© 2010 The Institute for Electric Efficiency

*This map represents smart meter deployments,

planned deployments, and proposals by investor-

owned utilities and large public power utilities.

http://www.edisonfoundation.net/IEE/

Deployment for

>50% of end-users

Deployment for

<50% of end-users

Page 4: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

4

Example of Changing Storage Requirements

Monthly Meter

Reads

Daily Meter

Reads 15 Minute Meter

Reads

350.4B

3.65B

120M

Changing Workloads For 10 Million Smart Meters:

Now – Each meter read once per month

Very soon – Each meter read once every 15 minutes (2920X)

Regulations – Need to keep data on line for 3 years (PUC) and, perhaps, save for 7 years

# of Records

Per Year for

10M meters

Data for a Utility In California

Frequency

of reads

Page 5: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

5

Keeping up with Smart Meter Data

Large amounts of data causes problems in 2 areas:

A) Storage management

• Will get expensive and cumbersome to maintain

• Query performance

• Compliance Reports must be completed before the end of each day

• Customer portal queries must be handled in a timely manner

• Customer billing

Page 6: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

6

Introducing IBM Informix Relational Database

• Why IBM Informix?

• Low cost/Low administration

• Manage thousands of servers with one database administrator

• Many installations have no DBA’s

• High performance

• Insert 100’s of thousands of records per second

• Analytic functions not available in other database products

• High Availability

• Scale up, scale out, and disaster recovery

• Native time series data support

• Load and analyze time series data

“The idea was to run the project in three cycles, and use a different type of analysis in each cycle, to see if we

could draw any conclusions in terms of how to promote change in the way people view their energy

consumption. With help from the IBM Hursley team, we quickly found that Informix TimeSeries could deliver

spectacular results.” Clive Eisen, Chief Technology Officer at Hildebrand.

Page 7: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

7

Key Strengths of Informix TimeSeries

• Performance

• Extremely fast data access

• Data clustered and sorted by time on disk to reduce I/O

• Handles operations hard or impossible to do in standard SQL

• Continuous Real-Time and Batch Data Loaders

• Space Savings

• Typically saves 50% space over standard relational layout

• Toolkit approach

• Allows users to develop their own algorithms to run in the database

• Algorithms running in the database leverage the buffer pool for speed

• Easier

• Conceptually closer to how users think of time series

Page 8: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

8

1 1-1-11 12:00 Value 1 Value 2 …….. Value N

2 1-1-11 12:00 Value 1 Value 2 …….. Value N

3 1-1-11 12:00 Value 1 Value 2 …….. Value N

… … … … …….. …

1 1-1-11 12:15 Value 1 Value 2 …….. Value N

2 1-1-11 12:15 Value 1 Value 2 …….. Value N

3 1-1-11 12:15 Value 1 Value 2 …….. Value N

… … … … …….. …

Typical Relational Schema for Smart Meters Data

Smart_Meters Table

Index

•Each row contains exactly one record = billions of rows

•Additional indexes are required for efficient lookups

•Data is appended to the end of the table as it arrives

•Meter Id’s stored in every record

•No concept of a missing row

Ta

ble

Gro

ws

KWH Voltage ColN Time meter_id

Page 9: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

9

Same Table using an Informix TimeSeries Schema

1 [(1-1-11 12:00, value 1, value 2, …, value N), (1-1-11 12:15, value 1, value 2, …, value N), …]

2 [(1-1-11 12:00, value 1, value 2, …, value N), (1-1-11 12:15, value 1, value 2, …, value N), …]

3 [(1-1-11 12:00, value 1, value 2, …, value N), (1-1-11 12:15, value 1, value 2, …, value N), …]

4 [(1-1-11 12:00, value 1, value 2, …, value N), (1-1-11 12:15, value 1, value 2, …, value N), …]

… …

•Each row contains a growing set of records = one row per meter

•Data append to a row rather than to the end of the table

•Meter Ids not stored in individual records

•Data is clustered by meter id and sorted by time on disk

•Missing values take no disk space, missing interval reads take 2 bytes

Smart_Meters Table

Table grows

meter_id Series

Page 10: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

10

Informix TimeSeries: Key Concepts

• DBSpace

• A logical unit of storage used by the database server

• Made up of one or more physical storage units called chunks

• Containers

• Specialized storage for TimeSeries in dbspaces

• TimeSeries data element: row type

• Flexibility to define as many parts as needed

• TimeSeries types: regular, irregular

• Covers regular intervals and sparse data distribution

• Calendar

• Defines business patterns

• Relational view: VTI and processing return type

Page 11: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

11

Informix TimeSeries: Space savings

• Example of simple meter record: {loc_id, timestamp, register_1, register_2, register_3}

8 bytes 12 bytes 4 bytes 4 bytes 4bytes

• TimeSeries savings: no duplication of loc_id, timestamp

• Record size: 12 bytes instead of 32 bytes

• Impact: 10M meters, 3 years, 15-minute interval

• Savings: ~20TB

20 bytes * 10M meters * 96 intervals/day * 3 years * 365 days/year

• Index on meter records:

• Relational: 1 trillion entries on loc_id, timestamp

• TimeSeries: 10M entries on loc_id

Page 12: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

12

Informix TimeSeries: I/O advantages

• Less data to read, less index entries to navigate

• Meter data grouped on pages

• Reading 1 day of data → One I/O (+ index I/O)

• Relational: no grouping guarantees, could be 96 I/Os (+ index I/O)

Page 13: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

13

Informix TimeSeries: Processing advantages

• TimeSeries data is accessed in time order

• Relational model has no order guarantees – Sorting required

• Library of SQL functions to process TimeSeries

• 100+ functions

• Ex: Aggregate data from 15-min interval to daily interval

Calculate a moving average

• Ability to write custom functions

• Write business-specific processing

• Streamline the processing instead of generic functions

• Can greatly reduce code length → Significant performance

improvements

Page 14: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

14

Informix TimeSeries vs Relational Database Systems Comparison of 1M Meter POC at Large US Energy Provider

Disk

Storage

Space

Relational

Database

IBM Informix

TimeSeries

Relational Database System

Processing

Time

(Hours)

Data Loading Reports

IBM Informix TimeSeries

Based on actual benchmark test run for a large electrical

utility company

1M Meter @ 15min POC Informix TimeSeries

Competitor's Relational Database System

Improvement

Load Time 18 Minutes 7 Hours 23x faster

Report Generate Time Seconds to 11 min 2-7 HOURS 38x faster

Storage Space 350GB 1.3 TB <1/3 storage

Page 15: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

15

Informix TimeSeries vs Relational Database Systems Comparison of Published Benchmarks for Meter Data Management

Meters Daily Reads

Total Cores

Total RAM

DB cores

App cores

DB RAM

App RAM

100M 4.9B 16 500 16 (shared) 500 (shared)

5.5M 1.06B 456 3668 96 360 768 2900

Daily Readings (meters * registers * intervals)

1,061,500,000

4,900,000,000

The Competition

Informix TimeSeries

Informix TimeSeries

The Competition *

Database Resources (CPU cores)

96

16

The Competition

Informix TimeSeries

5 times the performance < 1/5 the resources

… with significantly simpler management using a single node system

* Based on published Oracle benchmark

http://www.oracle.com/us/industries/utilities/performance-benchmark-exadata-wp-161572.pdf

Page 16: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

16

Informix TimeSeries – Success with leading MDM providers

. “We feel that the Informix

TimeSeries technology lends

itself extremely well to the

performance characteristics, data

types and high-volume

processing required for the next

generation of smart metering and

smart grid applications.”

- David Hubbard, CTO and

co-founder of Ecologic Analytics

MDM offers scalability to 100 million smart meters

“Performance testing results of

AMT-SYBEX’s Affinity Meterflow

meter data management (MDM)

application using IBM Informix

TimeSeries software has

demonstrated the capability to

offer linear scalability up to 100

million meters to load and

process meter data at 30-minute

intervals in less than 8 hours.”

http://www.metering.com/node/20062 http://smart-grid...

Page 17: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

17

Client Success: Hildebrand

What’s Smart?

• Energy management thru real time monitoring • Ability to monitor scales to over 3 million homes • Reduced energy

Business Benefits

• Helping people make better decisions about energy efficiency in the home.

• Collect, store and analyse up to • 50,000 data points per second • Delivers high performance on low-cost hardware by leveraging

Informix time series data management technologies.

Hildebrand is a technology consultant company on the Digital Environment Home Energy

Management System (DEHEMS) project. The Hildebrand team was asked by the UK government to find

a way to scale up its energy monitoring solution and enable it to monitor three million homes.

Page 18: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

18

Hildebrand: 3 Million Meters

• Test involved 3,000,000 Homes

• Data generated every 6 seconds!

• Up to 26 values recorded

• Meter ID

• Timestamp

• 3 Electricity phases

• 1 Gas reading

• 20 Individual electrical sockets in the house

• Data collected every 6 seconds

• Aggregated to per-minute readings

• Minute readings bulk-loaded into Informix every 10 minutes

• Average load was about 50,000 inserts per second

• Hardware/Software Used

• Intel with 8 cores running:

• 64 bit SUSE Linux v10

• 16 GB of memory

• Informix workgroup edition

• Hildebrand software

Page 19: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

19

Consumer Education

Page 20: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

20

AMT/SYBEX Proving the concept for smart metering data management

• The Need: • Extend AMT/Sybex’s Data Transfer Solution to:

• Load, validate, store, and provide smart meter interval and event data to external system

• The Challenge: • Process the enormous volume of data in a timely

manner • Not to require a huge investment in new hardware

• The Solution: • Combined effort with IBM Research and IBM Informix • Informix TimeSeries provides the core capabilities to

store and process the meter and event data: • VEE • Real-time energy monitoring • Analytics for developing new tariff rates • Help to smooth peaks in demand

“We owe a great deal to

IBM, both for the Informix

technology itself, and for the

fantastic support from all

levels of the organisation.” — Gordon Brown, DTS Product

Owner, AMT-SYBEX

Solution components:

IBM® Informix® TimeSeries™

IBM Research

Page 21: Ugif 12 2011-smart meters-11102011

© 2011 IBM Corporation

Integrated Data Management

21

AMT/Sybex Tests and Performance Measurements

The main tests for AMT-SYBEX were to prove key business functions at high

volume: • Can high volumes of data be loaded into Smart DTS?

• Can data be analyzed and processed in Smart DTS in a timely manner?

To facilitate these requirements, the following tests were carried out: • These results are for 10 million meters storing data every ½ hour data for one day.

Module Time Readings/intervals per second

Technical Validation and Transformation

13 minutes >600,000

High Speed TimeSeries database load (with full logging)

50 minutes >150,000

Validation and Estimation 33 minutes >240,000

Full processing end to end for 10 million meter points with half-hourly interval

data on a single mid-range P-series 8 CPU server is just over 1 ½ hours