© 2010 IBM Corporation December 14, 2011 TimeSeries Technical Presentation Jacques Roy




Agenda

■ Review from last April

■ New features in xC3

■ New features in xC4

■ Demo: IBM Informix TimeSeries Plug-in for Data Studio


IBM Information Management

© 2007 IBM Corporation Informix Dynamic Server, TimeSeries DataBlade Module class 3

Who’s Interested in TimeSeries

Energy: smart meters

Capital Markets

– Arbitrage opportunities, breakout signals, risk/return optimization,

portfolio management, VaR calculations, simulations, backtesting...

Telecommunications:

– Network monitoring, load prediction, blocked calls (lost revenue)

from load, phone usage, fraud detection and analysis...

Manufacturing:

– Machinery going out of spec; process sampling and analysis

Logistics:

– Location of a fleet (e.g. GPS); route analysis

Scientific research:

– Temperature over time...


TimeSeries: Key Concepts

■ Containers – Specialized storage for TimeSeries

    EXECUTE PROCEDURE TSContainerCreate('raw_container', 'rootdbs',
        'meter_data', 100, 50);

■ TimeSeries data element: row type – Flexibility to define as many parts as needed

    CREATE ROW TYPE meter_data (
        tstamp datetime year to fraction(5),
        value  decimal(14,3)
    );

■ Timeseries types: regular, irregular – Covers regular intervals and sparse data distribution

■ Calendar – Defines business patterns


Features Unique to Regular TimeSeries

Only one element per “on” interval

Value "persists" to end of interval

An element for an “on” interval may be missing; the entire element will then be NULL

Calendar determines offset in TimeSeries of given time point

Elements can be accessed by offset or time point

Time point not stored; calculated from header + date/time arithmetic

Precision to second


Features Unique to Irregular TimeSeries

Data can be entered at any time point within a valid "on" interval

Elements persist until the next element

No NULL elements

Elements can only be accessed by time

No duplicate time points allowed

If an element already exists at the given time point, either an error is raised or a unique time point is found:

– round time point up to nearest second

– search back for first element

– add 10 microseconds, this is new time point
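The three steps above can be read in more than one way. The following Python sketch shows one possible reading and is not the server's actual implementation. It assumes that "round up" means moving to the start of the next whole second and that "search back" finds the latest element already stored before that boundary. The function name `unique_time_point` is hypothetical.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

# 10 microseconds is the smallest step of "datetime year to fraction(5)".
STEP = timedelta(microseconds=10)

def unique_time_point(existing, t):
    """Return a time point that does not collide with `existing`.

    `existing` is a sorted list of datetimes already stored in the series.
    Assumed reading of the slide, not the documented algorithm.
    """
    if t not in existing:
        return t
    # 1. Round the time point up to the next whole second.
    next_second = t.replace(microsecond=0) + timedelta(seconds=1)
    # 2. Search back from that boundary for the latest stored element.
    latest = existing[bisect_left(existing, next_second) - 1]
    # 3. Add 10 microseconds: this is the new time point.
    return latest + STEP
```

With elements already stored at 12:00:00.00000 and 12:00:00.00001, inserting again at 12:00:00.00000 would land on 12:00:00.00002 under this reading.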


Accessing Timeseries

Access through standard tabular view

– Makes TimeSeries look like a standard relational table

SQL Functions

– 103 functions

Customized functions

– Written in Stored Procedure Language (SPL), “C”, Java

– 65 “C” functions


TimeSeries Header

A TimeSeries needs information that sets its context:

– Calendar: Time period where data is found

– Origin: Time origin of the TimeSeries

– Threshold: in-row storage threshold

– Container: where to store the out-of-row data

– Metadata: optional data added by the TimeSeries creator


Calendar and Calendar Patterns

A calendar pattern is needed before we can create a calendar:

    INSERT INTO CalendarPatterns
    VALUES('day', '{1 on, 2 off, 4 on}, day');

A Calendar defines a set of valid times at which the TimeSeries can record data. (July 8, 2011 is a Friday)

    INSERT INTO CalendarTable(c_name, c_calendar)
    VALUES('ts_1day', 'startdate(2011-07-08 00:00:00.00000),
        pattstart(2011-07-08 00:00:00.00000),
        pattname(day)');

You can provide a pattern explicitly:

    INSERT INTO CalendarTable(c_name, c_calendar)
    VALUES('ts_1week', 'startdate(2011-07-08 00:00:00.00000),
        pattstart(2011-07-08 00:00:00.00000),
        pattern({1 on, 2 off, 4 on}, day)');
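To make the pattern concrete, here is a small Python sketch of how `{1 on, 2 off, 4 on}, day` could be expanded into valid days, and how the offset of a day in a regular TimeSeries follows from it. It only illustrates the concept: the helper names are hypothetical, and it assumes the calendar start and the pattern start coincide, as they do in the statements above.

```python
from datetime import date, timedelta
from itertools import cycle

# {1 on, 2 off, 4 on}, day  ->  (count, is_on) pairs
PATTERN = [(1, True), (2, False), (4, True)]

def expand_pattern(pattern):
    """One boolean flag per interval of a full pattern cycle."""
    return [on for count, on in pattern for _ in range(count)]

def valid_days(pattstart, pattern, n):
    """First n 'on' days, counting from pattstart."""
    days, day = [], pattstart
    for on in cycle(expand_pattern(pattern)):
        if len(days) == n:
            break
        if on:
            days.append(day)
        day += timedelta(days=1)
    return days

def offset_of(pattstart, pattern, target):
    """Offset of `target` = number of 'on' days before it (None if 'off')."""
    flags = expand_pattern(pattern)
    delta = (target - pattstart).days
    if delta < 0 or not flags[delta % len(flags)]:
        return None
    return sum(flags[i % len(flags)] for i in range(delta))
```

Starting on Friday 2011-07-08, the pattern yields Friday, skips the weekend, then yields Monday through Thursday, so only business days carry elements.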


TimeSeries: Table

A TimeSeries resides in a table:

CREATE TABLE ts_data (

loc_esi_id char(20) NOT NULL,

measure_unit varchar(10) NOT NULL,

direction char(1) NOT NULL,

multiplier TimeSeries(meter_data),

raw_reads timeseries(meter_data),

PRIMARY KEY(loc_esi_id, measure_unit, direction)

) LOCK MODE ROW;


Populating a TimeSeries

A TimeSeries must first be created:

    INSERT INTO daily_max
    VALUES("100700411",
        TSCreate('ts_1day', '2011-07-08 00:00:00.00000',
                 20, 0, 0, 'container1')
    );

It can be created through the input function:

    INSERT INTO failure_event
    VALUES("10110475",
        'irregular, container(container2),
        origin(2011-01-01 00:00:00.00000),
        calendar(cal1sec),
        [(4.48, . . .)@2011-04-03 06:30:03.00003,
        (4.50,. . .)@2011-07-03 06:30:03.00119,
        . . .]'
    );


The Virtual Table Interface

Makes a TimeSeries look like a table:

    EXECUTE PROCEDURE
    TSCreateVirtualTab('ts_data_v', 'ts_data',
        'origin(2010-11-10 00:00:00.00000),
        calendar(ts_15min),container(raw_container),
        threshold(0), regular',
        0, 'raw_reads');

Virtual table created:

    CREATE TABLE ts_data_v (
        loc_esi_id   char(20),
        measure_unit varchar(10,0),
        direction    char(1),
        tstamp       datetime year to fraction(5),
        value        decimal(14,3)
    );


Quick Review

A TimeSeries resides in a container

– The container resides in a dbspace

– The container is for a specific element type (row type)

– A container is for either a regular or irregular TimeSeries (not both)

– A container can contain multiple TimeSeries

A TimeSeries requires a calendar

– Defines when the data starts, defines a pattern of valid values

TimeSeries data is defined as a row type

– Defines the values tracked

You can operate on TimeSeries through special SQL functions or

use the virtual table interface and standard SQL


New Features in 11.70.xC3

New platforms supported

Stores demo

18-character limit removal

Pre-defined calendars

Container statistics

XML generation

Delete performance

Auto-create container

VTI expression pushdown


New platforms supported

TimeSeries is part of the Informix server as of 11.70.xC3

New platforms that include TimeSeries:

– Red Hat Linux 32- and 64-bit environment

– Mac OS 10

– Linux on pSeries

– Solaris x86_64

See:

https://www-304.ibm.com/support/docview.wss?rs=630&uid=swg27013343&S_CMP=rnav


Stores Demo

Script $INFORMIXDIR/bin/dbaccessdemo

– Adds smart meter information to customers

– Can be disabled using -nots

Includes calendars:

– ts_1min, ts_15min, ts_30min, ts_1hour, ts_1day, ts_1week, ts_1month

Row types:

– meter_data, meter_data2

Tables:

– ts_data: TimeSeries information

– customer_ts_data: Ties the customer table to ts_data

– ts_data_location: provides geo location for the meters

All data is fictitious; the locations are real but have no ties to the data


18-Character Limit Removal

Container name limit raised to 128 characters

Subtype name raised to 128 characters

Longer container name supported in string representation

Tables affected:

– TSInstanceTable

– TimeSeries instance information

– TSContainerTable

– Container information table

SetContainerName(ts, container_name)

– Second argument increased size


Pre-Defined Calendars

Goal: Provide common calendars

Calendars added:

– ts_1min

– ts_15min

– ts_30min

– ts_1hour

– ts_1day

– ts_1week (starts on Sunday)

– ts_1month


Container Statistics Functions

All functions take a container name or NULL

– If NULL, cover all containers

TSContainerTotalPages

– Total number of pages allocated to a container

TSContainerTotalUsed

– Number of pages containing TimeSeries data

TSContainerNElems

– Total number of elements stored in a container

TSContainerPctUsed

– (used * 100) / total

TSContainerUsage

– Returns used, nelems, and total


Container Statistics Functions

EXECUTE FUNCTION TSContainerTotalUsed(NULL);

pages

2029

EXECUTE FUNCTION TSContainerTotalPages(NULL);

total

2169

EXECUTE FUNCTION TSContainerNElems(NULL);

elements

241907

EXECUTE FUNCTION TSContainerPctUsed(NULL);

percent

93.545
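As a quick cross-check, the percentage above can be reproduced from the two page counts with the formula (used * 100) / total. The Python helper name `pct_used` is only illustrative.

```python
def pct_used(used_pages, total_pages):
    """(used * 100) / total, as described for TSContainerPctUsed."""
    return used_pages * 100 / total_pages

# Values returned by TSContainerTotalUsed and TSContainerTotalPages above.
print(round(pct_used(2029, 2169), 3))  # 93.545
```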


Container Statistics Functions

EXECUTE FUNCTION TSContainerUsage(NULL);

pages slots total

2029 241907 2169


XML Generation

Ability to Generate XML from a TimeSeries

Function syntax:

    CREATE FUNCTION TSToXML(
        doctype    lvarchar,
        id         lvarchar,
        ts         TimeSeries,
        output_max integer DEFAULT 0)
    RETURNS lvarchar;


XML Generation Parameters description

doctype: Name of the topmost XML element

id: Identification for the TimeSeries

ts: TimeSeries to process

output_max: Max output size; the default of 0 represents 32KB, with a minimum of 4096 and a maximum of 2GB

Output additional element: <AllData>

<AllData>1</AllData>: All data was returned

<AllData>0</AllData>: The TS data was truncated to fit the output max. Still a valid XML document
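The truncation behaviour described above can be imitated in a few lines of Python. This sketch does not reproduce TSToXML itself. It only illustrates how a size limit and an `<AllData>` flag can coexist with a well-formed document. The function names, the character-based size accounting, and the clamping of `output_max` to the stated limits are assumptions.

```python
import xml.etree.ElementTree as ET

DEFAULT_MAX = 32 * 1024      # output_max = 0 represents 32KB
MIN_MAX = 4096               # stated minimum
MAX_MAX = 2 * 1024 ** 3      # stated maximum (2GB)

def ts_to_xml(doctype, ts_id, elements, output_max=0):
    """Build a size-limited XML document from (tstamp, value) pairs."""
    if output_max == 0:
        limit = DEFAULT_MAX
    else:
        limit = min(max(output_max, MIN_MAX), MAX_MAX)
    head = f"<{doctype}><id>{ts_id}</id>"
    tail = f"</{doctype}>"
    rows = [f"<meter_data><tstamp>{t}</tstamp><value>{v}</value></meter_data>"
            for t, v in elements]
    # Reserve room for the wrapper and the flag before adding elements.
    budget = limit - len(head) - len(tail) - len("<AllData>0</AllData>")
    kept = []
    for row in rows:
        if len(row) > budget:
            break                 # stop at a whole-element boundary
        kept.append(row)
        budget -= len(row)
    all_data = 1 if len(kept) == len(rows) else 0
    return f"{head}<AllData>{all_data}</AllData>{''.join(kept)}{tail}"

def is_complete(document):
    """True when the document reports that no data was truncated."""
    return ET.fromstring(document).findtext("AllData") == "1"
```

Because truncation happens between whole elements, the shortened document still parses, which matches the "still a valid XML document" remark.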


XML Generation Example

SELECT TSToXML('meterdata', esi_id,

Clip(actual, '2010-09-08 12:00:00'::datetime year to second,

'2010-09-08 13:00:00'::datetime year to second

) )

FROM meters

WHERE esi_id = '22505613346851835';

<meterdata>

<id>2250561334</id>

<AllData>1</AllData>

<meter_data>

<tstamp>2010-09-08 12:15:00.00000</tstamp>

<value>0.9170000000</value>

</meter_data>

<meter_data>

. . .

</meterdata>


Delete Performance

Deleting a row that includes a TimeSeries

Same code used in DelTrim()

Tests deleting all elements of TimeSeries of different sizes:

# of elements   Previous TS Version   New TS Version   Improvement
10,000          0m0.409s              0m0.159s         2.5X
30,000          0m0.630s              0m0.315s         2X
100,000         0m2.894s              0m1.132s         2.5X
300,000         0m15.988s             0m5.159s         3X
1,000,000       5m45.837s             0m18.330s        18X
3,000,000       146m52.861s           1m26.456s        101X
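The improvement factors are the ratio of the two timings. A short Python sketch, with hypothetical helper names, shows the arithmetic for timings written in the `XmY.YYYs` form used above:

```python
import re

def to_seconds(timing):
    """Convert a timing such as '146m52.861s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", timing)
    if match is None:
        raise ValueError(f"unexpected timing format: {timing!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

def speedup(previous, new):
    """How many times faster the new version is."""
    return to_seconds(previous) / to_seconds(new)

# Largest test case: 8812.861 s versus 86.456 s.
print(int(speedup("146m52.861s", "1m26.456s")))  # 101
```

The listed factors appear to be rounded down, which is why the one-million-element case is shown as 18X although the ratio is closer to 19.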


Auto-Create Container Feature

Automatic creation of containers

Container pools

Round-robin insertion into containers


Automatic Creation of Containers

Feature: Find or create a container to use

Benefit: Avoid errors in insert if the container name is not set

When

– a new TimeSeries is inserted

– No “autopool” container available

What

– New container(s) created

– Container names: autopoolnnnnnnnn

– Added to the default pool: autopool

Where

– Each dbspace used by the target table


Container Pools

New column (pool) in TSContainerTable

autopool: default pool of containers created by auto-create container

Create containers (as before)

    TSContainerCreate(<container_name>, <dbspace>, <row_type>,
        <initial_size_kb>, <next_size_kb>);

Add a container to a pool

    TSContainerSetPool(<container_name>, <pool_name>)

– Pool created automatically

– A container can be in only one pool at a time

Remove a container from its current pool

    TSContainerSetPool(<container_name>)


Container Selection

Search autopool for a set of candidate containers

Select the container with the greatest amount of free space
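A minimal Python sketch of this selection rule follows. The container names and page counts are invented for illustration, and free space is taken to be total pages minus used pages, which is an assumption about how the server measures it.

```python
def select_container(candidates):
    """Pick the candidate with the most free pages.

    `candidates` maps container name -> (total_pages, used_pages).
    Returns None when there is no candidate.
    """
    if not candidates:
        return None
    return max(candidates,
               key=lambda name: candidates[name][0] - candidates[name][1])

# Hypothetical autopool containers.
pool = {
    "autopool00000001": (2169, 2029),   # 140 pages free
    "autopool00000002": (2169, 1500),   # 669 pages free
}
print(select_container(pool))  # autopool00000002
```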


Pool Policy: TSContainerPoolRoundRobin

Select a container from a pool in a round-robin manner and return the name

Example*:

    INSERT INTO smartmeters(meter_id,rawreadings)
    VALUES('met00001','origin(2006-01-01 00:00:00.00000),
        calendar(smartmeter),regular,threshold(0),
        container(TSContainerPoolRoundRobin(readings)),
        [(33070,-13.00,100.00,9.98e+34),
        (19347,-4.00,100.00,1.007e+35),
        (17782,-18.00,100.00,9.83e+34)]');

Position kept in memory

* Note: the call to TSContainerPoolRoundRobin is written inside the string, not as a separate SQL function call
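The round-robin policy with its position kept in memory can be pictured with a small Python class. This is a conceptual sketch only: the class and container names are hypothetical and it says nothing about how the server stores the position.

```python
class RoundRobinPool:
    """Hands out container names from a pool in turn."""

    def __init__(self, containers):
        if not containers:
            raise ValueError("a pool needs at least one container")
        self._containers = list(containers)
        self._position = 0          # kept in memory only

    def next_container(self):
        name = self._containers[self._position]
        # Advance and wrap around at the end of the pool.
        self._position = (self._position + 1) % len(self._containers)
        return name

readings = RoundRobinPool(["cont1", "cont2", "cont3"])
print([readings.next_container() for _ in range(4)])
# ['cont1', 'cont2', 'cont3', 'cont1']
```

Because the position lives in memory, a restart would begin again at the first container under this model.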


Pool Policy: TSContainerPoolRoundRobin

function TSContainerPoolRoundRobin(

tabname lvarchar,

colname lvarchar,

subtype lvarchar,

irregular integer,

poolname lvarchar)

returns lvarchar;

tabname: name of the table where the TS value is being inserted

colname: name of the TimeSeries column where the TS value is being inserted

subtype: subtype (row type) of the TS value being inserted

irregular: kind of TS value (integer: 0 – regular, 1 – irregular)

poolname: name of the container pool to use


Pool Policy: Create your own

function <pool_policy_name>(

tabname lvarchar,

colname lvarchar,

subtype lvarchar,

irregular integer

[,user_data lvarchar] )

returns lvarchar;

Tailor the container selection policy to fit your environment


TS VTI Improvements

Make the NOT NULL constraint on the base table visible through VTI

Use only the primary key columns to locate the appropriate row to update

– Benefit: Allow NULLs in non-primary key columns

– If the primary key of the base table is altered, any TS VTI tables based on that base table will need to be recreated

– If no primary key, use all non-TS columns

New TSVTMode flags:

– TS_VTI_UPDATE_NONKEY_NOT_NULLS (32)

  – Only the non-null updatable columns of the insert row will be added to the SET clause

– TS_VTI_UPDATE_NONKEY_INCLUDE_NULLS (64)

  – All the updatable columns of the insert row will be added to the SET clause


VTI Expression Pushdown

Provide pre-processed views of TimeSeries

Example:

“Create an hourly moving average and get the max value per day”

(Table: smartmeters, column: readings)

EXECUTE PROCEDURE TSCreateExpressionVirtualTab(

'smartmeters_expr1_vti', 'smartmeters',

"AggregateBy('max($value)','smartmeter_daily',

Apply('TSRunningAvg($energy, 4)',

$ts_begin_time, $ts_end_time,readings

)::TimeSeries(onereal_row),

0)",'onereal_row');


VTI Expression Pushdown: Result

SELECT * FROM smartmeters_expr1_vti

WHERE t >= '2011-01-01 00:00:00.00000'::datetime year to

fraction(5)

AND t <= '2011-01-02 23:59:59.99999'::datetime year to

fraction(5);

meter_id t value

met00000 2011-01-01 00:00:00.00000 30.25000000000

met00000 2011-01-02 00:00:00.00000 29.50000000000

met00001 2011-01-01 00:00:00.00000 29.75000000000

met00001 2011-01-02 00:00:00.00000 31.00000000000

. . .
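What the expression computes for one meter can be sketched in Python: a running average over the last four readings (one hour of 15-minute data), followed by the maximum per day. It is an illustration, not the TimeSeries implementation. In particular, averaging over fewer than four values at the start of the series is an assumption about how TSRunningAvg treats a partial window.

```python
from collections import deque
from datetime import datetime, timedelta

def running_avg(series, window):
    """Average of the last `window` values at each (time, value) point."""
    recent = deque(maxlen=window)
    result = []
    for t, value in series:
        recent.append(value)
        result.append((t, sum(recent) / len(recent)))
    return result

def daily_max(series):
    """Largest value per calendar day."""
    best = {}
    for t, value in series:
        day = t.date()
        if day not in best or value > best[day]:
            best[day] = value
    return best

# Eight 15-minute readings with values 0.0 .. 7.0 (made-up data).
start = datetime(2011, 1, 1)
readings = [(start + timedelta(minutes=15 * i), float(i)) for i in range(8)]
print(daily_max(running_avg(readings, 4)))
```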


VTI: TSCreateExpressionVirtualTab

More than one “view” can be created on a TimeSeries

Very similar to TSCreateVirtualTab but read-only interface

TSCreateExpressionVirtualTab(
    VirtualTable  VARCHAR(128,1),
    BaseTablename VARCHAR(128,1),
    expression    LVARCHAR,
    subtype       VARCHAR(128,1),
    TSVTMode      INTEGER DEFAULT 0,
    TSColName     VARCHAR(128,1) DEFAULT NULL);


VTI Expression Pushdown

Keyword substitution: $ts_begin_time, $ts_end_time

Used to substitute the time from the runtime query in the expression view.

Example expression:

    Apply('TSRunningAvg($energy, 4)',
        $ts_begin_time, $ts_end_time,readings
    )::TimeSeries(onereal_row)

Query including:

    WHERE t >= '2011-01-01 00:00:00.00000'::datetime year to fraction(5)
    AND t <= '2011-01-02 23:59:59.99999'::datetime year to fraction(5);
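The substitution can be pictured as a plain text replacement of the two keywords with the bounds taken from the query's WHERE clause. The Python sketch below uses `string.Template.safe_substitute`, which leaves other `$` names such as `$energy` untouched. The exact text the server substitutes is not shown in the slides, so the quoted and cast form used here is an assumption, as is the helper name.

```python
from string import Template

EXPRESSION = ("Apply('TSRunningAvg($energy, 4)', "
              "$ts_begin_time, $ts_end_time, readings"
              ")::TimeSeries(onereal_row)")

def bind_time_range(expression, begin, end):
    """Replace $ts_begin_time / $ts_end_time with the query's bounds."""
    cast = "'{}'::datetime year to fraction(5)"   # assumed literal form
    return Template(expression).safe_substitute(
        ts_begin_time=cast.format(begin),
        ts_end_time=cast.format(end),
    )

print(bind_time_range(EXPRESSION,
                      "2011-01-01 00:00:00.00000",
                      "2011-01-02 23:59:59.99999"))
```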


New Features in 11.70.xC4

TSRollup

Eclipse TimeSeries tooling for fast loading

Bug fixes and performance improvements


TSRollup

SQL aggregate function

Rollup multiple TimeSeries into one

SELECT TSRollup(tscol,

"sum($releve), avg($puissance)"

)::timeseries(myrollup_t)

FROM . . .

WHERE . . .

GROUP BY . . .
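Conceptually, the rollup combines the elements that share a time point across all TimeSeries in a group. A Python sketch of "sum($releve), avg($puissance)" follows. The data layout (a dict per series keyed by timestamp) and the function name are assumptions made for illustration.

```python
from collections import defaultdict

def rollup(series_list):
    """Combine several series into one: sum of releve, average of puissance.

    Each series maps a timestamp to a (releve, puissance) pair.
    """
    grouped = defaultdict(list)
    for series in series_list:
        for tstamp, element in series.items():
            grouped[tstamp].append(element)
    return {
        tstamp: (sum(r for r, _ in elements),
                 sum(p for _, p in elements) / len(elements))
        for tstamp, elements in sorted(grouped.items())
    }

meter_a = {"2011-01-01 00:00": (1.0, 10.0), "2011-01-01 00:15": (2.0, 20.0)}
meter_b = {"2011-01-01 00:00": (3.0, 30.0)}
print(rollup([meter_a, meter_b]))
```

A time point present in only one series simply contributes that single element, as the second timestamp does here.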


The IBM Informix TimeSeries Plug-in for...

Official name:

IBM Informix TimeSeries Plug-in for Data Studio

Requires IBM Data Studio

– IBM Data Studio or IBM Optim Developer Studio, version 2.2.1 or later

– Should work with Eclipse

– Part of the Informix media

– Can be downloaded from IBM:

– Go to: http://www.ibm.com/developerworks/downloads/im/data/

– Select IDE Red Hat Linux, SUSE Linux, Windows (826MB)


Installing the plug-in

Plug-in location: (any platform)

– $INFORMIXDIR/extend/TimeSeries.5.00.xC2/plugin/ts_datastudio.zip

– Move this file to the machine that runs Data Studio

Start Data Studio

Select Help → Software Updates...

– Select Add Site...

– Select Archive...

– Navigate to the zip file and add it

– Select exported repositories

– Click install


Demo: IBM Informix TimeSeries Plug-in for Data Studio