Performance Tuning of MDA Based Platform & Product
Niraj Trivedi
Agenda
• Introduction to MDA based development platform
• Major performance issues encountered
• Setup of Performance Engineering Lab
• Categorization of major performance issues
• Reference model for application profiling & tuning
• Root cause analysis
• Trade-offs between ease of development and product performance
Context / Introduction
We are going to talk about a platform that has resulted in:
STG Insurance Suite – analyst recognition and market recognition
Platform (Design & Development Perspective)
Platform – Design-time Components
Rules Engine
Object Modeler
Workflow Engine
UI Designer
Release Manager
Rate Builder
Product Modeler
Open-source building blocks – Eclipse, JBoss Rules, YAWL, Flash Builder, SVN – are wrapped as editors and plug-ins and packaged into ICM, ICD and CLM.
Approach:
• Identify modules to build
• Leverage open-source components
• Build plug-ins & editors
• Package to form ICM and ICD
Platform – Runtime Components – Simplified View
Application Server
• OS (AIX / Linux / Windows)
• App Server (JBoss / WebSphere / …)
• ICD/ICM Runtime Engine (uses servlet threads)
• ICD/ICM Content
Database Server
• OS (AIX / Linux / Windows)
• RDBMS (Oracle / SQL Server / …)
• Base Schemas
• Database Connection Pool
UI / Interface Components
• SOAP, RESTful, Adobe BlazeDS
Content & Runtime – Component Details
ICD / ICM Content
• Model Definition (XML)
• Rules Definition (XML)
• Workflow Definition (XML)
• Instruction Step Definition (XML)
• UI Model Definition (MXML)
Runtime Engines
• Model Load/Save/Map (ICD Engine)
• Workflow Executor (YAWL Engine)
• Rule Executor (JBoss Drools)
• Instruction Step Executor (ICD Engine)
• UI Model Rendering (Flash Player)
MAJOR PERFORMANCE ISSUES
Major Performance Issues
Performance of Business Process Execution (Single user mode)
Performance of Business Rules Execution (Single user mode)
Performance of Model Load and Save (Single user mode)
System unable to support required concurrency
System not able to scale horizontally and / or vertically
SETUP OF PERFORMANCE ENGINEERING LAB
Holistic Performance Engineering
• Objectives: Response Time, Throughput, Workload, Utilization
• Performance Modeling: Workload, Service Time, Queue Size, Capacity, Transactions
• Architecture & Design: Anti-Patterns, Patterns, Principles, Frameworks
• Measuring, Testing, Tuning: Response Time, Throughput, Workload, Capacity, Utilization; Load, Stress, Endurance and Ramp-Up tests; Demand and Service Time measurement
Roles spanning the effort – Architect, Developer, Performance Engineer, Test Engineer – across the full life cycle: Requirements, Design, Develop, Test, Maintain.
Performance Engineering Lab – Resources
• Hardware: IBM Power5, IBM Power7, Dell Xeon, Dell R-810 / R-820
• OS: AIX, Oracle Enterprise Linux, Red Hat Linux, Windows Server
• Application Server: WebSphere, WebLogic, JBoss, Tomcat
• RDBMS: Oracle, SQL Server
• Load Generator: NeoLoad, LoadRunner
Performance Engineering Lab – Profiling & Investigation Tools
• Application Profiling: JProfiler, JVisualVM, HPROF
• Database Profiling: Oracle AWR, Oracle Statspack, SQL Server Developer Studio
• OS Level: nmon, Perfmon, vmstat, iostat, netstat
• Heap and Thread Dump Analysis: IBM Support Assistant
• UI Profiling: HttpWatch, Adobe Flex (Premium licenses)
Performance Engineering Lab – Roles
▪ Head, Performance Engineering Lab
− Owns system performance for the product and product implementations
− Ensures reasonable TCO by making the system exhibit desirable performance on reasonably sized hardware
▪ Performance Architects / Engineers
− Performance modeling
− Performance profiling, investigation and tuning
− Create architecture, design and development guidelines for performance
− Sizing for implementations
▪ Performance Test Managers
− Manage the end-to-end performance test
− Convert the transaction matrix into an overall test plan
• Unit, normal & peak load, stress, ramp-up etc.
▪ Performance Test Engineers
− Deep expertise with load generator tools – scripting and execution
CATEGORIZATION OF PERFORMANCE ISSUES
Categorization of Performance Issues
Platform Architecture & Design
Platform Code / Code Generation Related
Content - Platform Usage Related
Application Configuration
Environment Configuration & Sizing
REFERENCE MODEL FOR PERFORMANCE PROFILING AND TUNING
First-Cut Performance Model Showing All Service Centers (1)
• FE Desktop: CPU, Memory, Disk
• Network between FE and App tier
• Web + App Server: CPU, Memory, Disk
• Network between App and DB tier
• DB Server: CPU, Memory, Disk
Revised Performance Model Showing Significant Service Centers (2)
• FE Desktop: CPU
• Network between FE and App tier
• Web + App Server: CPU
• Network between App and DB tier
• DB Server: CPU, Disk
Revised Performance Model Showing Significant Service Centers (3)
• FE Desktop: CPU
• Network between FE and App tier
• Web + App Server: CPU, plus App-to-App calls (localhost)
• Network between App and DB tier
• DB Server: CPU, Disk
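The service-center models above can be turned into a first-cut analytical estimate. Below is a minimal sketch of my own (not from the deck), assuming each significant service center behaves like an open queue where residence time is approximately service demand divided by (1 − utilization); class and parameter names are illustrative.

```java
// First-cut response-time estimate: sum of per-service-center residence times.
// For an open queueing model, residence time at a center with service demand D
// and utilization U is approximately D / (1 - U).
public class ServiceCenterModel {
    /** demands[i] = service demand (sec) at center i; utils[i] = utilization in [0,1). */
    public static double responseTime(double[] demands, double[] utils) {
        double total = 0.0;
        for (int i = 0; i < demands.length; i++) {
            total += demands[i] / (1.0 - utils[i]); // queueing inflates demand as U -> 1
        }
        return total;
    }

    public static void main(String[] args) {
        // FE network, app CPU, DB CPU, DB disk (illustrative numbers only)
        double[] demands = {0.010, 0.150, 0.080, 0.040};
        double[] utils   = {0.10, 0.60, 0.40, 0.30};
        System.out.printf("Estimated response time: %.3f s%n", responseTime(demands, utils));
    }
}
```

Such a model is useful for spotting which center dominates response time before any profiler is attached.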
Content & Runtime – Component Details
ICD / ICM Content
• Model Definition (XML)
• Rules Definition (XML)
• Workflow Definition (XML)
• Instruction Step Definition (XML)
Runtime Engines
• Model Load/Save/Map (ICD Engine)
• Workflow Executor (YAWL Engine)
• Rule Executor (JBoss Drools)
• Instruction Step Executor (ICD Engine)
• UI Model Rendering (MXML/FE)
ROOT CAUSE ANALYSIS & SOLUTIONS
(A FEW BRIEF CASE STUDIES)
Root Cause Analysis – NB / Policy Issuance Process Is Slower Than Expected
▪ Profiling tools used & observations
− Oracle OEM: no database-related issues
− JProfiler: a lot of time spent in serialization / deserialization of payload objects
▪ Root cause determination
− High number of inter-JVM calls between the YAWL engine and the ICD engine
− High CPU time spent serializing / deserializing objects for inter-JVM calls over HTTP/SOAP
▪ Solution
− Convert the YAWL-engine-based workflow to POJOs
− Make this part of the ICD engine
▪ Outcome
− The designer retains the flexibility to use the YAWL engine editor to design the workflow
− Inter-JVM calls eliminated
− Need for serialization / deserialization eliminated
− A business process that used to take upwards of 2 minutes now completes in 3–5 seconds
External BPEL engine vs. in-process workflow engine: with an external workflow/BPEL engine, Initiate Process, Call Service 1 … Call Service n, and End Process each cross the JVM boundary and the network – a lot of calls; with the in-process workflow engine they are plain method calls inside the app server.
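The POJO conversion described above can be sketched as follows. This is an illustration of the idea only – the class and step names are hypothetical, not the actual ICD engine API: workflow steps run as direct method calls inside the engine JVM, so the payload is passed by reference and never serialized for an HTTP/SOAP hop.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Sketch of an in-process workflow: each step is a plain function applied to
// the payload in the same JVM, so no inter-JVM boundary is ever crossed and
// no serialization / deserialization is needed between steps.
public class InProcessWorkflow {
    private final List<UnaryOperator<Object>> steps;

    public InProcessWorkflow(List<UnaryOperator<Object>> steps) {
        this.steps = steps;
    }

    public Object run(Object payload) {
        for (UnaryOperator<Object> step : steps) {
            payload = step.apply(payload); // direct call: payload passed by reference
        }
        return payload;
    }

    public static void main(String[] args) {
        // Illustrative insurance-flavored steps (names are made up)
        List<UnaryOperator<Object>> steps = List.of(
                p -> p + "->validated",
                p -> p + "->rated",
                p -> p + "->issued");
        System.out.println(new InProcessWorkflow(steps).run("policy"));
    }
}
```

The design point from the slide survives: the workflow can still be *designed* in the YAWL editor, only its *execution* moves in-process.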
Evaluation Tips – Communication Link as a Service Center

Link | Performance | Typical Examples | Note | Evaluation
WAN | Low | SOAP, REST, BlazeDS, Remote EJB, RPC… | Prefer a protocol with less overhead | Reduce size of payload; reduce number of calls
LAN | Medium | SOAP, REST, BlazeDS, Remote EJB… | Protocol overhead for a single call not that significant | Reduce size of payload; reduce number of calls
Localhost | Fast | SOAP, REST, BlazeDS, Remote EJB… | Protocol overhead for a single call not that significant | Reduce size of payload; reduce number of calls
In-process | Best in class | Function call | Fastest | Size of payload does not matter; number of calls matters only if extra large; cloning may be the single largest issue
Learnings
▪ Use of an external BPEL / workflow engine is recommended for integration between disparate systems
▪ If you start using it for process orchestration / workflow design across the various servicing stages of a single enterprise-class system, you are bound to run into performance issues
Root Cause Analysis – Rule Execution Is Slower Than Expected
▪ Profiling tools used & observations
− Oracle OEM: no database-related issues
− JProfiler:
► A lot of time spent creating Java objects from SQL result sets
► A lot of time spent inserting and retrieving a large number of objects in the agenda-execution working memory
▪ Root cause determination
− At times rules are defined on coarse-grained objects
− It takes more time to fetch such objects from the database and create the corresponding Java objects to supply as input to rule execution
− All in-memory objects were passed to the Drools engine in an attempt to simplify the developer's life!
▪ Solution(s)
− Define rules on granular objects, or on specially created objects holding exactly the input data required for rule execution
− Pass only the exact set of objects required to execute the agenda instead of passing everything available
▪ Outcome
− Common fact objects shared between large-scale batch processing and online processing
− Built-in intelligence to scan for the objects an agenda execution requires
Root Cause Analysis – Model Load Is Slower Than Expected (Search List)
▪ Profiling tools used & observations
− Oracle OEM: a large number of quick statements building up the overall time
− JProfiler: a lot of time spent creating Java objects from SQL result sets
− nmon: a lot of data transfer over the network
▪ Root cause determination
− FE- or app-server-based in-memory pagination simply chokes the system at higher volumes
− App-server-based memory pagination is usually triggered by a business requirement to filter data via business rules or to display aggregate counters
− FE-server-based memory pagination is usually triggered by the requirement for instantaneous sort on search lists
▪ Solution(s)
− Move from memory pagination to DB pagination
− Where memory pagination is a must, add as much prior filtration as possible to reduce the result set
− Force an error with an appropriate message when the result-set count exceeds a size pre-determined for the implementation
▪ Outcome
− Desired response time for search lists
Memory pagination (performance problem): the UI requests 20 records; the app server requests all records from the DB server, receives all of them, filters / merges / paginates them in memory, and returns 20 records. DB pagination (problem solved): the app server requests only 20 records from the DB server and returns those 20 records to the UI.
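The "request 20 records" fix amounts to pushing the page boundaries into the SQL itself. A minimal sketch (table and column names are made up): the standard row-limiting clause works on Oracle 12c+ and SQL Server 2012+; older versions need ROWNUM or ROW_NUMBER() variants instead.

```java
// Sketch: build a DB-paginated query so only one page of rows ever leaves the
// database, instead of fetching all rows and paging in app-server memory.
public class DbPagination {
    public static String pageQuery(String baseSelect, String orderBy, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) throw new IllegalArgumentException("bad page spec");
        return baseSelect
                + " ORDER BY " + orderBy                      // stable order is required for paging
                + " OFFSET " + (page * pageSize) + " ROWS"
                + " FETCH NEXT " + pageSize + " ROWS ONLY";
    }

    public static void main(String[] args) {
        // Hypothetical search-list query: page 0, 20 rows per page
        System.out.println(pageQuery("SELECT policy_no, client_name FROM policy", "policy_no", 0, 20));
    }
}
```

Any rule-based filtering that must still happen in the app server then runs over at most one page of candidates, not the whole table.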
Root Cause Analysis – Model Load Is Slower Than Expected (Processing)
▪ Profiling tools used & observations
− Oracle OEM: a large number of quick statements building up the overall time
− JProfiler: a lot of time spent creating Java objects from SQL result sets
− nmon: a lot of data transfer over the network
▪ Root cause determination
− Lazy loading of a large result set when the entire data set is required by the underlying process
▪ Solution(s)
− Override lazy loading
▪ Outcome
− Freedom to choose lazy loading, or its override, for the same model under different process contexts
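The cost being described is the classic N+1 pattern: lazy fetch issues one quick query per record touched, while the eager override issues one bulk query. A toy sketch (the loader and query counter are illustrative stand-ins, not the ICD engine):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the lazy-loading trade-off: LAZY issues one query per record as it
// is touched (N+1 round-trips), EAGER fetches the whole set in one bulk query.
public class ModelLoader {
    enum FetchMode { LAZY, EAGER }

    int queriesIssued = 0; // stands in for real SQL round-trips

    List<String> load(int records, FetchMode mode) {
        List<String> out = new ArrayList<>();
        if (mode == FetchMode.EAGER) {
            queriesIssued++;                          // one bulk SELECT for everything
            for (int i = 0; i < records; i++) out.add("rec-" + i);
        } else {
            for (int i = 0; i < records; i++) {
                queriesIssued++;                      // one SELECT per record touched
                out.add("rec-" + i);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ModelLoader lazy = new ModelLoader();
        lazy.load(1000, FetchMode.LAZY);
        ModelLoader eager = new ModelLoader();
        eager.load(1000, FetchMode.EAGER);
        System.out.println("lazy queries=" + lazy.queriesIssued
                + ", eager queries=" + eager.queriesIssued);
    }
}
```

Keeping the mode a per-process choice, as the outcome bullet notes, means a screen that touches two records stays lazy while a batch job that needs everything goes eager.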
SIZE MATTERS – Right Size for Right Operation
He excels in wrestling – do you think he will win a 100 m race?
Right Granularity of Domain Services
Right Granularity of Domain Objects
Example – client name display on the UI (tab out from policy #):
• Fine-grained chain: getPolicy(policyNo).getClient().getName() – three calls, loading full Policy and Client objects
• Coarse-grained alternative: getPolicyClientName() – one purpose-built call
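The two options above can be contrasted with a toy call counter. Everything here is a hypothetical stand-in for the deck's example – the point is only the round-trip count, not the return values:

```java
// Sketch of the granularity trade-off: fetching a client name via chained
// fine-grained service calls costs three round-trips, while a purpose-built
// getPolicyClientName() costs one. The methods just count invocations.
public class GranularityDemo {
    static int remoteCalls = 0;

    static String getPolicy(String policyNo) { remoteCalls++; return "policy:" + policyNo; }
    static String getClient(String policy)   { remoteCalls++; return "client-of-" + policy; }
    static String getName(String client)     { remoteCalls++; return "name-of-" + client; }

    /** Coarse-grained service shaped for exactly this screen. */
    static String getPolicyClientName(String policyNo) {
        remoteCalls++;
        return "name-of-client-of-policy:" + policyNo;
    }

    public static void main(String[] args) {
        remoteCalls = 0;
        getName(getClient(getPolicy("P-1")));   // fine-grained chain
        int chained = remoteCalls;

        remoteCalls = 0;
        getPolicyClientName("P-1");             // purpose-built call
        System.out.println("chained=" + chained + ", purpose-built=" + remoteCalls);
    }
}
```

When each call crosses a JVM or network boundary (see the link-evaluation table earlier), the 3:1 ratio translates directly into response time.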
Learning
▪ Achieving the right level of granularity for services and domain objects requires thorough domain knowledge
▪ It is one of the most critical trade-offs between speed of development and system performance
Root Cause Analysis – Model Save Is Slower Than Expected
▪ Profiling tools used & observations
− Oracle OEM: no concerns
− JProfiler: setAttributes taking up a lot of time
▪ Root cause determination
− The overall model is loaded in memory, spanning multiple tables with multiple records per table
− The save attempted to persist the full model via the MERGE feature
▪ Solution(s)
− Keep track of dirty records
▪ Outcome / learnings
− Dirty-record tracking and incremental handling is easy enough
− Dirty-attribute tracking is proving to be a very complex exercise!
− Any attempt at dirty-attribute tracking directly affects the capability to create inherited models based upon existing models
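The record-level tracking the slide calls "easy enough" can be sketched in a few lines; the model below is a deliberately simplified stand-in (flat string records rather than multi-table rows) and not the actual ICD model API:

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Sketch of dirty-record tracking for model save: only records whose value
// changed since the last save are written back, instead of MERGE-ing the
// whole in-memory model. Attribute-level tracking (which the deck found very
// complex) would need this same bookkeeping per field, not per record.
public class TrackedModel {
    private final Map<String, String> records = new LinkedHashMap<>();
    private final Set<String> dirty = new LinkedHashSet<>();

    public void put(String key, String value) {
        String old = records.put(key, value);
        if (!value.equals(old)) dirty.add(key); // new or changed -> mark dirty
    }

    /** Returns the record keys that would be written, then clears the dirty set. */
    public Set<String> save() {
        Set<String> toWrite = new LinkedHashSet<>(dirty);
        dirty.clear();
        return toWrite;
    }

    public static void main(String[] args) {
        TrackedModel m = new TrackedModel();
        m.put("policy.status", "QUOTED");
        m.put("client.name", "A");
        m.save();                           // first save writes both records
        m.put("policy.status", "ISSUED");   // only this record is dirty now
        System.out.println("to write on next save: " + m.save());
    }
}
```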
Root Cause Analysis – System Is Unable to Support Required Concurrency
▪ Root cause determination
− Caching at various levels not enabled, driving hardware capacity to near-100% utilization
− Inadequate size of the processing thread pool (in our case: the servlet thread pool)
− Inadequate maximum connection count in the database connection pool
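A first-cut check for the two pool-sizing causes above can come from Little's law, N = X · R (concurrent requests in a tier ≈ throughput × residence time). The sketch below is my own illustration; the headroom factor and sample numbers are assumptions, not the deck's recommendation:

```java
// Sketch: first-cut thread-pool / DB-connection-pool sizing from Little's law,
// N = X * R, plus a headroom factor for bursts. If the configured pool is
// smaller than this, requests queue for a worker regardless of CPU headroom.
public class PoolSizing {
    public static int poolSize(double reqPerSec, double avgResidenceSec, double headroom) {
        return (int) Math.ceil(reqPerSec * avgResidenceSec * headroom);
    }

    public static void main(String[] args) {
        // ~167 req/s (about 600,000 business transactions/hour) at 150 ms each,
        // with 25% headroom (all numbers illustrative)
        System.out.println("suggested pool size: " + poolSize(167, 0.150, 1.25));
    }
}
```

The same arithmetic applies per tier: servlet threads sized from request residence time, DB connections from the time a request actually holds a connection.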
Root Cause Analysis – System Is Unable to Scale Horizontally or Vertically
▪ Root cause determination
− Parallel processing not enabled, and / or
− Synchronization blocks becoming critical
Even Ben Johnson can’t run 100 m in 1 second.
Nine mothers cannot produce a child in one month.
Nine mothers can produce nine children in nine months.
BUT: average time to produce a child = 1 month! Kill the statistician!
GUIDELINE: Don’t expect even a supercomputer to reduce unit processing time to too small a value. Use effective multi-threading: identify what can be processed in parallel without a bottleneck.
Summary
▪ As a result of the efforts put in by the platform development team, the product development team and the Performance Engineering Lab, we demonstrated in IBM Labs:
▪ Support for 2,500 concurrent users against a 40-million-policy base
▪ 25 different business transactions
▪ A 4-hour run
▪ 600,000 business transactions per hour
▪ Server response time < 1 sec per request, consistently
▪ Hardware: IBM p770 in IBM Labs, USA
▪ 32-core, 128 GB app server
▪ 32-core, 128 GB DB server
▪ Won a deal from the largest insurer
Thank You