Informix Warehouse Accelerator on Cluster
Andreas Breitfeld, IBM
Session D01, Monday 04/23, 9:30 a.m.
Agenda
Informix Warehouse Accelerator (IWA) on Cluster
● Overview
● Hardware and software prerequisites
  – components to build a cluster of inexpensive computers for IWA
  – install and configure additional software
● IWA cluster
  – configuration and administration
● IWA cluster demo
  – shows scaling of load and query tasks
4/19/12 IWA on Cluster - Session D01 2
Overview
Goals of IWA on cluster
● use commonly available network and PC hardware
● scale memory
  – overcome the memory limit of a single machine
● scale processors
  – overcome the processor socket limit of a single machine
Overview
Implementation
● same product on a different infrastructure
  – coordinator and each worker run on separate nodes
  – they communicate via a real network interface
  – they use a cluster filesystem (e.g. GPFS, OCFS2) to share
    ● the accelerator software and config files
    ● the storage directory for catalog, marts, logs, traces, etc.
Overview
General picture
[Diagram: the Informix Server and the nodes node101 (coordinator), node102 (worker) and node103 (worker) are connected by a switch; a second switch connects the nodes to the iSCSI target.]
Overview
Prepare demo
● run the example query on the Informix Server without acceleration:

  time dbaccess demo q.sql

● start it now, because it runs for some time!
Hardware prerequisites
IWA cluster components
● network
  – Gigabit Ethernet (GbE) switch
    ● jumbo frame (MTU 9000) support for iSCSI
  – Cat 5e Ethernet cables
Hardware prerequisites
IWA cluster components
● nodes
  – processor
    ● 64-bit AMD or Intel
    ● SSE3 instructions, supported from the Intel Atom upwards
  – memory
    ● same amount on each node
    ● limited by processor and chipset design
  – disk
    ● local SATA disk for the Linux OS
Hardware prerequisites
IWA cluster components
● nodes (cont.)
  – network interfaces
    ● 1 GbE for intra-node and DRDA communication, e.g. eth0
    ● 1 dedicated GbE recommended for iSCSI, e.g. eth1
      – jumbo frame (MTU 9000) support
  – IP addresses can be taken from the private IPv4 address space, e.g.
    ● 172.16.0.0 – 172.31.255.255
  – hostnames are aliases for the IP addresses of the intra-node interfaces
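One simple way to provide these aliases is an identical /etc/hosts fragment on every node. The first address/name pair below comes from the demo's cluster.conf; the other two are extrapolated for illustration only:

```
172.16.42.101   node101
172.16.42.102   node102
172.16.42.103   node103
```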
Hardware prerequisites
IWA cluster components
● storage
  – iSCSI target served by a dedicated node or SAN device
    ● provides a shared disk device
    ● used for the cluster filesystem
Hardware prerequisites
Examples for n IWA nodes
● built from standard components
  – 2 GbE switches with n+1 ports, or one switch with 2*(n+1) ports
  – 2*(n+1) Cat 5e cables
  – 1 AMD or Intel PC as iSCSI target shared disk (or a SAN device)
    ● recommended disk space: 2*n*WORKER_SHM or greater
  – n AMD or Intel PCs as IWA nodes
    ● example AMD PC: FX-4/6/8xxx, max. 32 GB RAM
    ● example Intel PC: Core i7-3820/3930, max. 64 GB RAM
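The disk-space rule of thumb above is easy to sanity-check with a short shell calculation. The values n=3 and WORKER_SHM=24 GB are made up for illustration only:

```shell
#!/bin/sh
# Sketch: shared-disk sizing per the 2*n*WORKER_SHM rule of thumb.
# n and worker_shm_gb are illustration values; substitute your own.
n=3                # number of IWA nodes
worker_shm_gb=24   # WORKER_SHM in GB

disk_gb=$(( 2 * n * worker_shm_gb ))
echo "recommended shared disk space: at least ${disk_gb} GB"
```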
Software prerequisites
Install and configure
● Network Time Protocol (NTP)
  – synchronize time between all nodes
● ssh server and client
  – configure passwordless login between the IWA nodes
    ● for user root (or informix)
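A minimal sketch of the passwordless-login setup, assuming user root and the demo hostname node102 (repeat the ssh-copy-id step for each of the other IWA nodes):

```shell
# generate a key pair without a passphrase, if none exists yet
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# distribute the public key to the other IWA nodes, e.g. node102
ssh-copy-id root@node102

# verify: this must not ask for a password
ssh root@node102 hostname
```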
Software prerequisites
Install and configure (SLES 11)
● iSCSI
  – iscsitarget (target)
  – open-iscsi (initiator)
  – configuration is possible with the YaST modules
● iSCSI Target on the target node
  – add a partition or disk as a target
● iSCSI Initiator on the IWA nodes
  – discover targets
  – log in to the target at the IP address of the iSCSI interface, e.g. eth1
  – select automatic login at startup
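On the IWA nodes the same initiator steps can also be scripted with iscsiadm from open-iscsi instead of YaST; the portal address 172.16.43.104 here is a hypothetical address of the target node's iSCSI interface:

```shell
# discover the targets offered by the target node
iscsiadm -m discovery -t sendtargets -p 172.16.43.104

# log in to the discovered target
iscsiadm -m node --login

# log in automatically at startup
iscsiadm -m node --op update -n node.startup -v automatic
```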
Software prerequisites
Install and configure (SLES 11)
● cluster filesystem – OCFS2
  – OCFS2 packages are available on the SLES 11 High Availability Extension media
    ● ocfs2-kmp-default
    ● ocfs2-tools
    ● ocfs2-tools-o2cb
    ● ocfs2console
  – configure OCFS2
    ● on the 1st IWA node, create a partition on the iSCSI shared disk
      – with the YaST Partitioner or fdisk
    ● on the 1st IWA node, create /etc/ocfs2/cluster.conf
      – with ocfs2console or an editor
Software prerequisites
Example /etc/ocfs2/cluster.conf :
node:
    name = node101
    cluster = iwa
    number = 0
    ip_address = 172.16.42.101
    ip_port = 7777

node:
...

cluster:
    name = iwa
    node_count = 3
Software prerequisites
Configure (SLES 11)
● cluster filesystem – OCFS2
  – configure OCFS2 (cont.)
    ● copy /etc/ocfs2/cluster.conf to the other nodes
    ● run on all nodes:

      /etc/init.d/o2cb load
      /etc/init.d/o2cb configure

      – accept the defaults
      – enter the cluster name to start at boot, e.g. iwa
Software prerequisites
Configure (SLES 11)
● cluster filesystem – OCFS2
  – configure OCFS2 (cont.)
    ● create the OCFS2 filesystem, e.g.

      mkfs.ocfs2 -L iwa -N 5 /dev/sdb1

    ● locate the persistent device path in /dev/disk/by-id/
      – it's a symbolic link to, in this example, /dev/sdb1:
        scsi-1494554000000000030000000000000000000000000000000-part1
    ● add a line to /etc/fstab on the IWA nodes, e.g.

      /dev/disk/by-id/scsi-1494554000000000030000000000000000000000000000000-part1 /iwa ocfs2 _netdev 0 0
Software prerequisites
Configure (SLES 11)
● cluster filesystem – OCFS2
  – configure OCFS2 (cont.)
    ● create the mount point, e.g. mkdir /iwa
    ● mount the OCFS2 filesystem on the IWA nodes:

      mount /iwa

    ● start the services at boot time:

      chkconfig o2cb on
      chkconfig ocfs2 on

    ● make sure the services are started in the correct order at boot time:
      1. iscsitarget
      2. open-iscsi (iscsi)
      3. o2cb
      4. ocfs2
IWA cluster
Configuration
● install IWA to $INFORMIXDIR on the cluster filesystem
  – same $INFORMIXDIR on all nodes
● edit $INFORMIXDIR/dwa/etc/cluster.conf
  – list of cluster nodes
  – one node (hostname or IP address) per line
  – the 1st node gets the coordinator, the rest get workers
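For the three-node demo cluster shown later, the file would simply contain the three hostnames; the first line gets the coordinator:

```
node101
node102
node103
```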
IWA cluster
Configuration
● edit $INFORMIXDIR/dwa/etc/dwainst.conf
  – set DWADIR to a directory on the cluster filesystem
  – set NUM_NODES to the number of nodes in cluster.conf
  – set WORKER_SHM to the sum of the shared memory (SHM) on the worker nodes
    ● leave some memory for the OS and temporary use, e.g. for n IWA nodes:

      WORKER_SHM=(n-1)*mem*0.75
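The sizing formula can be sketched as a small shell calculation; n=3 nodes with 4 GB of RAM each matches the demo's "nano cluster" and is only an example:

```shell
#!/bin/sh
# Sketch: WORKER_SHM = (n-1) * mem * 0.75
# One of the n nodes runs the coordinator, hence n-1 workers;
# 25% of each node's RAM is left for the OS and temporary use.
n=3        # IWA nodes in cluster.conf
mem_gb=4   # RAM per node in GB

worker_shm_gb=$(awk -v n="$n" -v mem="$mem_gb" \
    'BEGIN { printf "%d", (n - 1) * mem * 0.75 }')
echo "WORKER_SHM=${worker_shm_gb}GB"
```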
IWA cluster
Configuration (example)
● Hint:
  – SHM is used from /dev/shm
  – its default size is 50% of mem
  – check:

    df -k /dev/shm
    Filesystem           1K-blocks      Used Available Use% Mounted on
    tmpfs                  2020096        84   2020012   1% /dev/shm

  – modify, e.g. to 3 GB
    ● on the fly:
      mount -o remount,size=3G /dev/shm
    ● persist in /etc/fstab:
      tmpfs /dev/shm tmpfs size=3G 0 0
IWA cluster
Configuration
● edit $INFORMIXDIR/dwa/etc/dwainst.conf (cont.)
  – set DRDA_INTERFACE
    ● network interface for DRDA communication, e.g. eth0
    ● the Informix server should be connected to this network via a dedicated GbE interface for optimal load performance
  – uncomment and set CLUSTER_INTERFACE
    ● network interface for intra-node communication, e.g. eth0
    ● the interface must have the same name on all nodes
  – set these parameters to optimize IWA for one worker per node:

    CORES_FOR_SCAN_THREADS_PERCENTAGE=100
    CORES_FOR_LOAD_THREADS_PERCENTAGE=100
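Taken together, the edited part of dwainst.conf for the three-node demo could look roughly like this. The parameter names are the ones the slides mention; the values (and in particular the unit syntax of WORKER_SHM) are assumptions to be checked against the IWA documentation:

```
DWADIR=/iwa/dwa
NUM_NODES=3
WORKER_SHM=6
DRDA_INTERFACE=eth0
CLUSTER_INTERFACE=eth0
CORES_FOR_SCAN_THREADS_PERCENTAGE=100
CORES_FOR_LOAD_THREADS_PERCENTAGE=100
```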
IWA cluster
Configuration
● Hint:
  – set Linux kernel parameters in /etc/sysctl.conf:

    # reboot after 30 sec of kernel panic / oops
    kernel.panic_on_oops = 1
    kernel.panic = 30

    # do not allow memory over-commitment at all
    vm.overcommit_memory = 2
    vm.overcommit_ratio = 99

  – run the command: sysctl -p
IWA cluster
Administration
● ondwa utility
  – same syntax as on a single node
  – can be run on any node
  – the hostnames in $INFORMIXDIR/dwa/etc/cluster.conf are used for ssh to the IWA nodes
    ● start / stop DWA_CM processes
    ● check that DWA_CM is stopped
  – run as user root
IWA cluster
Administration
● ondwa utility (cont.)
  – if run as user informix
    ● set the following resources to unlimited on all nodes
      – memlock (max locked-in-memory address space)
      – rss (max resident set size)
      – as (address space limit)
    ● example /etc/security/limits.conf:

      informix soft memlock unlimited
      informix hard memlock unlimited
      ...
IWA cluster demo
Simplified setup
● “nano cluster” (Intel Atom, 4 GB RAM)
  – node101 runs
    ● Informix Server
    ● iSCSI target
    ● IWA coordinator
  – node102 runs
    ● IWA worker
  – node103 runs
    ● IWA worker
IWA cluster demo
Simplified picture
[Diagram: node101 (Informix Server, iSCSI target, IWA coordinator) connected via a switch to the workers node102 and node103.]
IWA cluster demo
Informix Server
● example database
  – fact table, # of rows: 10,000,000
  – dimension tables: 19
    ● average # of rows: ~7,000
    ● range of # of rows: 2 – 100,000
IWA cluster demo
Informix Server
● example query runtime without acceleration
  – update statistics :-)
  – run the query:

    time dbaccess demo q.sql

  – results
    ● 1st run: 15m1.305s
    ● 2nd run: 14m49.357s
IWA cluster demo
IWA on 2 nodes – 1 worker
● show the status of the nodes:

  ondwa status

● create and load the mart:

  time java createMart NANO demo.xml
  time java loadMart NANO demo NONE

● run the example query on the Informix Server with acceleration:

  time ((echo "set environment use_dwa '3';"; cat q.sql) | dbaccess demo -)

● results
  – load: 0m56.347s
  – query: 0m7.178s
IWA cluster demo
Reconfigure IWA
● stop IWA:
    ondwa stop
● add node103 to $INFORMIXDIR/dwa/etc/cluster.conf
● change NUM_NODES from 2 to 3 in $INFORMIXDIR/dwa/etc/dwainst.conf
● refresh the IWA configuration:
    ondwa setup
● start IWA:
    ondwa start
● show the status of the nodes:
    ondwa status
IWA cluster demo
IWA on 3 nodes – 2 workers
● drop and recreate the mart:

  time java dropMart NANO demo
  time java createMart NANO demo.xml
  time java loadMart NANO demo NONE

● run the example query on the Informix Server with acceleration:

  time ((echo "set environment use_dwa '3';"; cat q.sql) | dbaccess demo -)

● results
  – load: 0m45.012s
  – query: 0m3.850s
IWA cluster demo
Conclusion
● compares the 2-node and the 3-node configuration
● the mart load improvement is smaller than expected: ~56.3 sec vs. ~45.0 sec
  – limited network throughput
    ● no dedicated iSCSI interfaces
    ● backup of data marts on the cluster filesystem
  – limited processor power
    ● Informix, iSCSI target and IWA coordinator all on one node
● the example query scales well: ~7.2 sec vs. ~3.8 sec
  – shows the distribution of tasks across the 2 worker nodes
  – the query acceleration factor is more than 100 (2 nodes) and more than 200 (3 nodes)
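These factors follow directly from the measured times (15m1.305s = 901.305 s unaccelerated vs. 7.178 s and 3.850 s accelerated):

```shell
#!/bin/sh
# Sketch: query acceleration factors from the demo timings.
unacc=901.305      # 15m1.305s without acceleration, in seconds
two_nodes=7.178    # accelerated, 1 worker
three_nodes=3.850  # accelerated, 2 workers

factor2=$(awk -v a="$unacc" -v b="$two_nodes"   'BEGIN { printf "%d", a / b }')
factor3=$(awk -v a="$unacc" -v b="$three_nodes" 'BEGIN { printf "%d", a / b }')
echo "acceleration: ~${factor2}x with 1 worker, ~${factor3}x with 2 workers"
```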
Questions?!?
The Sandbox is open April 23 – April 25 Cabrillo Salon 1
Please Note:
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
Acknowledgements and Disclaimers:
© Copyright IBM Corporation 2012. All rights reserved.
– U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
IBM, the IBM logo, ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
Other company, product, or service names may be trademarks or service marks of others.
Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates.
The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice to any participant. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.
All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.