VECTORWISE DATABASE TRAINING BY ZAHID QUADRI MYSQL DBA CLOVER INFOTECH PVT LTD MUMBAI

Vectorwise database training


DESCRIPTION

Vectorwise / Ingres Database Administration guide by Zahid Quadri


Page 1: Vectorwise database training

VECTORWISE DATABASE

TRAINING

BY ZAHID QUADRI

MYSQL DBA CLOVER INFOTECH PVT LTD MUMBAI

Page 2: Vectorwise database training

INTRODUCTION

Page 3: Vectorwise database training

What is Vectorwise

• Vectorwise is a next-generation database management system from the Actian family of database products.

• Vectorwise is targeted at analytical database applications, that is, applications that need to process large volumes of data and perform complex operations on it to derive useful information.

• Typical examples include data warehousing, data mining, and reporting.

• The main benefit of Vectorwise over traditional Ingres is significantly higher performance on data analysis tasks.

• Vectorwise is optimized to work with both memory- and disk-resident datasets, allowing it to efficiently process large amounts of data (hundreds of gigabytes).

• Note: Although it is fast for data analysis, Vectorwise is not meant to be used for traditional transaction processing. For OLTP, you can open a session to a traditional Ingres database from your Vectorwise client.

Page 4: Vectorwise database training

Vectorwise Technology

• Vectorwise introduces a new way of storing data and a completely new mechanism for evaluating queries.

• Innovations such as vectorised processing, compression, and columnar data layout allow analytical queries to be run fast on a single server, even a laptop.

• The most distinctive feature of Vectorwise is the "vectorised" method it uses for evaluating queries.

• Rather than operating on single values from single table records at a time, Vectorwise makes the CPU operate on "vectors," which are arrays of values from many different records.

• Such vectorised execution brings out the best in modern CPU technology. It brings to the world of databases the high performance that modern computers exhibit for scientific calculation, gaming, and multimedia applications.
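As a rough illustration of the difference between tuple-at-a-time and vector-at-a-time evaluation, the Python sketch below computes the same sum of products both ways. It is not Vectorwise code: the function names and the tiny VECTOR_SIZE are assumptions made for illustration, and pure Python will not show the CPU-level gains that a compiled engine gets from working on arrays.

```python
from typing import List

VECTOR_SIZE = 4  # hypothetical, kept tiny so the batching is easy to follow


def sum_tuple_at_a_time(prices: List[int], quantities: List[int]) -> int:
    """Process one record per step: one multiply and one add per iteration."""
    total = 0
    for price, quantity in zip(prices, quantities):
        total += price * quantity
    return total


def multiply_vectors(a: List[int], b: List[int]) -> List[int]:
    """A 'primitive' that works on whole arrays of values at once."""
    return [x * y for x, y in zip(a, b)]


def sum_vector_at_a_time(prices: List[int], quantities: List[int],
                         vector_size: int = VECTOR_SIZE) -> int:
    """Process records in batches (vectors) of up to vector_size values."""
    total = 0
    for start in range(0, len(prices), vector_size):
        price_vec = prices[start:start + vector_size]
        quantity_vec = quantities[start:start + vector_size]
        total += sum(multiply_vectors(price_vec, quantity_vec))
    return total


prices = [1, 2, 3, 4, 5]
quantities = [10, 20, 30, 40, 50]
print(sum_tuple_at_a_time(prices, quantities))
print(sum_vector_at_a_time(prices, quantities))
```

Both functions return the same result; the second one simply calls each operation once per batch of values instead of once per record.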

Page 5: Vectorwise database training

Storage innovations: beating the disk bottleneck

• Any database system with such a high computational speed runs the risk of becoming I/O bound.

• For this reason, the second major component of Vectorwise consists of storage innovations designed for high I/O throughput. These innovations include:

Columnar data layout

Advanced compression

Storage indexes

• Vectorwise introduces a new storage mechanism that uses columnar data layout, allowing analytical queries to avoid disk access for columns not involved in a query.

• While you can generally think of Vectorwise storage as a column store, Vectorwise can mix columnar and row-based storage so that certain columns that are always accessed together get stored in the same disk block.

Page 6: Vectorwise database training

Continued

• Layout decisions are largely handled automatically by the system, but can also be controlled by the user.

• To further avoid I/O becoming a performance bottleneck, Vectorwise introduces a number of advanced compression schemes. These schemes are designed for fast decompression.

• Finally, Vectorwise uses storage indexes. The storage indexes are small and store the minimum and maximum value per data block. The storage index, which is automatically created and maintained, enables the execution engine to rapidly identify candidate data blocks.

Page 7: Vectorwise database training

Vectorwise storage type

• Vectorwise introduces a new VECTORWISE storage structure to Ingres.

• The VECTORWISE storage type differs from traditional Ingres storage types in the following ways:

• Queries accessing VECTORWISE tables use an alternative execution module that is optimized for analytical processing.

• A query that accesses tables using VECTORWISE storage cannot access tables using other Ingres storage types.

• Some operations are not available when using the VECTORWISE storage type.

Page 8: Vectorwise database training

Vectorwise table structure

• In broad terms, Vectorwise uses columnar storage.

• While data is stored and retrieved in familiar relational rows, the internal storage is different.

• Instead of storing all column values for a single row next to each other, all rows for a single column are stored together.

• This storage format has benefits for data warehouse applications, such as reduced I/O and improved efficiency.

• A relational table may have dozens or hundreds of columns. In traditional row-oriented relational storage, the entire row must be read even if a particular query requests only a few columns.

• With column-oriented storage, a query that needs only a few columns will not read the remaining columns from disk and will not waste memory storing unnecessary values.

• While Vectorwise is fundamentally a column store, it is not strictly so. For certain types of tables, Vectorwise stores data from more than one column together in a data block on disk.
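The I/O argument above can be sketched in a few lines of Python. This is only a toy model: the table contents, the function names, and the idea of counting "values read" are assumptions for illustration, not a description of the actual Vectorwise file format.

```python
from typing import Dict, List

# A tiny relational table, as the user sees it: a list of rows.
rows = [
    {"id": 1, "name": "alice", "amount": 10},
    {"id": 2, "name": "bob", "amount": 20},
    {"id": 3, "name": "carol", "amount": 30},
]


def to_columns(table: List[dict]) -> Dict[str, list]:
    """Re-arrange row-oriented data so each column's values sit together."""
    columns: Dict[str, list] = {}
    for row in table:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def values_read_row_store(table: List[dict], wanted: List[str]) -> int:
    """A row store reads every column of every row, whatever is requested."""
    return sum(len(row) for row in table)


def values_read_column_store(columns: Dict[str, list], wanted: List[str]) -> int:
    """A column store reads only the columns the query asks for."""
    return sum(len(columns[name]) for name in wanted)


columns = to_columns(rows)
print(columns["amount"])
print(values_read_row_store(rows, ["amount"]))
print(values_read_column_store(columns, ["amount"]))
```

For a query that needs only the amount column, the row layout touches all nine values while the columnar layout touches three.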

Page 9: Vectorwise database training

Data Storage Format

• A database consists of multiple files and can reside in multiple locations.

• Tables can also be spread across multiple locations.

• Updates, inserts, and deletes to the data and the layout of the data in the data files are stored as log information in VWROOT/ingres/data/vectorwise/dbname/CBM/LOG.

• The data files consist of a number of blocks, which can be seen as the equivalent of pages.

• Each block contains possibly compressed data from one or more attributes.

• The size of the block can be configured with the [cbm] block_size parameter.

• Also, for better sequential access performance, multiple blocks from the same column can be stored contiguously in a block group.

• Database size is unlimited because data can be spread across multiple disks. If a table or a column is dropped, its corresponding file or files are deleted and the disk space is reclaimed by the operating system.

Page 10: Vectorwise database training

Raw Table Storage Format

• The default storage format for a Vectorwise table (VECTORWISE) is also known as a RAW table.

• This format is similar to an Ingres heap table, in that the data is stored on disk in the order in which it is inserted. There are, however, some differences.

• First, the data is stored in columns and the data is compressed.

• The second difference is that for each column, the system automatically maintains simple statistics about the data distribution by storing the minimum and maximum value for ranges of tuples.

• If a query includes a filter on the column data value, the execution engine uses these min-max data values to decide whether to examine the data in a given block.

• Since min-max information is maintained in memory outside of the data block, this simple form of automatic indexing can dramatically reduce disk I/O and memory usage.
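The min-max mechanism described above can be sketched as follows. This is a simplified Python model under stated assumptions: blocks are plain lists of integers, every block is non-empty, and the function names are hypothetical. The real engine keeps these statistics per range of tuples and works on compressed data.

```python
from typing import List, Tuple


def build_min_max_index(blocks: List[List[int]]) -> List[Tuple[int, int]]:
    """Keep one (min, max) pair per block, stored apart from the data itself."""
    return [(min(block), max(block)) for block in blocks]


def candidate_blocks(index: List[Tuple[int, int]], low: int, high: int) -> List[int]:
    """Return positions of blocks whose value range overlaps [low, high]."""
    return [i for i, (block_min, block_max) in enumerate(index)
            if block_max >= low and block_min <= high]


def range_scan(blocks: List[List[int]], index: List[Tuple[int, int]],
               low: int, high: int) -> List[int]:
    """Examine only candidate blocks; all other blocks are skipped unread."""
    hits: List[int] = []
    for i in candidate_blocks(index, low, high):
        hits.extend(v for v in blocks[i] if low <= v <= high)
    return hits


blocks = [[1, 5, 3], [10, 12, 11], [20, 25, 22]]
index = build_min_max_index(blocks)
print(candidate_blocks(index, 11, 21))
print(range_scan(blocks, index, 11, 21))
```

For the filter "value between 11 and 21", the first block is ruled out from its min-max pair alone, so only two of the three blocks are examined.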

Page 11: Vectorwise database training

Data Compression

• Columnar storage inherently makes compression (and decompression) more efficient than does row-oriented storage.

• For row-oriented data, choosing a compression method that works well for the variety of data types in a row can be challenging, because text and numeric data compress best with different algorithms.

• Column storage allows the algorithm to be chosen according to the data type and the data domain and range, even where the domain and range are not declared explicitly. For example, an alphabetic column GENDER defined as CHAR(1) will have only two actual values (M and F), and rather than storing an eight-bit byte, the value can be compressed to a single bit, and then the bit string can be further compressed.

• Compression in Vectorwise is automatic, requiring no user intervention. Vectorwise chooses a compression method for each column, per data block, according to its data type and distribution.
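The GENDER example can be made concrete with a small bit-packing sketch in Python. This only illustrates the "one bit instead of one byte" idea for a two-valued column; the function names are hypothetical and this is not the compression scheme Vectorwise actually applies.

```python
from typing import List


def pack_two_valued(values: List[str], one_value: str) -> bytes:
    """Store one bit per row: 1 where the row equals one_value, else 0."""
    packed = bytearray((len(values) + 7) // 8)
    for i, value in enumerate(values):
        if value == one_value:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_two_valued(packed: bytes, count: int,
                      one_value: str, zero_value: str) -> List[str]:
    """Rebuild the original column from the packed bits."""
    return [one_value if (packed[i // 8] >> (i % 8)) & 1 else zero_value
            for i in range(count)]


gender = list("MFMMFFMFMF")              # 10 rows, one byte each as CHAR(1)
packed = pack_two_valued(gender, "M")    # 10 bits, rounded up to 2 bytes
print(len(gender), len(packed))
print(unpack_two_valued(packed, len(gender), "M", "F") == gender)
```

Ten one-byte values fit in two bytes here, and the packed bit string could then be compressed further, as the slide notes.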

Page 12: Vectorwise database training

Vectorwise Features

• Vectorwise is built upon an Ingres framework and uses many Ingres facilities and utilities.

• Information on using the Ingres facilities, utilities, and commands can be found in the Ingres documentation set.

• A Vectorwise table can be created with one of two storage types:

VECTORWISE

(Default) Stores data in columns. Puts as few columns as possible in one disk block (in most cases only one).

VECTORWISE_ROW

Stores data in rows. Puts as many columns as possible in one disk block.

Page 13: Vectorwise database training

Continued

• Vectorwise stores CHAR and VARCHAR data types as UTF-8, using UCS-2 with UCS_BASIC collation. NFC is the default normalization form, but it can be changed to the less common NFD using createdb -n.

• Transaction management

• Vectorwise provides basic support for updates (INSERT, UPDATE, DELETE).

• For performance reasons, users should avoid using UPDATE and DELETE operations to update a single record.

• Supports multiple kinds of SQL statements, such as DML, DDL, and DQL.

• Supports multiple data types.

• Vectorwise does not support:

Collation sequences, other than the UCS_BASIC column collation in tables stored with a Vectorwise structure

User-defined data types and functions

Page 14: Vectorwise database training

Incompatibilities With Ingres Database

• SQL Cursors

Vectorwise supports read-only scrollable cursors. Cursors on Vectorwise queries cannot be used for update. Only one cursor can be open per connection.

• Some Subqueries Do Not Work

Some specific complex forms of subqueries that are supported in traditional Ingres do not yet work. Subqueries that do not successfully flatten and compile to a valid Vectorwise syntax produce the E_OP08C2 error. Try to rewrite these queries.

• Scalar Subquery Cardinality Check

Vectorwise does not perform a cardinality check on scalar subqueries. Make sure the scalar subquery returns no more than one row.

• The MONEY currency symbol is fixed to '$'.

Page 15: Vectorwise database training

INSTALLATION OF VECTORWISE

Page 16: Vectorwise database training
Page 17: Vectorwise database training
Page 18: Vectorwise database training
Page 19: Vectorwise database training
Page 20: Vectorwise database training
Page 21: Vectorwise database training
Page 22: Vectorwise database training
Page 23: Vectorwise database training
Page 24: Vectorwise database training
Page 25: Vectorwise database training

To start and stop the Ingres service

Page 26: Vectorwise database training

Default locations

Page 27: Vectorwise database training

• To create a database in Vectorwise

-bash-3.2$ createdb zahid

Creating database 'zahid' . . .

• Log in to a database

-bash-3.2$ sql -uingres zahid

TERMINAL MONITOR Copyright 2013 Actian Corporation

• Create a table in the zahid database

* create table rehan(id int,name varchar(45)) \g;

Executing . . .

• Insert values into the table

* insert into rehan values(1,'zahid') \g;

Executing . . .

(1 row)

* select * from rehan \g;

Executing . . .

Page 28: Vectorwise database training

• Create a user in Vectorwise

To create a user, use the accessdb command.

• Create a location in Vectorwise

To create a location, use the accessdb command.

• Create database in specified location

[ingres@MUNMVS0225 db1]$ createdb zahid_test -dtestdb;

Creating database 'zahid_test' . . .

Creating DBMS System Catalogs . . .

Modifying DBMS System Catalogs . . .

Creating Standard Catalog Interface . . .

Creating Front-end System Catalogs . . .

Creation of database 'zahid_test' completed successfully.

• Create a duplicate table in Vectorwise

create table test1 as select * from test \g;

Page 29: Vectorwise database training

Location in vectorwise

Page 30: Vectorwise database training

Best practices for updates in Vectorwise database

• The most efficient way to load data into a Vectorwise database is to use bulk append. You can use any of the following methods to load the data:

VWLoad utility

COPY statement (COPY ... FROM)

Batch insert through a JDBC, ODBC, or .NET application

Page 31: Vectorwise database training

Create database in location

Page 32: Vectorwise database training

Verify database location

Page 33: Vectorwise database training

Extend database

Page 34: Vectorwise database training

Create table in alternate location

Page 35: Vectorwise database training

BACKUP OF VECTORWISE DATABASE

• There are various methods for backing up a Vectorwise database:

copydb

unloaddb

ckpdb

• Backup of databases using copydb

The copydb command creates two scripts, copy.out and copy.in.

• copy.out

This script contains query language statements that copy your tables to operating system files. It contains a COPY statement for each table being copied. You run the copy.out script (using the sql command) to copy tables out of the database.

• copy.in

This script contains query language statements that recreate your tables, views, and associated indexes, permissions, and integrities, and copy the operating system files back into the database.

Page 36: Vectorwise database training

-bash-3.2$ copydb -uingres zahid

COPYDB Copyright 2013 Actian Corporation

Unload directory is '/home/ingres/copydb_test'.

Reload directory is '/home/ingres/copydb_test'.

There are 0 sequences owned by user 'ingres'.

There is one table owned by user 'ingres'.

There are 0 views owned by user 'ingres'.

There are 0 events owned by user 'ingres'.

There are 0 procedures owned by user 'ingres'.

There are 0 rules owned by user 'ingres'.

COPYDB has created the scripts copy.in and copy.out.

From an Ingres prompt, run sql zahid < copy.out to copy the data.

-bash-3.2$ sql -uingres zahid < copy.out

TERMINAL MONITOR Copyright 2013 Actian Corporation

Vectorwise Linux Version VW 3.0.1 (a64.lnx/375)NPTL login

Sun Aug 17 11:24:46 2014

Enter \g to execute commands, "help help\g" for general help, "help tm\g" for terminal monitor help, \q to quit

continue

* * * * go * * set autocommit on

Executing . . .

Page 37: Vectorwise database training

Continued

• Check whether backup has been completed or not.

-bash-3.2$ ls -ltr

total 12

-rw-r--r-- 1 ingres ingres 256 Aug 17 23:53 copy.out

-rw-r--r-- 1 ingres ingres 461 Aug 17 23:53 copy.in

-rw-r--r-- 1 ingres ingres 53 Aug 17 23:54 rehan.ingres

• Restore a database backed up using copydb

First drop database then create it.

-bash-3.2$ destroydb zahid

Destroying database 'zahid' . . .

Destruction of database 'zahid' completed successfully.

Page 38: Vectorwise database training

Backup of tables using copydb

Page 39: Vectorwise database training

Continued

• Create database

-bash-3.2$ createdb zahid

Creating database 'zahid' . . .

Creating DBMS System Catalogs . . .

Modifying DBMS System Catalogs . . .

Creating Standard Catalog Interface . . .

Creating Front-end System Catalogs . . .

Creation of database 'zahid' completed successfully.

• To restore database

sql -uingres zahid < copy.in

Running the copy.in script recreates the tables and reloads the data, which restores the database.

Page 40: Vectorwise database training

Backup using UNLOADDB

• -bash-3.2$ unloaddb rehan > backup.log

-bash-3.2$ ls -ltr

total 96

-rwxrwxrwx 1 ingres ingres 93 Aug 18 00:16 unload.ing

-rwxrwxrwx 1 ingres ingres 95 Aug 18 00:16 reload.ing

-rw-r--r-- 1 ingres ingres 4726 Aug 18 00:16 copy.out

-rw-r--r-- 1 ingres ingres 71871 Aug 18 00:16 copy.in

-rw-r--r-- 1 ingres ingres 568 Aug 18 00:16 backup.log

• -bash-3.2$ sh unload.ing

TERMINAL MONITOR Copyright 2013 Actian Corporation

set autocommit on

set lockmode session where readlock=nolock

set session with privileges=all

set session authorization "$ingres"

Page 41: Vectorwise database training

Continued

• To restore using unloaddb

Destroy and recreate database.

-bash-3.2$ sh reload.ing

/* VW_TABLE COPIES */

set session authorization ingres

copy test () from '/home/ingres/unlodadb_test/test.ingres'

with row_estimate = 3

Page 42: Vectorwise database training

Backup using ckpdb

Page 43: Vectorwise database training

Verify check point for database

Page 44: Vectorwise database training

Restore from checkpoint

Page 45: Vectorwise database training

ERROR log configuration in vectorwise

Page 46: Vectorwise database training

Error configuration of vectorwise

Page 47: Vectorwise database training

Session management

Page 48: Vectorwise database training

Check how long query has been running

• * SELECT server, db_name, session_query, (BIGINT((CURRENT_TIMESTAMP - TIMESTAMP_WITH_TZ('1970-01-01 00:00:00+00:00'))/INTERVAL '1' second) - query_start_secs) AS elapsedsecs FROM ima_server_sessions WHERE db_name NOT IN ('', 'imadb')\g

• Executing . . .
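The elapsedsecs expression in the query above is just "seconds since the Unix epoch now, minus query_start_secs". The Python sketch below restates that arithmetic. It assumes, as the query itself implies, that query_start_secs is a Unix epoch timestamp in seconds; the function name and the sample values are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def elapsed_seconds(query_start_secs: int, now: Optional[datetime] = None) -> int:
    """Seconds a query has been running, given its start time in epoch seconds."""
    if now is None:
        now = datetime.now(timezone.utc)
    now_secs = int((now - EPOCH).total_seconds())
    return now_secs - query_start_secs


# Fixed clock so the example is reproducible: 100 seconds after the epoch.
fixed_now = datetime(1970, 1, 1, 0, 1, 40, tzinfo=timezone.utc)
print(elapsed_seconds(40, now=fixed_now))
```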

Page 49: Vectorwise database training

Monitoring Vectorwise Database

Page 50: Vectorwise database training

Check database level parameter

Page 51: Vectorwise database training

To display active configuration

Page 52: Vectorwise database training

To display disk usage for tables

Page 53: Vectorwise database training

To display stats for specific table

Page 54: Vectorwise database training

Display information about active transaction

Page 55: Vectorwise database training

Database locations

Page 56: Vectorwise database training
Page 57: Vectorwise database training

Optimization in vectorwise

• It is always recommended to run optimizedb once your data is first loaded.

• optimizedb is a fast and simple way to optimize Vectorwise for your data.

• optimizedb does this by building an internal statistical model of your data. Vectorwise then uses that model to choose the query execution plan that will run the fastest.

• Type optimizedb -zfq followed by the name of your database. For example: optimizedb -zfq mydatabase

Page 58: Vectorwise database training

To apply patch in vectorwise database

• Download the latest patch from the link below.

• http://esd.actian.com/product/Vectorwise/2.0/Linux_X86_64-bit/

• Copy it to the server and extract the file using the command below.

• [ingres@MUNMVS0278 patch_37507]$ tar xvzf vectorwise-3.0.1-375-NPTL-linux-x86_64-p37507.tar.z

• A directory named patch37507 will be created.

• To apply the patch, go to the /patch_37507/patch37507/utility location and run the command below.

• [ingres@MUNMVS0278 utility]$ ./iiinstaller

• Restart the Vectorwise server to apply the changes.

Page 59: Vectorwise database training

Roles in vectorwise

• -bash-3.2$ sql -uingres -d iidbdb

TERMINAL MONITOR Copyright 2013 Actian Corporation

Vectorwise Linux Version VW 3.0.1 (a64.lnx/375)NPTL login

Fri Aug 22 09:53:46 2014

continue

* create user duggu with password='duggu' \g;

Executing . . .

• A better way to create a user is to use the accessdb command.

Page 60: Vectorwise database training

Vectorwise configuration file

• [memory]

max_memory_size=986.172M

Specifies the amount (in bytes) of the total memory available for query execution.

DEFAULT: 50% of RAM

min_mem_per_transaction=10M

Specifies the minimal memory reserved for each running transaction.

DEFAULT: 10M

max_overalloc=2G

Specifies the maximum amount of memory that will be overallocated

/proc/sys/vm/overcommit_memory is not 0, or

"ulimit -v" is not unlimited

DEFAULT: 2G

Page 61: Vectorwise database training

Continued

• [system]

num_cores=2

Specifies the number of cores in the system.

max_transactions=69

Specifies the maximum number of active transactions in the system.

vectorsize=1024

Specifies the number of records processed together.

DEFAULT: 1024

[cbm]

bufferpool_size=493.086M

Specifies the buffer pool size in bytes (that is, disk cache).

DEFAULT: 25% of RAM

block_size=512KB

Specifies the minimum I/O granularity.

WARNING: This setting cannot be changed after database creation!

DEFAULT: 512KB

Page 62: Vectorwise database training

Continued

group_size=8

Specifies the number of blocks to handle in every disk request.

WARNING: This setting cannot be changed after database creation!

DEFAULT: 8

compression_lz4_enabled=false

Enable or disable LZ4 compression for strings. LZ4 compression reduces data size and I/O workload but might increase processing time.

[engine]

max_parallelism_level=2

Defines the level of auto-parallelism. A value larger than 1 enables multi-core parallelism.

DEFAULT: 1
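To see how a few of the configuration values shown on these slides relate to each other, the short Python sketch below does the arithmetic. The parse_size helper is hypothetical, and it assumes the K/M/G suffixes are binary units (1K = 1024 bytes), which the slides do not state.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}  # assumed binary units


def parse_size(text: str) -> float:
    """Convert a value such as '512KB', '986.172M' or '2G' to bytes."""
    text = text.strip().upper()
    if text.endswith("B"):
        text = text[:-1]          # '512KB' -> '512K'
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


# block_size x group_size = amount of data handled in one disk request.
block_size = parse_size("512KB")
group_size = 8
request_bytes = block_size * group_size
print(request_bytes / 1024 ** 2, "MB per disk request")

# max_memory_size defaults to 50% of RAM and bufferpool_size to 25% of RAM,
# so the first should be about twice the second on an untuned system.
ratio = parse_size("986.172M") / parse_size("493.086M")
print(ratio)
```

With the defaults shown, each disk request covers 4 MB of data, and the two memory settings from the sample file are in the expected 2:1 proportion.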

Page 63: Vectorwise database training

Catalogs in vectorwise

• Sysmod

• Run sysmod (system modification) on the database after optimizing it.

• This operation modifies the system tables in the database to optimize catalog access.