Zabbix integration with Big Data System in large-scale environment Andy Zhou

System in large-scale environment Zabbix integration with ... › files › zabbix_summit_2019 › Andy_Zhou-… · System Corporation, Andy Zhou is the first Zabbix Certified Trainer



Zabbix integration with Big Data System in large-scale environment

Andy Zhou


Company introduction

Established in 2010, Shanghai Grandage Data System Co., Ltd. is a professional IT service company. It focuses on developing IT software and providing IT services for organizations of all sizes around the world. It also offers a wide range of professional services, including visualized software design and development, IT infrastructure monitoring system architecture design and implementation, and ITOA consulting and technical solutions. Grandage is authorized by Zabbix as its exclusive distributor in Greater China.


About me

An IT Consultant, IT Architect, and Trainer at Shanghai Grandage Data System Corporation, Andy Zhou is the first Zabbix Certified Trainer in China. He has nearly 10 years of IT administration and maintenance experience, 5 years of experience with Zabbix monitoring solutions, and long-term experience in the fields of ITOM and ITOA. Andy has worked on many IT administration and maintenance projects in China for the insurance and financial industries.


01 Customer introduction


Customer profile

1. This customer is the third-largest insurance company in China.
2. The total number of devices will reach 65,000+ in the future.
3. Many objects need to be monitored: operating systems, databases, middleware, network devices, network lines, storage devices, PC server hardware, trap integration, syslog integration, virtualization, application log files, the private cloud platform, and so on.
4. The customer is using many commercial monitoring tools: IBM Netcool, BMC Patrol, Usight, Boya software, H3C U-Center.


Device types

[Slide graphic. Categories shown: operating system, database, middleware, application, network device, storage device, PC server hardware, virtualization.]


Pain points

1. The many monitoring tools have inconsistent rules, which makes unified management impossible.
2. The systems are independent of each other and insufficiently interconnected, which easily results in information islands.
3. The depth of monitoring is not enough.
4. Insufficient flexibility and limited in-house control.
5. Too many commercial monitoring systems, which causes high license fees and maintenance costs.


Why Zabbix?

1. Zabbix is the best open-source monitoring system, with no license fees
2. Zabbix installation and deployment are simple and fast
3. Zabbix is powerful and highly flexible
4. Easy to use and manage, with a friendly user interface
5. The Zabbix backend is developed in C, giving stable performance and low resource overhead


Load test


02 System architecture design


Customer environmental statistics

65,000+ devices in the future
An average of 300 items per host
Average collection interval of 60 s, 19,500,000 items in total
NVPS: 325,000
History data kept for 2 days, trend data kept for 7 days
Data size: 4,486.04 GB
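The figures on this slide are consistent with each other, which a short calculation can confirm. The sketch below assumes that NVPS (new values per second) is simply the total item count divided by the average collection interval. The variable and function names are illustrative, and the data size figure is not derived here because the slide does not state how it was computed.

```python
# Sizing arithmetic based on the slide figures (names are illustrative).
DEVICES = 65_000        # planned number of devices
ITEMS_PER_HOST = 300    # average number of items per host
INTERVAL_SECONDS = 60   # average collection interval


def total_items(devices: int, items_per_host: int) -> int:
    """Total number of monitored items across all devices."""
    return devices * items_per_host


def nvps(items: int, interval_seconds: int) -> float:
    """New values per second, assuming one value per item per interval."""
    return items / interval_seconds


items = total_items(DEVICES, ITEMS_PER_HOST)
rate = nvps(items, INTERVAL_SECONDS)
print(f"{items:,} items, {rate:,.0f} NVPS")
```

With the slide's inputs this gives 19,500,000 items and 325,000 NVPS, matching the stated values.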


Zabbix HA architecture

Add Zabbix proxy servers to scale out the system architecture


Distributed Central Management Architecture

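[Architecture diagram. Labels recoverable from the slide:]

Monitoring systems (each in HA): Network Monitoring System, OS/DB/Middleware Monitoring System, Storage Monitoring System, Server Hardware Monitoring System
Central components: Big Data Systems, Zabbix Configuration Management System
Connections: Zabbix API (one API link per monitoring system), JSON (one JSON link per monitoring system), Kafka
Annotation: Developed by us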
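The diagram labels an API link to each of the four monitoring systems. As a rough illustration of what a central tool talking to several Zabbix servers could look like, the sketch below builds one JSON-RPC 2.0 `host.get` request per server. It is not the presenter's implementation: the server URLs and the token are hypothetical placeholders, and the `auth` field in the request body follows the style used by Zabbix releases of that era (newer releases favor a bearer token header). Nothing is sent over the network here.

```python
import json
from urllib.request import Request

# Hypothetical endpoints, one per monitoring system (not from the slides).
ZABBIX_SERVERS = {
    "network": "https://zabbix-network.example.com/api_jsonrpc.php",
    "os-db-middleware": "https://zabbix-os.example.com/api_jsonrpc.php",
    "storage": "https://zabbix-storage.example.com/api_jsonrpc.php",
    "server-hardware": "https://zabbix-hw.example.com/api_jsonrpc.php",
}


def build_host_get(url: str, auth_token: str, request_id: int) -> Request:
    """Build (but do not send) a JSON-RPC host.get request for one server."""
    payload = {
        "jsonrpc": "2.0",
        "method": "host.get",
        "params": {"output": ["hostid", "host"]},
        "auth": auth_token,  # placeholder; obtain a real token via user.login
        "id": request_id,
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json-rpc"},
        method="POST",
    )


# One prepared request per monitoring system.
requests_by_system = {
    name: build_host_get(url, "PLACEHOLDER_TOKEN", request_id)
    for request_id, (name, url) in enumerate(sorted(ZABBIX_SERVERS.items()), start=1)
}
```

Each prepared request could then be passed to `urllib.request.urlopen` against a real server. Keeping request construction separate from sending makes the fan-out easy to inspect and test.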


Issues

1. Frequent gaps in graphs
2. Many queues appear
3. Database query time is too long
4. Zabbix frontend page response time is too long


How to optimize performance?

1. Replace Apache with Nginx
2. Zabbix configuration parameter optimization
3. MySQL database parameter optimization
4. MySQL database table partitioning, with one data table (partition) per day
5. Operating system kernel parameter optimization
6. Use high-performance SSD storage
7. Monitoring item data collection interval optimization
8. Server hardware configuration upgrade
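The slide lists daily MySQL partitioning among the optimizations but does not show the statements used. As a minimal sketch of the idea, the code below generates an `ALTER TABLE ... PARTITION BY RANGE` statement with one partition per day. It assumes the standard Zabbix `history` table and its `clock` column (a Unix timestamp); the partition naming scheme and the use of UTC day boundaries are illustrative choices, not taken from the presentation.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import List


def daily_partition_clauses(first_day: date, days: int) -> List[str]:
    """Return one PARTITION clause per day, bounded at the next UTC midnight."""
    clauses = []
    for offset in range(days):
        day = first_day + timedelta(days=offset)
        next_midnight = datetime.combine(
            day + timedelta(days=1), time.min, tzinfo=timezone.utc
        )
        upper_bound = int(next_midnight.timestamp())
        # Partition name pYYYY_MM_DD is a hypothetical naming scheme.
        clauses.append(
            f"PARTITION p{day:%Y_%m_%d} VALUES LESS THAN ({upper_bound})"
        )
    return clauses


def partition_ddl(table: str, first_day: date, days: int) -> str:
    """Build the full ALTER TABLE statement for the given table."""
    body = ",\n  ".join(daily_partition_clauses(first_day, days))
    return f"ALTER TABLE {table} PARTITION BY RANGE (clock) (\n  {body}\n);"


print(partition_ddl("history", date(2019, 10, 11), 3))
```

With daily partitions, expired data can be removed by dropping a whole partition rather than deleting rows, which is the usual motivation for this layout on large history tables.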


After optimization

After optimization, Zabbix front-end access became faster and the queues disappeared.


03 Use the valuable data


Data in different stages of IT management

The development stages of IT operation and maintenance management:

01 ITOM: Use tools to monitor and manage IT objects
02 ITOA: Data processing, association and analysis in different dimensions
03 AIOps: Big data analysis, machine learning, algorithms

No matter which stage, data is the basis for analysis and processing!


Data collection

Use Zabbix to collect data from different devices and objects:

Log data, storage data, middleware data, configuration data, database data, system data, hardware data, application data, network data, trap data, asset data, virtualization data, cloud data, security device data, network line data

Outputs shown on the slide: Zabbix Dashboard, Export JSON data


Data analysis


04 Zabbix integration with Big Data System


Elastic component

Beats is the platform for single-purpose data shippers. They send data from hundreds or thousands of machines and systems to Logstash or Elasticsearch. Here we use Filebeat.


How it works

[Pipeline diagram]

Zabbix Server -> (real-time export) -> JSON files -> Filebeat (data transfer) -> Logstash (data processing) -> Elasticsearch (data storage) -> Kibana (data view)

Alternative path: Filebeat transfers data to Elasticsearch directly.


How it works

[Pipeline diagram]

Zabbix Server -> (real-time export) -> JSON files -> Filebeat (data transfer) -> Logstash (data processing) -> Kafka cluster -> Data view

Alternative path: Filebeat transfers data to Kafka directly.


Data export configuration

Data directory and data file size in the Zabbix configuration file:

Exported JSON data file:

Data is exported from the Zabbix Server to the JSON file in real time.
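The screenshot of the configuration did not survive extraction. For orientation, the Zabbix server parameters that control real-time export are `ExportDir` and `ExportFileSize`. The sketch below reads those two settings from configuration text; the sample path and size are hypothetical values, not the ones used in the presentation.

```python
# Hypothetical excerpt of zabbix_server.conf; the values are examples only.
SAMPLE_CONF = """\
# Real-time export settings
ExportDir=/var/lib/zabbix/export
ExportFileSize=1G
"""

EXPORT_KEYS = ("ExportDir", "ExportFileSize")


def read_export_settings(conf_text: str) -> dict:
    """Return the export-related key/value pairs found in the config text."""
    settings = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() in EXPORT_KEYS:
            settings[key.strip()] = value.strip()
    return settings


print(read_export_settings(SAMPLE_CONF))
```

A check like this can be handy before starting a shipper such as Filebeat, to confirm that export is enabled and to find the directory it should watch.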


JSON data exported from Zabbix
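The sample export shown on this slide was an image. Zabbix writes the export as newline-delimited JSON, one record per line, so a consumer can process it line by line. The sketch below parses such lines and keeps the latest value per host and item. The sample records and their exact field set are assumptions for illustration; the real layout depends on the Zabbix version.

```python
import json
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical history records, roughly in the shape of a Zabbix export.
SAMPLE_LINES = [
    '{"host": "web01", "groups": ["Linux servers"], "itemid": 1001, '
    '"name": "CPU load", "clock": 1570780800, "ns": 0, "value": 0.42}',
    '{"host": "web01", "groups": ["Linux servers"], "itemid": 1001, '
    '"name": "CPU load", "clock": 1570780860, "ns": 0, "value": 0.57}',
    "",            # blank lines are skipped
    "not json",    # malformed lines are skipped
]


def parse_export_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per valid JSON line, ignoring blank or broken lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def latest_values(records: Iterable[dict]) -> Dict[Tuple[str, str], dict]:
    """Keep the most recent record for each (host, item name) pair."""
    latest: Dict[Tuple[str, str], dict] = {}
    for record in records:
        key = (record["host"], record["name"])
        if key not in latest or record["clock"] >= latest[key]["clock"]:
            latest[key] = record
    return latest


result = latest_values(parse_export_lines(SAMPLE_LINES))
```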


How to transfer JSON data?

Add the data files that need to be transferred to the Filebeat configuration file and enable the input.


Logstash configuration

Logstash has two required elements, input and output, and one optional element, filter.


Logstash to process data

We can use Logstash to filter, convert, split, splice and format the data transferred by Filebeat.
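The Logstash configuration itself was shown as a screenshot and is not reproduced here. To make the input, filter and output stages concrete, the sketch below is a plain Python analogue of such a pipeline, not Logstash syntax. The event fields and the specific transformations (dropping non-numeric values, converting the epoch `clock` to an ISO 8601 timestamp, joining the group list, removing `ns`) are illustrative assumptions.

```python
import json
from datetime import datetime, timezone
from typing import Iterable, Iterator, List, Optional


def read_input(lines: Iterable[str]) -> Iterator[dict]:
    """Input stage: turn newline-delimited JSON into events."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def apply_filter(event: dict) -> Optional[dict]:
    """Filter stage: drop, convert, splice and format one event."""
    value = event.get("value")
    # Filter: keep numeric values only (bool is excluded explicitly).
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    out = dict(event)
    # Convert: epoch seconds to an ISO 8601 UTC timestamp.
    out["timestamp"] = datetime.fromtimestamp(
        event["clock"], tz=timezone.utc
    ).isoformat()
    # Splice: join the list of host groups into one string.
    out["group_list"] = ",".join(event.get("groups", []))
    # Format: remove a field the downstream system does not need.
    out.pop("ns", None)
    return out


def write_output(events: Iterable[dict]) -> List[str]:
    """Output stage: serialize events (here, to a list of JSON strings)."""
    return [json.dumps(event, sort_keys=True) for event in events]


def run_pipeline(lines: Iterable[str]) -> List[str]:
    """Chain input -> filter -> output, skipping dropped events."""
    filtered = (apply_filter(event) for event in read_input(lines))
    return write_output(event for event in filtered if event is not None)
```

In a real deployment the output stage would write to Elasticsearch or Kafka instead of returning a list; the three-stage shape is the part that carries over.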


Zabbix integration with ELK


Zabbix integration with ELK


Zabbix integration with Kafka


Zabbix data displayed in Big Data System


Thank you!