past and present... gord[at]live.ca @gord_chung

Gnocchi v4 - past and present

Page 1: Gnocchi v4 - past and present

past and present...

gord[at]live.ca
@gord_chung

Page 2: Gnocchi v4 - past and present

v4 features

□ simplified scheduling
□ less pandas, more numpy
□ Redis incoming driver
□ In-memory incoming Ceph driver
□ Other general features:
  ■ http://gnocchi.xyz/releasenotes/4.0.html
  ■ http://gnocchi.xyz/releasenotes/unreleased.html

Page 3: Gnocchi v4 - past and present

scheduling

incoming data sharded into sacks to allow simple division of work across metricd workers
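A minimal sketch of the sack idea, not Gnocchi's actual code. It assumes each metric has a UUID, that the sack is the UUID's integer value modulo a fixed sack count, and that sacks are dealt round-robin to metricd workers. `NUM_SACKS`, `sack_for_metric` and `sacks_for_worker` are illustrative names, and the round-robin split is a stand-in for whatever assignment the real workers use.

```python
import uuid
from typing import List

NUM_SACKS = 128  # assumed sack count, for illustration only


def sack_for_metric(metric_id: uuid.UUID, num_sacks: int = NUM_SACKS) -> int:
    """Map a metric to a sack: UUID as an integer, modulo the sack count."""
    return metric_id.int % num_sacks


def sacks_for_worker(worker_index: int, num_workers: int,
                     num_sacks: int = NUM_SACKS) -> List[int]:
    """Hypothetical round-robin split of sacks across metricd workers."""
    return list(range(worker_index, num_sacks, num_workers))


if __name__ == "__main__":
    metric = uuid.uuid4()
    print("metric", metric, "lands in sack", sack_for_metric(metric))
    print("worker 0 of 18 handles sacks", sacks_for_worker(0, 18))
```

Because the mapping depends only on the metric id and the sack count, every API worker writes a given metric's new measures to the same sack, and each metricd worker only has to look at its own sacks.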

Page 4: Gnocchi v4 - past and present

numpy

old: Pandas - a monolithic, all-in-one, data analysis toolkit

new: Numpy - a lightweight, high-performance, N-dimensional array (and a bit more) library

Page 5: Gnocchi v4 - past and present

in-memory

the memory is mightier. leverage Redis driver or LevelDB/RocksDB internals for Ceph

Page 6: Gnocchi v4 - past and present

benchmarks

back with another one of those block rockin’ beats

Page 7: Gnocchi v4 - past and present

environment

v2 & v3

node1
- OpenStack controller node
- Ceph Monitor Service
- Redis (coordination)

node2
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)
- 18 metricd (24 in v2)

node3
- Gnocchi API (32 workers)
- Ceph OSD node (10 OSDs + SSD Journal)
- 18 metricd (24 in v2)

node4
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)
- PostgreSQL (
- 18 metricd (24 in v2)

v4.x

node1
- OpenStack controller node
- Ceph Monitor Service
- Redis
- MySQL

node2
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)

node3
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)
- Gnocchi API (32 workers)
- 18 metricd

all nodes are physical servers:
- 24CPU (48 hyperthreaded)
- 256GB memory
- 10K disks
- 1GB network
- CentOS 7.1

fewer services and less hardware when running v4. all gnocchi services on a single node

all tests use Ceph as a storage driver for aggregates.

Page 8: Gnocchi v4 - past and present

data generated using benchmark tool in client (modified to use threads). 4 clients w/ 12 threads running simultaneously.

write throughput

total datapoints written per second. (higher is better)

Page 9: Gnocchi v4 - past and present

number of requests made per second. (higher is better)

write throughput

Page 10: Gnocchi v4 - past and present

test case 1

1K resources, 20 metrics each. flood Gnocchi with 60 individual points per metric. 1.2M calls/run. run it a few times.
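The call count follows directly from the numbers on this slide. A quick check, assuming "individual" means one POST per point (variable names are illustrative):

```python
resources = 1_000
metrics_per_resource = 20
points_per_metric = 60

metrics = resources * metrics_per_resource
calls_per_run = metrics * points_per_metric  # one POST per individual point

print(f"{metrics:,} metrics, {calls_per_run:,} calls per run")
```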

Page 11: Gnocchi v4 - past and present

time to POST 1.2M individual measures for 20K metrics to Gnocchi.

post time

v3.1 had an anomaly that caused degradation over time.

Page 12: Gnocchi v4 - past and present

processing time

v4 tests use 18 metricd, v3 test uses 54 metricd

time to aggregate all measures according to policy. (lower is better)

Page 13: Gnocchi v4 - past and present

v4 only comparison

processing time

Page 14: Gnocchi v4 - past and present

processing time

number of recorded, unprocessed measures over a single run

poor scheduling logic resulted in inefficient handling of many tiny objects in v3.

Page 15: Gnocchi v4 - past and present

processing time

number of recorded, unprocessed measures over a single run

backlog size depends on both the API’s ability to write data and metricd’s ability to process it.
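A toy model of that relationship, as a sketch only. The rates below are made-up numbers, not measurements from these benchmarks, and `simulate_backlog` is a hypothetical helper: each second the API adds measures while the flood lasts, and metricd removes up to its processing rate.

```python
def simulate_backlog(write_rate, process_rate, seconds, write_seconds):
    """Return the backlog (written but unprocessed measures) after each second."""
    backlog = 0
    history = []
    for t in range(seconds):
        incoming = write_rate if t < write_seconds else 0  # flood stops after write_seconds
        backlog = max(0, backlog + incoming - process_rate)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # hypothetical rates: API writes 1000 measures/s for 5 s, metricd drains 600/s
    print(simulate_backlog(write_rate=1000, process_rate=600, seconds=10, write_seconds=5))
```

A faster API makes the peak higher and a faster metricd makes it lower, so a smaller backlog alone does not say which side improved.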

Page 16: Gnocchi v4 - past and present

test case 2

1K resources, 20 metrics each. flood Gnocchi with 60 batched points per metric. 20K calls/run. run it a few times.
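To make "batched" concrete, here is a sketch of the request bodies involved. It assumes measures are posted as a JSON list of timestamp/value pairs to `/v1/metric/<metric_id>/measures`; the start date, interval and values are invented for illustration and nothing is actually sent.

```python
import json
from datetime import datetime, timedelta, timezone


def build_measures(start, count, interval_seconds=60):
    """Build `count` measures spaced `interval_seconds` apart (values are dummies)."""
    return [
        {
            "timestamp": (start + timedelta(seconds=i * interval_seconds)).isoformat(),
            "value": float(i),
        }
        for i in range(count)
    ]


start = datetime(2017, 1, 1, tzinfo=timezone.utc)  # arbitrary start time
batch = build_measures(start, 60)

batched_body = json.dumps(batch)                      # 1 call carrying 60 points
individual_bodies = [json.dumps([m]) for m in batch]  # 60 calls, 1 point each

print(len(individual_bodies), "individual calls vs 1 batched call per metric")
```

With 20K metrics, one batched call per metric gives the 20K calls/run quoted above, against 1.2M calls when every point is posted on its own.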

Page 17: Gnocchi v4 - past and present

processing time

v4 tests use 18 metricd for 3x8 aggregates/metric. v2 and v3 tests use 72 and 54 metricd respectively

time to aggregate all measures according to policy. (lower is better)

Page 18: Gnocchi v4 - past and present

aggregation time

time to aggregate 60 measures of a metric into 3x8 aggregates (lower is better)
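A rough, pure-Python sketch of what this step computes, not how Gnocchi implements it. It reads "3x8" as 3 granularities times 8 aggregation methods, which is an assumption; the granularities (minute, hour, day) and the method names are chosen for illustration, and `aggregate` and `pct95` are hypothetical helpers.

```python
import statistics
from collections import defaultdict

GRANULARITIES = (60, 3600, 86400)  # assumed: minute / hour / day, in seconds


def pct95(values):
    """95th percentile using linear interpolation between closest ranks."""
    ordered = sorted(values)
    pos = 0.95 * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


METHODS = {
    "mean": statistics.mean,
    "min": min,
    "max": max,
    "sum": sum,
    "count": len,
    "median": statistics.median,
    "std": lambda v: statistics.stdev(v) if len(v) > 1 else 0.0,
    "95pct": pct95,
}


def aggregate(measures, granularities=GRANULARITIES, methods=METHODS):
    """measures: list of (epoch_seconds, value) pairs.

    Returns {(granularity, method): {bucket_start: aggregated_value}}.
    """
    result = {}
    for granularity in granularities:
        buckets = defaultdict(list)
        for ts, value in measures:
            buckets[ts - ts % granularity].append(value)  # floor to bucket start
        for name, func in methods.items():
            result[(granularity, name)] = {
                start: func(vals) for start, vals in sorted(buckets.items())
            }
    return result


if __name__ == "__main__":
    measures = [(i * 60, float(i)) for i in range(60)]  # 60 points, one per minute
    series = aggregate(measures)
    print(len(series), "aggregated series;", "hourly mean:", series[(3600, "mean")])
```

Every incoming batch touches all 24 series, which is why the time per metric reflects computation as well as scheduling and IO.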

average time reflects a combination of scheduling efficiency, computation efficiency and IO performance.

Page 19: Gnocchi v4 - past and present

test case 3

500 resources, 20 metrics each. flood Gnocchi with 720 batched points per metric. 10K calls/run. run it a few times.

Page 20: Gnocchi v4 - past and present

time to aggregate all measures according to policy. (lower is better)

processing time

v4 tests use 18 metricd for 3x8 aggregates/metric. v2 and v3 tests use 72 metricd

Page 21: Gnocchi v4 - past and present

aggregation time

time to aggregate 720 measures of a metric into 3x8 aggregates (lower is better)

computation efficiency improved for larger series. ~3x improvement for 60 points and ~6x improvement for 720 points

Page 22: Gnocchi v4 - past and present

some more numbers

peep this...

Page 23: Gnocchi v4 - past and present

time to aggregate metric with varying unbatched measure sizes (lower is better)

processing time

numbers represent optimal performance. benchmark was taken under zero load.

Page 24: Gnocchi v4 - past and present

time to retrieve a single time series using curl and client (lower is better)

query time

client overhead attributed to, but not limited to, formatting

no significant performance difference vs v3

Page 25: Gnocchi v4 - past and present

time to aggregate all measures according to default ‘medium’ policy. (lower is better)

default configurations

v3 tests use 54 metricd.
v4 tests use 18 metricd.

- v3 medium policy:
  - minute/hourly/daily rollups
  - 8 aggregates each

- v4 medium policy:
  - minute/hourly rollups
  - 6 aggregates each
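The two defaults differ in how much work each metric generates, which matters when reading this comparison. A small sketch of the arithmetic, assuming "aggregates each" means aggregation methods per rollup (the dictionary layout is illustrative, not a Gnocchi data structure):

```python
policies = {
    "v3 medium": {"rollups": ["minute", "hourly", "daily"], "aggregates_each": 8},
    "v4 medium": {"rollups": ["minute", "hourly"], "aggregates_each": 6},
}

# aggregated series maintained per metric = rollups x aggregates per rollup
series_per_metric = {
    name: len(policy["rollups"]) * policy["aggregates_each"]
    for name, policy in policies.items()
}

for name, count in series_per_metric.items():
    print(f"{name}: {count} aggregated series per metric")
```

Under that reading, the v4 default maintains half as many series per metric as the v3 default, so part of the gap in default-configuration numbers comes from the policy itself rather than from the engine.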

Page 26: Gnocchi v4 - past and present

thanks!

Any questions?

You can find me at
@gord_chung
gord[at]live.ca


Page 27: Gnocchi v4 - past and present

Credits

Special thanks to all the people who made and released these awesome resources for free:
□ Presentation template by SlidesCarnival