
b10b mongodb

Tue, Dec 5, 2017

Configuration

Benchmark suite: b10b
Classifier: 171201-1259_3nuxeo-c4.2xlarge_4mongo-m4.xlarge_3es-c4.2xlarge
Distribution: http://community.nuxeo.com/static/snapshots/nuxeo-server-tomcat-9.10-SNAPSHOT.zip
Date: 2017-12-05 12:21:15
Backend: mongodb
Nuxeo nodes: 3 * c4.2xlarge
Database nodes: 4 * m4.xlarge
Elasticsearch nodes: 3 * c4.2xlarge
Number of documents: 100000000

Overview


The goal is to benchmark a medium volume of 100 million documents on Nuxeo 9.10-SNAPSHOT and then run the usual Gatling benchmark suite on top of it.

The database is a MongoDB sharded cluster of 4 nodes with a total of 16 vCPUs.

Elasticsearch 5.6.4 is set up as a cluster of 3 nodes with a total of 24 vCPUs.

Kafka 1.0.0 is set up on a single node with 4 vCPUs.

Nuxeo 9.10-SNAPSHOT (2017-12-03) runs in cluster mode on 3 nodes with a total of 24 vCPUs.

Nuxeo is tuned to use the StreamWorkManager based on Kafka; Redis is used for DBS invalidation.

The Ansible scripts that run this benchmark are available and described here: https://github.com/bdelbosc/b10bpoc/tree/Benchmark-100m-Nuxeo-9.10-SNAP

Mass import

The import is done using 3 Nuxeo nodes. It uses the new nuxeo-importer-stream, which generates random source documents and puts them into Kafka.
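
For illustration only, a minimal sketch of this pattern (random JSON document payloads pushed to a Kafka topic). The topic name, fields and serialization below are assumptions, not the actual nuxeo-importer-stream implementation:

    # Sketch: push randomly generated document payloads into Kafka.
    # Uses kafka-python; topic name and document fields are made up for the example.
    import json
    import random
    import uuid
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="kafka:9092",
        value_serializer=lambda doc: json.dumps(doc).encode("utf-8"),
    )

    for i in range(1000):
        doc = {
            "name": f"doc-{uuid.uuid4()}",
            "type": "File",
            "title": f"Random document {i}",
            "size": random.randint(1, 10_000),
        }
        producer.send("import-doc", doc)

    producer.flush()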

It took 1h25min to reach 100 million documents, an average of 19607 docs/s.

Relative to the number of MongoDB vCPUs, this gives 1225 docs/s per vCPU.
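
A quick back-of-the-envelope check of these figures (the 1h25min duration is taken as exactly 85 minutes, hence the small rounding difference):

    # Import rate sanity check from the reported numbers.
    docs = 100_000_000
    duration_s = 85 * 60           # 1h25min
    mongo_vcpus = 4 * 4            # 4 * m4.xlarge, 4 vCPUs each

    rate = docs / duration_s       # ~19608 docs/s (report: 19607)
    per_vcpu = rate / mongo_vcpus  # ~1225 docs/s per vCPU
    print(f"{rate:.0f} docs/s, {per_vcpu:.0f} docs/s per vCPU")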

The MongoDB storage size is 52.8 GiB (including 19.4 GiB for the indexes) spread over the 4 nodes; the average document size is 1058 bytes.
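
As a side note, figures of this kind can be read back from MongoDB itself. A hedged sketch with pymongo, assuming the Nuxeo repository collection is named "default" in a "nuxeo" database (both names are assumptions):

    # Sketch: read storage and index sizes for the Nuxeo repository collection.
    # Database and collection names ("nuxeo", "default") are assumptions.
    from pymongo import MongoClient

    client = MongoClient("mongodb://mongos:27017")
    stats = client["nuxeo"].command("collStats", "default")
    print("storage:", stats["storageSize"], "bytes")
    print("indexes:", stats["totalIndexSize"], "bytes")
    print("avg doc:", stats["avgObjSize"], "bytes")
    print("count:  ", stats["count"])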

[Figure: import steps]

[Figure: import rate]

Elasticsearch indexing

Indexing was run in one shot and took 1h43min, an average of 15576 docs/s.

Relative to the number of Elasticsearch vCPUs, this gives 649 docs/s per vCPU.

[Figure: indexing rate]

The Elasticsearch storage size is 117 GiB across the 3 nodes, with a total of 12 shards and no replication.

The settings are slightly adapted: index.refresh_interval is set to 30s to reduce disk IO, and html_strip is removed from the fulltext analyzer because of its high CPU cost and because it brings nothing for plain text data. The following flame graph shows that a lot of time is spent in CustomAnalyzer::initReader, which loops over the char filters set up by html_strip:

[Figure: html_strip cost flame graph]
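
For reference, a minimal sketch of how the refresh interval tweak can be applied through the Elasticsearch settings API. The index name "nuxeo" and the URL are assumptions; the html_strip removal itself belongs in the Nuxeo index analyzer definition and is not shown here:

    # Sketch: raise the refresh interval on the Nuxeo index to reduce disk IO.
    # Index name ("nuxeo") and URL are assumptions.
    import requests

    resp = requests.put(
        "http://elastic:9200/nuxeo/_settings",
        json={"index": {"refresh_interval": "30s"}},
    )
    resp.raise_for_status()
    print(resp.json())  # expected: {"acknowledged": true}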

Gatling tests

The first run revealed some regressions; fixes are in progress and this step will be replayed soon.

Monitoring

Some Grafana snapshots:

Gatling benchmark

Average response time in ms (mean and standard deviation) per simulation; the Import results are throughput-based and detailed below:

Create: 31.3 (48)
Read: 14 (20)
Search: 96.7 (158)
Update: 44.2 (40)
CRUD: 32 (33)

Details

Mass import

This simulation runs the nuxeo-importer on top of the already loaded instance.
Throughput sync: 184.3 docs/s
Documents imported: 100000
Total duration: 550.8 s
Duration sync: 549706 ms
Residual duration async: 960 ms
Reports: Overview and monitoring, Gatling report

Create document using REST

This simulation creates new documents without attachments on top of the existing instance.
Throughput sync: 2347.6 docs/s
Response time (ms): mean 31.3, std dev 48, min 4, p50 20, p95 78
Documents created: 90341
Errors: 0
Concurrency: 88
Duration sync: 38.5 s
Residual duration async: 1.1 s
Reports: Overview and monitoring, Gatling report
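
For context, creating a document through the Nuxeo REST API looks roughly like the sketch below; the base URL, credentials and target path are placeholders, and the Gatling simulation randomizes these requests rather than hand-writing them:

    # Sketch: create a File document under a workspace via the Nuxeo REST API.
    # Base URL, credentials and parent path are placeholders.
    import requests

    base = "http://nuxeo:8080/nuxeo/api/v1"
    auth = ("Administrator", "Administrator")

    payload = {
        "entity-type": "document",
        "name": "my-file",
        "type": "File",
        "properties": {"dc:title": "My file",
                       "dc:description": "created for the benchmark"},
    }
    resp = requests.post(f"{base}/path/default-domain/workspaces/bench",
                         json=payload, auth=auth)
    resp.raise_for_status()
    doc = resp.json()
    print(doc["uid"], doc["path"])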

Read using REST

This simulation reads random documents and folders with different kinds of metadata.
Throughput: 3934.5 req/s
Response time (ms): mean 14, std dev 20, min 0, p50 8, p95 43
Requests: 716125
Errors: 0
Concurrency: 88
Duration: 182 s
Reports: Overview and monitoring, Gatling report

Navigation using JSF

This simulation uses the JSF interface to navigate through folder and document tabs.
Throughput: 47.3 req/s
Response time (ms): mean 166.8, std dev 130, min 0, p50 149, p95 333
Requests: 8606
Errors: 0
Concurrency: 8
Duration: 181.9 s
Reports: Overview and monitoring, Gatling report

Search using REST

This simulation performs random NXQL searches relying on Elasticsearch.
Throughput: 894.2 req/s
Response time (ms): mean 96.7, std dev 158, min 8, p50 87, p95 150
Requests: 162775
Errors: 0
Concurrency: 88
Duration: 182 s
Reports: Overview and monitoring, Gatling report
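
The kind of request behind this simulation is an NXQL query sent to the REST search endpoint. A hedged sketch, assuming the standard search endpoint form; the query text, base URL and credentials are arbitrary examples:

    # Sketch: run an NXQL full-text query through the Nuxeo search endpoint.
    # Base URL and credentials are placeholders; results come back as a paged list.
    import requests

    base = "http://nuxeo:8080/nuxeo/api/v1"
    auth = ("Administrator", "Administrator")

    params = {
        "query": "SELECT * FROM Document WHERE ecm:fulltext = 'benchmark'",
        "pageSize": 20,
    }
    resp = requests.get(f"{base}/search/lang/NXQL/execute", params=params, auth=auth)
    resp.raise_for_status()
    result = resp.json()
    print(len(result["entries"]), "documents on the first page")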

Update using REST

This simulation performs concurrent updates of document metadata using REST.
Throughput sync: 1914.9 req/s
Response time (ms): mean 44.2, std dev 40, min 5, p50 36, p95 91
Requests: 348643
Errors: 0
Concurrency: 88
Duration sync: 182.1 s
Residual duration async: 7.3 s
Reports: Overview and monitoring, Gatling report

CRUD on documents using REST

This simulation performs concurrent Create, Read, Update, and Delete operations on documents using REST.
Throughput sync: 2392.4 req/s
Response time (ms): mean 32, std dev 33, min 0, p50 25, p95 78
Delete response time (ms): mean 52.2, std dev 39, min 0, p50 45, p95 103
Requests: 302092
Errors: 3.36
Concurrency: 88
Duration sync: 122 s
Residual duration async: 191.2 s
Reports: Overview and monitoring, Gatling report
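
A rough hand-written equivalent of one CRUD iteration against the REST API, with the same placeholder base URL, credentials and parent path as in the create sketch above; the actual simulation randomizes paths and payloads:

    # Sketch: one create/read/update/delete cycle via the Nuxeo REST API.
    # Base URL, credentials and parent path are placeholders.
    import requests

    base = "http://nuxeo:8080/nuxeo/api/v1"
    auth = ("Administrator", "Administrator")

    # Create
    doc = requests.post(f"{base}/path/default-domain/workspaces/bench",
                        json={"entity-type": "document", "name": "crud-doc",
                              "type": "File",
                              "properties": {"dc:title": "CRUD doc"}},
                        auth=auth).json()
    uid = doc["uid"]

    # Read
    requests.get(f"{base}/id/{uid}", auth=auth).raise_for_status()

    # Update
    requests.put(f"{base}/id/{uid}",
                 json={"entity-type": "document",
                       "properties": {"dc:description": "updated"}},
                 auth=auth).raise_for_status()

    # Delete
    requests.delete(f"{base}/id/{uid}", auth=auth).raise_for_status()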

Benchmark mixing actions

This simulation performs JSF navigation and REST CRUD operations concurrently.

Throughput: 101.6 req/s
Response time (ms): mean 21.1, std dev 53, min 0, p50 4, p95 141
Requests: 18489
Errors: 0
Concurrency: 95
Duration: 181.9 s
Reports: Overview and monitoring, Gatling report