Do you have published data on performance metrics?

I’m basically pondering these questions:

  • What’s the time to write to the document store + transaction log?
  • Can the indexer keep up with transaction writes?
  • What’s the indexer performance?

Or, broken down:

  • writes: What’s the max docs/sec (per doc size)?
  • indexer: What’s the indexer latency at the max write docs/sec?
  • reads: What’s the read performance per graph shape (triangle, square, star, etc.)?

I can see the performance benchmark suites here, but are the results published anywhere?
https://docs.xtdb.com/resources/performance/

Hey @twashing, thanks for your questions!

I would be happy to share some ballpark numbers, but a lot of variables have to be factored in beyond the raw bytes/sec of the various components or even the underlying hardware/network (application-dependent knowledge of document shapes, complexity of constraints, match write contention, etc.). As ever with these things, it’s safest to measure and extrapolate for the given use case.
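
For example, a rough sketch along these lines (purely illustrative, assuming `node` is an already-started XTDB node, and using placeholder documents and attribute names) can give you an end-to-end docs/sec figure for your own document shapes:

```clojure
(require '[xtdb.api :as xt])

(defn bench-writes
  "Submits n synthetic documents in batches, waits for the indexer to
  catch up, and reports end-to-end docs/sec (submission + indexing)."
  [node n batch-size]
  (let [docs    (for [i (range n)]
                  ;; placeholder document shape - substitute your own
                  {:xt/id (keyword (str "doc-" i)) :payload (str "value-" i)})
        start   (System/nanoTime)
        last-tx (reduce (fn [_ batch]
                          (xt/submit-tx node (mapv (fn [d] [::xt/put d]) batch)))
                        nil
                        (partition-all batch-size docs))]
    ;; block until the local indexer has caught up with the final transaction
    (xt/await-tx node last-tx)
    (let [secs (/ (- (System/nanoTime) start) 1e9)]
      {:docs n :seconds secs :docs-per-sec (/ n secs)})))

;; e.g. (bench-writes node 100000 1000)
```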

Using our typical benchmark setup (m5.xlarge with default RocksDB configs), during TPC-H bulk loading we consistently observe >50K AVs (Attribute-Value pairs) per second, which equates to double-digit MB/s. Note that this is really a measure of indexing throughput against a local disk, because that is ordinarily the bottleneck (not the writes to the tx-log). Write latency will normally be much more network-sensitive and will depend on the levels of HA/durability/distribution you need, but it shouldn’t really change with the throughput workload unless you always need transactional (synchronous, logical) confirmation of those writes using await-tx / tx-committed?. In that case there would be some degradation whilst RocksDB’s LSM tree does its thing, and you may want to tune RocksDB appropriately.
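
To make that distinction concrete, here’s a minimal sketch (again assuming an already-started `node` and a throwaway document): submit-tx returns once the transaction is on the tx-log, while await-tx and tx-committed? are the synchronous confirmation steps mentioned above:

```clojure
(require '[xtdb.api :as xt])

;; asynchronous by default: returns as soon as the tx is written to the tx-log
(def tx (xt/submit-tx node [[::xt/put {:xt/id :user-1 :name "Ada"}]]))

;; synchronous confirmation: block until the local node has indexed the tx
;; (this is where indexing throughput shows up as latency under load)
(xt/await-tx node tx)

;; logical confirmation: did the tx actually commit?
;; (would be false if, e.g., a ::xt/match precondition had failed)
(xt/tx-committed? node tx)
```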

Read performance is also highly workload-dependent, but LMDB will almost always offer the best read performance (>3x RocksDB). XTDB is not (currently) engineered to be particularly efficient at TPC-H-shaped analytical workloads, but for WatDiv-style cyclic queries with uniformly randomised distributions the performance is typically better than what most OLTP (transactional) engines can offer. Nothing is absolute, though, and XT won’t always pick the optimal join order (which is usually what dominates real-world performance).
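
For reference, here’s a hedged sketch of both points: swapping the index store’s kv-store over to LMDB (the module names below are the standard xtdb-lmdb / xtdb-rocksdb ones, but do check the docs for your version), plus a triangle-shaped Datalog query of the kind you asked about (`:follows` is a made-up attribute):

```clojure
(require '[xtdb.api :as xt]
         '[clojure.java.io :as io])

;; LMDB for the query indexes (read-heavy), RocksDB for docs and tx-log
(def node
  (xt/start-node
   {:xtdb/index-store    {:kv-store {:xtdb/module 'xtdb.lmdb/->kv-store
                                     :db-dir (io/file "/tmp/xtdb/indexes")}}
    :xtdb/document-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                                     :db-dir (io/file "/tmp/xtdb/docs")}}
    :xtdb/tx-log         {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                                     :db-dir (io/file "/tmp/xtdb/tx-log")}}}))

;; triangle-shaped query: a follows b, b follows c, c follows a
(xt/q (xt/db node)
      '{:find  [a b c]
        :where [[a :follows b]
                [b :follows c]
                [c :follows a]]})
```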

“are results published anywhere?”

Not currently, but I’d be happy to help you run the benchmarks yourself, or I can share some recent samples with you if you’d like to find out more: jdt@juxt.pro