V2 AWS cluster performance expectations?

Given how early we are in the product cycle for v2, is it a fair question to ask about performance expectations of XTDB in the cloud (AWS)?

I’m experimenting with the AWS cluster I stood up and seeding it with the test data we use locally for dev/test – about 1,000 member profiles.

Pushing 1,000 profiles, one at a time, into a local Docker XTDB instance takes about 8-9 seconds.

Pushing 1,000 profiles, one at a time, into XTDB @ AWS takes about 3 minutes.

Thoughts?

Hey Sean - even if it’s not a fair question, it’s no doubt one many others will also ask :slightly_smiling_face:

Short answer is yes, it’s still very early in the product cycle - our aim with this initial early access release has very much been to get something out that’s performant enough for people to play with, so they can get a feel for what it’s like to work with.

That said:

  • We’ve certainly seen (much) faster than this in the past - we’ve been looking at a good multiple faster than XT1, especially on ingest - but we’ve not had a performance bash in a while, so could well be a regression.
  • Assuming you’re inserting with SQL, this currently involves taking a watermark of the database in order to run the SQL query. If you can, I’d try inserting with put operations instead - these don’t need to read the database at all, and hence are significantly quicker. (We plan to put in a fast-path for SQL insert which will do the same, as well as further optimising the snapshotting.) There’s a rough sketch of both styles after this list.
  • If you’re inserting one at a time, are you waiting for each transaction to be submitted and indexed before submitting the next one? If you do need to do that, some degree of parallelism would likely help to offset the latency - see the second sketch after this list. (If you can fire and forget using the async variant, that should also improve matters.)
  • On Kafka, this will likely be worse, because we poll at 100ms intervals by default (IIRC) - ok for batches and parallel submission, but a fair amount of latency if you’re running in serial.
  • More generally, we’ve not optimised the OLTP side as much as the OLAP yet (which, again, is showing promising early results). Specifically, we haven’t paid as much attention to per-batch, per-tx or per-query overheads as we have to actually churning through the data, but with OLTP these form a much bigger proportion of the overall runtime - there’s quite a lot of low hanging fruit here.
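For what it’s worth, here’s a minimal sketch of the two submission styles from the second bullet, written against the v2 Clojure API. The tx-op shapes (:put-docs, :sql, the _id column name and how parameters are passed) have moved around between the early-access releases, so treat the exact names here as assumptions rather than the definitive API:

```
(require '[xtdb.api :as xt])

(defn seed-with-sql
  "One SQL INSERT per profile - each tx has to snapshot the DB to run the SQL."
  [node profiles]
  (doseq [p profiles]
    (xt/submit-tx node [[:sql "INSERT INTO profiles (_id, name) VALUES (?, ?)"
                         [(:id p) (:name p)]]])))

(defn seed-with-puts
  "Document puts - no read of the DB is needed, so these should be noticeably cheaper."
  [node profiles]
  (doseq [p profiles]
    (xt/submit-tx node [[:put-docs :profiles (assoc p :xt/id (:id p))]])))
```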
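And on the third bullet: if you do have to seed this kind of data, batching several profiles into one transaction, or keeping a few submissions in flight at once, should go a long way towards hiding the per-tx latency. Again, a sketch under the same assumptions as above (I’ve not named the async submit variant here, as the sketch only relies on the blocking one):

```
(defn seed-batched
  "All 1,000 profiles in ~10 transactions instead of 1,000."
  [node profiles]
  (doseq [batch (partition-all 100 profiles)]
    (xt/submit-tx node (vec (for [p batch]
                              [:put-docs :profiles (assoc p :xt/id (:id p))])))))

(defn seed-parallel
  "Keep a handful of submissions in flight at once; submit-tx doesn't wait for indexing."
  [node profiles]
  (->> profiles
       (pmap (fn [p]
               (xt/submit-tx node [[:put-docs :profiles (assoc p :xt/id (:id p))]])))
       (dorun)))
```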

tl;dr is that there’s still lots of room for improvement that we’re aware of here, and probably even more opportunities that we’re not yet aware of - but performance will be a large proportion of our work this year.

HTH :slightly_smiling_face:

James

I’ll wait for the fast-path insert to drop before I get too worried about performance – but I take it from that that a SQL UPDATE will also read the DB and then put the updated record? In which case, we probably want to keep the initial SELECT data around and then do a put (or insert :slight_smile: ) directly instead of an UPDATE – since we nearly always read first and then update.
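A minimal sketch of that read-then-put pattern, assuming the profile map from the initial SELECT is still in hand and the same :put-docs op shape as above - note that a put replaces the whole document, so every field we want to keep has to be in the map we write back:

```
(defn save-profile-changes
  "Merge the edits into the previously-read profile and put the whole doc back,
   rather than issuing a SQL UPDATE (which would read the DB again)."
  [node profile changes]
  (xt/submit-tx node [[:put-docs :profiles (merge profile changes)]]))
```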

Nope. Just plain SQL insert and onto the next one.

We’re looking to substantially refactor a bunch of code around DB access, so we’ll bear in mind that async put will be something to look at.

Our current apps have pretty high write volume, so it sounds like we’ll really want to do as much of the writing to XTDB as possible in a separate thread pool?
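Something along these lines is roughly what I have in mind - a dedicated pool for submissions so request threads aren’t blocked on write latency (the pool size and the save-profile! name are just placeholders):

```
(import '(java.util.concurrent Executors ExecutorService))

(defonce ^ExecutorService write-pool (Executors/newFixedThreadPool 8))

(defn save-profile!
  "Hand the XTDB submission off to the write pool; returns a j.u.c.Future."
  [node profile]
  (.submit write-pool
           ^Runnable (fn [] (xt/submit-tx node [[:put-docs :profiles profile]]))))
```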

We have a lot of refactoring to do in preparation for any real XTDB trial so this is all useful information to feed into that.