Scaling out and going to production with XTDB

Hello!

I’ve chosen XTDB 1.x for a service that will gradually be rolled out in production next quarter. I believe I don’t need to list all the great features that made XTDB the obvious choice, but most of all I’d say it’s fun and enjoyable to use – in itself a perfectly valid reason to use it :slight_smile:

Project context

For various cultural reasons and limitations, this service currently runs on a single instance, with the document store and transaction log backed by one RDS instance. We are in AWS; the service runs on EKS / Kubernetes. We have no persistent index store and no checkpoint store. Using AWS services other than databases may be allowed, but requires non-trivial persuasion skills.

I’m wondering if anyone could give some feedback from real-life experience:

  • For better availability and larger scale:
    • I want to have three instances, not only one. After every DB write we synchronously block until the transaction has been committed.
      • Is this the right way to prevent any XTDB node from getting out of sync?
      • Am I right in understanding that XTDB with JDBC loads everything into memory to rebuild the index store? As data grows, I guess that might become an issue – is the behaviour the same with, say, LMDB or RocksDB?
    • Also, I don’t think I’ll need anything like RDS Proxy, but is XTDB known to work with it?
  • For better durability and less operational overhead I’d like to use AWS Aurora – any experience using it with XTDB? Would you foresee any issues?
  • For disaster recovery I’d like to use Aurora with the MySQL dialect, connecting to it over JDBC, since Aurora/MySQL enables cross-region replication. The documentation draws no distinction between MySQL and PostgreSQL – any quirks to know about, or is it fully transparent?
  • For faster startup and queries I’d like to persist the index store and add a checkpoint store on one shared NFS volume mounted on each instance. Do I understand correctly that these stores may be shared across instances, or does that risk data corruption? (I’ve sketched the node configuration I have in mind just after this list.)
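To make the last two points concrete, here is roughly the node configuration I have in mind – entirely untested on my side so far, and the host name, credentials, paths, and checkpoint frequency are all placeholders:

```clojure
(ns my-service.xtdb-node
  (:require [clojure.java.io :as io]
            [xtdb.api :as xt])
  (:import (java.time Duration)))

(def node
  (xt/start-node
   {;; tx-log and doc-store on Aurora via JDBC, MySQL dialect
    :xtdb.jdbc/connection-pool
    {:dialect {:xtdb/module 'xtdb.jdbc.mysql/->dialect}
     :db-spec {:host     "my-cluster.cluster-xxxx.eu-west-1.rds.amazonaws.com"
               :dbname   "xtdb"
               :user     "xtdb"
               :password (System/getenv "XTDB_DB_PASSWORD")}}
    :xtdb/tx-log {:xtdb/module 'xtdb.jdbc/->tx-log
                  :connection-pool :xtdb.jdbc/connection-pool}
    :xtdb/document-store {:xtdb/module 'xtdb.jdbc/->document-store
                          :connection-pool :xtdb.jdbc/connection-pool}
    ;; persistent index store on local disk, checkpointing to the shared NFS mount
    :xtdb/index-store
    {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                :db-dir (io/file "/var/lib/xtdb/indexes")
                :checkpointer {:xtdb/module 'xtdb.checkpoint/->checkpointer
                               :store {:xtdb/module 'xtdb.checkpoint/->filesystem-checkpoint-store
                                       :path "/mnt/nfs/xtdb-checkpoints"}
                               :approx-frequency (Duration/ofHours 6)}}}}))
```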

Of course I will try things out, test, and report back – but any feedback beforehand would definitely be highly appreciated. Also, if XTDB 2.x were recommended for production, I’d be happy to think about how to migrate to it.

Best,
Piotr


Hey @piotr-yuxuan thanks for sharing some of the background with us! And for the kind words/feedback :smiling_face:

submit-tx is blocking in itself (note there is also a submit-tx-async API should you ever want it) - but this only blocks to ensure that the physical writes to the doc-store and tx-log are successful. To make sure a transaction is logically committed you need to use await-tx on a relevant node along with tx-committed?. The transaction processing is deterministic, so every node should arrive at the same conclusion (as long as your transaction functions are pure and deterministic, i.e. don’t do anything non-deterministic like generating UUIDs or making network calls!).
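i.e. something like this minimal sketch, assuming `node` is your local XTDB node (the document itself is just an example):

```clojure
(require '[xtdb.api :as xt])

;; submit-tx blocks until the doc-store and tx-log writes have succeeded
(let [tx (xt/submit-tx node [[::xt/put {:xt/id :user-1, :name "Piotr"}]])]
  ;; await-tx blocks until this node has indexed the transaction
  (xt/await-tx node tx)
  ;; tx-committed? then confirms it was logically committed
  ;; (a transaction can still be rejected, e.g. by a failed ::xt/match)
  (xt/tx-committed? node tx))
```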

The index-store uses an in-memory (on-heap) KV store by default, but you almost certainly want LMDB or RocksDB in production – otherwise you’ll eventually hit OOM.
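e.g. swapping RocksDB in for the index store is a one-module change – the `db-dir` here is just an example path, and `xtdb.lmdb/->kv-store` is the LMDB equivalent:

```clojure
(require '[clojure.java.io :as io])

;; replaces the default in-memory KV store with RocksDB on local disk
{:xtdb/index-store
 {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
             :db-dir (io/file "/var/lib/xtdb/indexes")}}}
```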

XT should work with various JDBC proxies fine, although I would always recommend testing extensively.

I’ve not used Aurora personally so can’t comment on the tradeoffs particularly, but I agree it is a valuable option to consider for many orgs. Certainly Aurora should work in principle, although I have heard a report of some strange behaviour that hasn’t been diagnosed further – see tx-time should only monotonically increase alongside tx-id (JDBC/Postgres) · Issue #1917 · xtdb/xtdb · GitHub. This is likely due to Aurora’s (custom) internal synchronisation / consensus mechanisms, but we’ve not confirmed that yet. It may also be limited to Postgres and work fine with MySQL compatibility (the report only relates to Postgres).

I have nothing concrete to add beyond the above - I’ve not used Aurora enough to understand the real impact of its design - but I would expect it to add non-trivial latency (probably also limiting peak throughput). With any distributed single-writer architecture, like Aurora (IIUC), there’s an unavoidable tension between durability, availability, and peak performance (and cost…) – ideally the infrastructure would let you adjust these tradeoffs dynamically, but I am not sure how many knobs Aurora offers.

We would really value hearing any stories (good or bad!) – and if there’s anything I/we can do to help guide your thinking in the meantime… please don’t hesitate to ask :slight_smile: