What happens when I have kafka configured both as transaction log and document store?
We have this config from a long time ago and I want to know if we need to change it.
Is it in-memory or does it request the data from kafka every time?
How does data get added to it? Is it submitted when I write to the node, or should we push data to Kafka ourselves?
The docs page (Kafka · XTDB Docs) says that it could be due to historical reasons. But what happens if I use it this way?
It doesn’t say why it should be changed, if at all.
What happens when I have kafka configured both as transaction log and document store?
In general it means that you should benefit from low-latency commits when calling submit-tx, assuming Kafka is configured appropriately.
We have this config from a long time ago and I want to know if we need to change it.
We definitely still support using Kafka as a document store, so I don’t think you need to do anything in particular. If you would like to send me your specific config, though, I would be happy to confirm that (jdt@juxt.pro).
Is it in-memory or does it request the data from kafka every time?
The Kafka log cannot be used for ad-hoc lookups, so XT has to materialise a Key-Value store from this log, which happens locally on each node. By default the Key-Value store is in-memory; however, you can configure a persisted local-document-store backed by RocksDB/LMDB - see https://docs.xtdb.com/storage/kafka/#_kafka_as_a_document_store. The risk of leaving this local-document-store in-memory (the implicit default) is that you will hit OOMs unless you definitely have enough memory to fit the entire data set.
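To make the materialisation step concrete, here is a rough conceptual sketch in Python. It is not XTDB's implementation: `DocumentLog`, `Node` and `catch_up` are hypothetical names invented for illustration, and a plain list stands in for the Kafka topic. It only shows why every node ends up holding its own full copy of the documents.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentLog:
    """Append-only log standing in for a Kafka document topic (hypothetical)."""
    records: list = field(default_factory=list)

    def append(self, doc_id, doc):
        # A log can only be appended to and read in order; there is no lookup by key.
        self.records.append((doc_id, doc))
        return len(self.records) - 1  # offset of the new record


@dataclass
class Node:
    """A node that replays the shared log into its own local key-value index."""
    log: DocumentLog
    local_kv: dict = field(default_factory=dict)  # in-memory in this sketch
    next_offset: int = 0

    def catch_up(self):
        # Replay every record not yet seen into the local index.
        for doc_id, doc in self.log.records[self.next_offset:]:
            self.local_kv[doc_id] = doc
        self.next_offset = len(self.log.records)

    def get(self, doc_id):
        # Lookups are served from the local index, never from the log itself.
        return self.local_kv.get(doc_id)


log = DocumentLog()
log.append("doc-1", {"name": "Ada"})
log.append("doc-2", {"name": "Grace"})

node_a, node_b = Node(log), Node(log)
node_a.catch_up()
node_b.catch_up()
```

After catching up, both `node_a` and `node_b` hold every document, so the local index grows with the whole data set on each node. With an in-memory index, that growth is where the OOM risk comes from.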
The reason for the guidance in the docs (“we’d now recommend a different document store implementation”) is that maintaining a full copy of the document store on each node can be quite expensive, and that cost is avoidable with the other options, where documents can be ~cheaply retrieved (and cached) on-demand.
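For contrast, the on-demand approach can be sketched the same way. Again this is only an illustration under assumptions, not how any particular XTDB document store is written: `RemoteDocumentStore` and `CachingReader` are hypothetical names, and the LRU eviction policy is my own choice for the example.

```python
from collections import OrderedDict


class RemoteDocumentStore:
    """Hypothetical shared store that supports direct lookups by key."""

    def __init__(self, docs):
        self._docs = dict(docs)
        self.fetch_count = 0  # counts round trips to the shared store

    def fetch(self, doc_id):
        self.fetch_count += 1
        return self._docs.get(doc_id)


class CachingReader:
    """Node-side reader that keeps only a bounded LRU cache of documents."""

    def __init__(self, remote, max_entries):
        self.remote = remote
        self.max_entries = max_entries
        self.cache = OrderedDict()

    def get(self, doc_id):
        if doc_id in self.cache:
            self.cache.move_to_end(doc_id)  # mark as most recently used
            return self.cache[doc_id]
        doc = self.remote.fetch(doc_id)  # cache miss: fetch on demand
        if doc is not None:
            self.cache[doc_id] = doc
            if len(self.cache) > self.max_entries:
                self.cache.popitem(last=False)  # evict least recently used
        return doc


remote = RemoteDocumentStore({f"doc-{i}": {"n": i} for i in range(1000)})
reader = CachingReader(remote, max_entries=2)
for doc_id in ["doc-1", "doc-1", "doc-2", "doc-3"]:
    reader.get(doc_id)
```

Here the node's memory use is bounded by `max_entries` rather than by the size of the data set, at the price of a remote fetch on each cache miss.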