Questions regarding v2.0 storage architecture

Hey @noelkurian great to hear from you!

Yes, that’s the case for now. We’re hoping that Kafka and Kafka-compatible services (such as Redpanda and Warpstream) are now ubiquitous enough that we don’t need to support further pluggability in the tx log implementation. Anything S3-compatible should be sufficient for object storage.

The tx log is a Write-Ahead Log, which means there will ~always be some novelty stored there which doesn’t yet exist in the object store (and the delay may be minutes or even hours). It is therefore essential for the tx log to be as durable as possible (backed up routinely etc.), otherwise you risk breaking the wider strong consistency and durability guarantees that XTDB offers.
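
To make the durability point concrete, here is a toy model in Python. It is only a sketch of the general write-ahead idea, not XTDB’s implementation or API: `ToyNode`, `submit`, `flush` and `unflushed` are all made-up names, and plain lists stand in for Kafka and the object store.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy model only: hypothetical names, not XTDB's implementation."""
    tx_log: list = field(default_factory=list)        # stands in for Kafka
    object_store: list = field(default_factory=list)  # stands in for S3
    flushed_upto: int = 0  # how many log entries are already in the object store

    def submit(self, tx):
        # A write is accepted as soon as it is appended to the log.
        self.tx_log.append(tx)

    def flush(self):
        # Background job: copy not-yet-flushed entries to the object store.
        self.object_store.extend(self.tx_log[self.flushed_upto:])
        self.flushed_upto = len(self.tx_log)

    def unflushed(self):
        # "Novelty" that currently exists only in the log.
        return self.tx_log[self.flushed_upto:]


def demo():
    node = ToyNode()
    node.submit("tx-1")
    node.submit("tx-2")
    node.flush()
    node.submit("tx-3")  # accepted, but not yet in the object store
    return node.unflushed(), list(node.object_store)
```

In this model, anything returned by `unflushed()` would be gone for good if the log were lost before the next flush, which is why the log needs the same care as the object store.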

We aren’t seeking to pursue such a course currently but we’re always happy to entertain such conversations, especially if the problems XTDB is solving are valuable enough :slight_smile:

Yes, MinIO should work well. We haven’t explicitly tested it yet, but we’d be happy to assist.

The file-based tx log is intended for development only and is not suitable for connecting multiple nodes (even if you can technically make it work with shared filesystems).

Theoretically yes, you should be able to migrate tx logs (local → Kafka, or Kafka A → Kafka B), but please discuss this with us directly before factoring it into your project timelines. I suspect it may already ‘just work’ if you simply pause new submissions to the tx log and wait for the log to be flushed (there’s a configurable background process and schedule) - but that would certainly require the system to be unavailable for new writes for a non-trivial amount of time. A cleaner solution should be possible in future.
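
The ordering I have in mind is roughly the following, shown as a toy Python model. It is purely illustrative: `ToyCluster` and `migrate_log` are hypothetical names, lists stand in for the real log and object store, and the assumption that the new log can start empty once everything is flushed is mine and untested against a real deployment.

```python
class ToyCluster:
    """Toy model of the pause -> flush -> switch ordering (not XTDB code)."""

    def __init__(self):
        self.log = []            # current tx log (e.g. local files or Kafka A)
        self.object_store = []   # stands in for S3
        self.flushed = 0         # log entries already copied to the object store
        self.accepting_writes = True

    def submit(self, tx):
        if not self.accepting_writes:
            raise RuntimeError("writes are paused during migration")
        self.log.append(tx)

    def flush(self):
        self.object_store.extend(self.log[self.flushed:])
        self.flushed = len(self.log)

    def migrate_log(self):
        self.accepting_writes = False          # 1. pause new submissions
        self.flush()                           # 2. wait for the flush (forced here)
        assert self.flushed == len(self.log)   # 3. nothing lives only in the old log
        self.log = []                          # 4. switch to the new, empty log (assumption)
        self.flushed = 0
        self.accepting_writes = True           # 5. resume writes


def demo():
    cluster = ToyCluster()
    cluster.submit("tx-1")
    cluster.submit("tx-2")
    cluster.migrate_log()
    cluster.submit("tx-3")  # lands in the new log
    return list(cluster.object_store), list(cluster.log)
```

The window between steps 1 and 5 is the write downtime mentioned above; in a real system step 2 depends on the flush schedule rather than being instantaneous.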

Thanks for the questions :pray:

Jeremy