Deployment recommendations for small apps

jacobobryant · 16 September 2024 17:59

Had some questions prompted by this thread:

We’re hoping that Kafka and Kafka-compatible services (e.g. things like Redpanda and Warpstream) are now ubiquitous enough that we don’t need to support further pluggability in the tx log implementation. Anything S3-compatible should be sufficient for object storage.

The Postgres TX log option in V1 is nice since it gives XTDB full “coverage” for any size of deployment. For V2, given your current roadmap, what would be your recommendations for a hypothetical developer who’s considering building a side business on, say, DigitalOcean but wants to avoid starting out with their managed Kafka offering (minimum $147/month)?

Just use the filesystem

Maybe just stick with the filesystem TX log until the app is large enough to warrant kafka? This line of reasoning basically. Makes sense to me. The object store would still be managed at least.

The tx log is a Write-Ahead Log which means that there will ~always be some novelty stored there which doesn’t yet exist in the object store (and the delay may be minutes or even hours).

For this hypothetical, the potential for hours of data loss is quite possibly fine. Even so, maybe that could be mitigated further:

can the filesystem tx log be backed up while the system is running?
could the filesystem tx log be streamed to s3/something for backup, similar to sqlite + litestream?
could there be a setting to force the tx log to be flushed to object storage more often, say every 5 minutes?
or is a delay of hours something that would only occur for systems under high load? Maybe for this hypothetical, the delay would likely be small anyway and there’s no need to worry about any additional backup other than having the managed object store.

self-host redpanda

My only reservation about “Just use the filesystem” would be not having an escape hatch (other than paying a lot for managed Kafka) in case there is some indie-developer scenario where it really would be best to have multiple servers. I learned about Redpanda for the first time about an hour ago; seems like that could be a good option here? E.g. set up a single-node cluster (apparently 2GB of RAM is the minimum, which is cheaper than a DO managed Postgres instance anyway) and ideally have it stream to S3 or something for backup, same as mentioned above. Probably can just use their Docker image.

Maybe for both this + the file system scenario, backing up the TX log could just be done in application space. I.e. if XT V2 has a listen api, the application could use that to send TX log items somewhere.

give up

Maybe XTDB just isn’t a good fit for the solo developer use case. I don’t hold that opinion since the two options above both seem practical, but if the XTDB team ever comes to that opinion, that’s totally fine and I’d love to know sooner rather than later.

So yeah, would be interested in whatever thoughts you have on all that, anything I haven’t thought of, etc

jacobobryant · 16 September 2024 18:08

Related question: are equivalents to V1’s listen and/or open-tx-log currently on the roadmap? With V2’s transient TX log, would it still be possible to traverse over the entire “logical” TX log (i.e. by traversing both the data in the object store + the physical TX log)?

refset · 16 September 2024 19:34

First up, any database (and software more generally) shouldn’t struggle to “scale down” if it reasonably can. So if a solo developer can’t make XTDB work then we have a problem to solve.

Admittedly it’s not priority #1 for us right now though, while we’re pre-GA and prioritising work that’s needed for projects with our Design Partners (none of whom are solo developers so far).

By betting on the Kafka API we’re essentially putting the work of “how far can this scale down” onto the likes of Redpanda - so that would be my first port of call. I haven’t looked deeply into their stuff to know what’s ideal, but if pushed to make a decision right now for my own v2 solo dev project I would:

Have an XT node and a single Redpanda development node on the same machine (I bet their minimum requirements can actually go much lower, fwiw)
Have a second single Redpanda development node on a separate machine, as isolated as possible (different tenant or cloud provider, even), and which replicates from that first node using some standard Kafka topic replication tool
Take periodic backups of the Redpanda storage from this second machine, onto S3

This wouldn’t be delivering HA at all, and you may lose a (configurable) few milliseconds of writes due to the replication lag, but it’s probably good enough to get your funding or grow big enough to eventually justify the increased number of machines needed for a proper prod setup. From a quick skim of your link and the Litestream docs I also don’t think this is terribly different from what Litestream offers (minus the work to do periodic backups to S3).
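As a rough illustration of the "periodic backups" step, here is a minimal Python sketch that archives a data directory into a timestamped tarball and hands it to an uploader. Everything named here is hypothetical: `snapshot_dir`, `backup`, the `redpanda-` filename prefix, and the `upload` callable (which in practice might wrap whatever S3 client you use). It also assumes the directory can be copied safely at that moment; archiving a live broker's storage may not give a consistent snapshot, so treat this as a starting point only.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Optional


def snapshot_dir(data_dir: Path, out_dir: Path,
                 now: Optional[datetime] = None) -> Path:
    """Write data_dir into a timestamped .tar.gz under out_dir.

    out_dir should live outside data_dir, otherwise the archive
    would try to include itself.
    """
    now = now or datetime.now(timezone.utc)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: redpanda-<UTC timestamp>.tar.gz
    archive = out_dir / f"redpanda-{now:%Y%m%dT%H%M%SZ}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(data_dir, arcname=data_dir.name)
    return archive


def backup(data_dir: Path, out_dir: Path,
           upload: Callable[[Path], None]) -> Path:
    """Snapshot data_dir, then pass the archive to the uploader.

    `upload` is a placeholder for the actual transfer to object
    storage (for example a thin wrapper around an S3 client).
    """
    archive = snapshot_dir(data_dir, out_dir)
    upload(archive)
    return archive
```

Running `backup` from cron or a systemd timer on the second machine would cover the "periodic" part; retention and restore testing are left out of this sketch.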

On forcing the tx log to be flushed to object storage more often: we actually already have this, since Add stagnant log flushing process by wotbrew · Pull Request #2637 · xtdb/xtdb · GitHub

Much longer term we would like to remove Kafka as a dependency, but again, it’s not a priority for the time being. I’m actually rather hoping that Tigerbeetle might be a convenient drop in component for filling that consensus replication role by the time we get around to it (their team has strongly indicated before that they want to serve more use cases than just credits and debits!).

On equivalents to listen / open-tx-log: yes, but not on the short-term horizon, see: Users can subscribe to a stream of completed transactions · Issue #2454 · xtdb/xtdb · GitHub (though it could do with a refresh soon). For now you should rely on polling for changes by querying the xt.txs table (and joining against _system_from / storing your own metadata about which tables/IDs are being updated). Technically you could subscribe to the Kafka tx-log, but since it’s much more of an implementation detail in v2 (especially if plans to migrate away do happen) please don’t.
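To make the polling suggestion concrete, here is a minimal Python sketch of a loop that remembers the last transaction id it has seen and asks only for newer rows. The SQL text, the table name as written here (`xt.txs`), the `_id` and `system_time` columns, and the assumption that transaction ids are increasing integers are all assumptions rather than confirmed API. `run_query` is a hypothetical callable standing in for whatever database client you use, and `poll_once` / `poll_forever` are illustrative names.

```python
import time
from typing import Any, Callable, Iterable, List, Optional, Tuple

Row = Tuple[int, Any]  # (transaction id, system time)

# Assumed query shape; table and column names are assumptions.
NEW_TXS_SQL = (
    "SELECT _id, system_time FROM xt.txs "
    "WHERE _id > %s ORDER BY _id"
)


def poll_once(run_query: Callable[[str, tuple], Iterable[Row]],
              last_seen_id: int) -> Tuple[List[Row], int]:
    """Fetch transactions newer than last_seen_id.

    Returns the new rows and the updated high-water mark.
    """
    rows = list(run_query(NEW_TXS_SQL, (last_seen_id,)))
    if rows:
        last_seen_id = max(row[0] for row in rows)
    return rows, last_seen_id


def poll_forever(run_query: Callable[[str, tuple], Iterable[Row]],
                 handle: Callable[[Row], None],
                 interval_s: float = 5.0,
                 last_seen_id: int = -1,
                 max_polls: Optional[int] = None) -> int:
    """Poll repeatedly, passing each new row to handle().

    max_polls bounds the loop (useful in tests); None means run
    until interrupted. Returns the last transaction id seen.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        rows, last_seen_id = poll_once(run_query, last_seen_id)
        for row in rows:
            handle(row)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval_s)
    return last_seen_id
```

Persisting `last_seen_id` somewhere durable between restarts is what lets the application resume without reprocessing or skipping transactions.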

Hope that helps!

jacobobryant · 16 September 2024 20:13

Thanks, that is helpful!

seancorfield · 28 November 2024 20:44

I wonder if Bufstream would be an option?

jacobobryant · 29 November 2024 18:21

Looks really interesting, and the name is perfect. I poked around a bit and the main deployment instructions I could find were for AWS and GCP, and both had kubernetes as requirements. But there’s a demo for running it locally. So maybe that could be adapted. If I can get it deployed to a single VM and have it use S3 for persistence, that would be excellent.

refset · 20 February 2025 10:53

Noting the existence of Tansu - “A drop-in replacement for Apache Kafka with PostgreSQL and S3 storage engines”

This means you could potentially run v2 with just S3. Or S3 + some managed Postgres service. (+ another container, which introduces its own latency and HA complications)
