A question about RocksDB data files

So /var/lib/xtdb is the default location for data when using the Docker image?

That’s right, yep, as defined here: https://github.com/xtdb/xtdb/blob/f7102e542c735e7db00867a3992abf913e930e34/build/docker/Dockerfile#L7

I assume I can parallelize this? Though if a batch input of 30K records is fast enough, I may not have to. Preserving sequential transaction time is kind of important, but for my PoC it’s more of a nice-to-have if parallelizing would populate the store faster.

As per my reply on the other thread (Parallelizing data loading, processing large query results), XT doesn’t have an explicit mechanism for parallel import, and sequential transaction time can’t be avoided using the public APIs. That said, if you really wanted to explore advanced custom options, there are some interesting possibilities in theory, e.g. see https://rockset.com/blog/optimizing-bulk-load-in-rocksdb/ - but hopefully the default serial performance is sufficient for now.
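For the serial path, the usual shape is to submit the 30K records in fixed-size batches, one transaction per batch, so transaction order is preserved while amortizing per-transaction overhead. A minimal language-agnostic sketch in Python, where `submit_batch` is a hypothetical placeholder for whatever client call you actually use (e.g. a POST to XTDB's HTTP submit-tx endpoint):

```python
# Sketch only: sequential bulk load in fixed-size batches.
# `submit_batch` is a placeholder for your real client call.

def chunked(records, size):
    """Yield successive batches of `size` records, in order."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

def load_all(records, submit_batch, batch_size=1000):
    """Submit every record in order, one batch per transaction."""
    submitted = 0
    for batch in chunked(records, batch_size):
        submit_batch(batch)  # one transaction per batch keeps a single serial tx order
        submitted += len(batch)
    return submitted
```

Batch sizes in the hundreds-to-thousands range are a reasonable starting point to tune against your hardware; the key property is that batches are submitted strictly in sequence.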

How do I handle a query result with possibly millions of records? Some kind of laziness and partitioning seems required here. What I’m likely to do is ask only for the IDs filtered by two criteria, and then spit out SQL update files.

See again the open-q link I shared on the other thread :slight_smile:
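The laziness/partitioning idea above is exactly what a streaming query handle gives you: consume the results in fixed-size chunks so millions of rows are never realized in memory at once. A hedged sketch in Python, where `results` stands in for any lazy iterator/cursor returned by your client (analogous to the lazy sequence XT's `open-q` yields):

```python
import itertools

# Sketch: consume a lazy result iterator in fixed-size chunks so the
# full result set never has to be realized in memory. `results` can be
# any iterator, e.g. a streaming cursor from a database client.

def in_chunks(results, size):
    """Lazily yield lists of up to `size` items from any iterator."""
    it = iter(results)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk
```

Usage would look like `for chunk in in_chunks(cursor, 10_000): process(chunk)` - each chunk can be processed and discarded before the next one is pulled.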

Big picture: I’m loading XTDB with all history, then loading a few SQL instances with the current state of relevant, filtered records (each one having its own filtered view of current state).
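The "spit out SQL update files" step could be as simple as streaming the filtered IDs straight to a script file, one statement per record, so nothing large is held in memory. A sketch under assumed, hypothetical table/column names (`records`, `active`):

```python
# Sketch: stream filtered IDs into a SQL update script. The table and
# column names here are hypothetical placeholders, not from the thread.

def write_update_script(ids, path, table="records", column="active"):
    """Write one UPDATE per id; return the number of statements written."""
    count = 0
    with open(path, "w") as f:
        for record_id in ids:
            # Naive quoting for illustration only; in practice prefer
            # parameterized statements or your target DB's bulk format.
            f.write(f"UPDATE {table} SET {column} = TRUE "
                    f"WHERE id = '{record_id}';\n")
            count += 1
    return count
```

Because `ids` is just an iterator, this composes directly with chunked/lazy query consumption: pull IDs from the query, write them out, and never buffer the full result set.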

Any additional advice deeply appreciated!

Good to know. I think you’re on the right track, but I’ll have a think about whether there are any useful existing examples to consider. Let me know if I can help accelerate or unblock your evaluation somehow; I’d be very happy to get on a call sometime soon if that’s of interest.