Hey there.
I am starting a research project on temporal databases for my master’s thesis, specifically on the evaluation and comparison of current temporal database solutions.
The workload in question is expected to emulate highly connected customer data, and so capabilities such as triple indexing and storing documents/objects are valuable.
With that in mind, I was first interested in XTDB (1.0), as I noticed it supports a mix of graph and document database features. However, as I understand it, XTDB 2.0 is faster and more scalable (thanks to the way its components are decoupled).
Which version of XTDB seems better suited to my research?
Hey @DinisSousa, sorry to keep you waiting on a response! Naturally, I think v2 is the best place to start, as it offers the best foundations for any future optimisations and indexes. The fact that it uses Apache Arrow means we could incorporate modern research ideas much more feasibly, e.g.
- GraphAr: An Efficient Storage Scheme for Graph Data in Data Lakes (2024)
- Fast Access to Columnar, Hierarchically Nested Data via Code Transformation (2017)
The main caveat with v2 currently is that it doesn’t support recursive CTEs (native fixpoint evaluation), but there are workarounds, per this discussion: XTDB's Sweet Spot - #6 by alex_mandel
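As a rough illustration of what a fixpoint workaround can look like in general (this is not necessarily the approach from the linked discussion), the recursion can be driven from application code: re-run a plain, non-recursive one-hop query until no new rows turn up. In the Python sketch below, `fetch_neighbours` is a hypothetical callback standing in for that one-hop query, and `EDGES` / `one_hop` are an in-memory stand-in for the database, purely for demonstration.

```python
from typing import Callable, Hashable, Iterable, Set


def reachable(
    start: Hashable,
    fetch_neighbours: Callable[[Set[Hashable]], Iterable[Hashable]],
) -> Set[Hashable]:
    """Client-side fixpoint: expand only the newly found frontier each round."""
    seen = {start}
    frontier = {start}
    while frontier:
        # One non-recursive query per round (hypothetical callback; in practice
        # this would be a single-hop query against the database).
        found = set(fetch_neighbours(frontier))
        # Keep only nodes we have not visited yet, so cycles terminate.
        frontier = found - seen
        seen |= frontier
    return seen


# In-memory stand-in for the database, for illustration only.
EDGES = {"alice": ["bob"], "bob": ["carol", "alice"], "carol": []}


def one_hop(ids: Set[Hashable]) -> Iterable[Hashable]:
    """Return every node directly connected from any node in `ids`."""
    return [dst for src in ids for dst in EDGES.get(src, [])]


print(sorted(reachable("alice", one_hop)))  # ['alice', 'bob', 'carol']
```

The trade-off is one round trip per level of depth, so this suits shallow or moderately deep graphs better than very long chains.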
Please keep us posted on your research - I would love to hear more!
P.S. Unrelated to XTDB (other than that it validates the premise of bitemporal modelling), but Graphiti/Zep might be interesting if you’re looking at ML use cases: [2501.13956] Zep: A Temporal Knowledge Graph Architecture for Agent Memory
Unless you’re heavily invested in Clojure, I’d recommend beginning with 2.0. The learning curve is significantly gentler.