XTDB's Sweet Spot

XT currently does provide something that few, if any, other databases do: a sweet spot between a document database, bitemporality, and graph traversal.

@xlfe I’m forking this bit from the SSaC thread (hope that’s okay). I’m curious which aspects of XT’s design and behaviour are most important to you? We often run into folks who are happy to have a sane document store, true bitemporal queries, or graph traversal … but it’s rare to hear someone champion all three at once. :slight_smile:

Put another way: if you weren’t using XTDB, is there a contender that closely approximates the features you need most?

This is an open question to the crowd, obviously. We’d love to know what you want/need out of a database in 2022.

We migrated from datomic, and at first the document vs entity differences seemed significant, but in hindsight any loss of fidelity from not having a (very strong) schema has been easily outweighed by the fact that xtdb is open source, and therefore inspectable and extendable.

I have spent much time browsing the source to understand how xtdb works, and am using two modules of my own (vector search using ngt as a secondary index and the avisi datastore module). None of that is possible in, e.g., datomic, or it would be more work to achieve with higher fragility (i.e. bolting on separate systems).

Aside from that, which features are most important? Hard to say, but off the top of my head these three really stand out:

  1. (almost) covering indexes - important for arbitrary graph traversal
  2. the ability to perform pulls on documents (we don’t use much datalog but make extensive use of pull expressions)
  3. to a lesser extent, immutability / bitemporality (nice to have but we could probably get away without it for our use case).

If xtdb didn’t exist, well, we’d probably be on datomic or postgres! But neither of these approximates both the features xtdb offers and its openness/extensibility. Datomic has the best coverage of features (but is not open or extensible) and postgres, while open, lacks much of what makes datomic and xtdb great (clojure!).

Sorry to revive this old thread, but we actually use ArangoDB as a bitemporal graph database, utilising an approach suggested by them: Time Traveling with Graph Databases: Insights. We are currently looking at xtdb as an alternative because ArangoDB has many problems when used in bitemporal mode.

Hey @alex_mandel thanks for chiming in and adding the data point. How would you describe your domain? e.g. Are you doing network analysis over large amounts of connected data? Or are you working more with a lot of semi-structured data in a more CRUD-like use case?

We are working with a lot of semi-structured data which is highly connected. We want to be able to do network analysis at some point, as we are working primarily with supply chain data. A good example would be being able to traverse supplier and sku relationships. For example, we have documents and updates that can link to every node in our graph; sometimes we want to traverse them and get all suppliers, skus and documents that are mentioned and updated by documents and updates.

Thanks, that’s interesting, and it sounds like XTDB v2 may be a good fit for the domain. Let us know how we can help :slightly_smiling_face:

To be able to convince my other teammates to try out XTDB I need two things. First, I need to show some demos of how you can do simple traversals like bfs or dfs, nothing fancy. Second, I need to prove to them that xtdb won’t die in the next 6-12 months :D. This might sound a bit strange, but as a startup of 3 people we don’t have the resources to switch databases often, so we would like to stick with one solution for the foreseeable future. Also, is this a good place to discuss this or should I start a new thread?

Here is okay still - probably a good place to discuss the desire/need for graph functionality more generally anyway :slightly_smiling_face:

I’m not fully in context to reason about it right now, but I believe breadth-first search (BFS) more or less respects how SQL executes anyway, and without recursive CTEs it can generally be achieved using multiple explicit CTEs (one per depth layer), e.g. play
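To make the "one CTE per depth layer" idea concrete, here is a minimal sketch. It runs against an in-memory SQLite database via Python's standard library purely as a stand-in, not against XTDB, and the `edges(src, dst)` table and the node names are hypothetical. The query shape (a fixed chain of non-recursive CTEs, each joining the previous layer back onto the edge table) is the part that should carry over to any SQL engine that supports plain CTEs.

```python
import sqlite3

# SQLite is only a stand-in here; the table and node names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [
        ("supplier-a", "sku-1"),
        ("supplier-a", "sku-2"),
        ("sku-1", "doc-9"),
        ("doc-9", "update-3"),
    ],
)

# One explicit CTE per depth layer: depthN is everything one hop from depthN-1.
BFS_SQL = """
WITH depth1 AS (SELECT dst AS node FROM edges WHERE src = :start),
     depth2 AS (SELECT e.dst AS node FROM edges e JOIN depth1 d ON e.src = d.node),
     depth3 AS (SELECT e.dst AS node FROM edges e JOIN depth2 d ON e.src = d.node)
SELECT 1 AS depth, node FROM depth1
UNION SELECT 2, node FROM depth2
UNION SELECT 3, node FROM depth3
ORDER BY depth, node
"""

rows = conn.execute(BFS_SQL, {"start": "supplier-a"}).fetchall()
for depth, node in rows:
    print(depth, node)
```

The obvious limitation is that the maximum depth is fixed when the query is written, and a node reachable by paths of different lengths shows up once per depth; deduplicating to the shallowest depth would need an extra `GROUP BY node` with `MIN(depth)`.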

For now, though, I think depth-first search (DFS) using v2 would be blocked on the lack of recursive CTE support: Support CTEs, §7.13 · Issue #2087 · xtdb/xtdb · GitHub (both internally and in the SQL grammar). You can of course run multiple queries instead, and system_time at least gives you a stable/consistent basis for doing so (even if it’s not as efficient as pushing it down into the database properly).
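As a rough illustration of the "run multiple queries" fallback, the sketch below drives a depth-first traversal from the client, issuing one small query per visited node. Again this uses SQLite from Python's standard library as a stand-in, and the `edges(src, dst)` table and the `dfs` helper are hypothetical. Against XTDB you would additionally want every query in the loop to read at the same system_time so the traversal sees one consistent snapshot; that part is not modelled here.

```python
import sqlite3


def dfs(conn, start):
    """Client-side depth-first traversal: one query per expanded node."""
    visited, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # already expanded; this also guards against cycles
        visited.add(node)
        order.append(node)
        children = [
            row[0]
            for row in conn.execute(
                "SELECT dst FROM edges WHERE src = ? ORDER BY dst", (node,)
            )
        ]
        # Push in reverse so the smallest child is popped (visited) first.
        stack.extend(reversed(children))
    return order


# Hypothetical data, including a cycle (d -> a).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "a")],
)

visit_order = dfs(conn, "a")
print(visit_order)
```

The cost is one round trip per node, which is the inefficiency mentioned above compared with pushing the traversal down into the database.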

The example works fine. In the case where you have recursive CTEs, would you be able to have a single table which stores all relationships (from/to) for multiple tables and query that, a bit like in datalog where everything is in the same table and you can just traverse it?

I am asking this because of how we are structuring things currently; having something like that would make migration easier for us.

Yes exactly, you can model a lot of things with a single table :slightly_smiling_face: And for your use case, do you consider these to be analytical / long-running/async queries?
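For readers wondering what the single relationship table plus a recursive CTE could look like, here is a small sketch. It is written against SQLite (which does support `WITH RECURSIVE`) through Python's standard library, since recursive CTEs were not yet available in XTDB v2 at the time of this thread. The `relationships(from_id, to_id, kind)` table and all ids are hypothetical.

```python
import sqlite3

# One table holds every from/to link, whatever kinds of entity it connects.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relationships (from_id TEXT, to_id TEXT, kind TEXT)")
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?)",
    [
        ("update-7", "doc-1", "updates"),
        ("doc-1", "supplier-a", "mentions"),
        ("doc-1", "sku-1", "mentions"),
        ("supplier-a", "sku-2", "supplies"),
    ],
)

# UNION (rather than UNION ALL) drops duplicates, so cycles terminate.
REACHABLE_SQL = """
WITH RECURSIVE reachable(node) AS (
    SELECT :start
    UNION
    SELECT r.to_id
    FROM relationships r
    JOIN reachable ON r.from_id = reachable.node
)
SELECT node FROM reachable WHERE node <> :start ORDER BY node
"""

reached = [row[0] for row in conn.execute(REACHABLE_SQL, {"start": "update-7"})]
print(reached)
```

Starting from one update, this collects every document, supplier and sku reachable through any chain of relationships, which is the kind of traversal described for the supply chain use case earlier in the thread.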