Is XTDB similar to the Dolt database?

Dolt is pitched as Git for data, or a versioned MySQL.

Since XTDB also focuses on SQL versioning, or bitemporality, I was wondering how they compare.

I guess Dolt has special features like rollbacks and merging of databases, whereas XTDB more generally lets you query data across time.

Since you guys have more experience with these terms and nuances I figured a comparison was worth asking about. Thanks.

Hey @ac1 - great question - there’s definitely a decent amount of overlap in vision between the projects :slight_smile:

Did you read “So you want a Temporal Database?” by the Dolt team already? It’s pretty good! It doesn’t reflect what we’re doing currently with XTDB v2 (full SQL support, cross-time querying, Apache Arrow etc.), but is broadly accurate.

My main area of disagreement is around this statement:

Note, valid time and decision time is a way to model data and is not generally applicable to all data

I don’t have strong opinions about ‘decision time’ (have yet to feel the need for it), but valid time is hugely valuable, and should be applicable to all business data. Like it should really only be an option to disregard valid time IFF your data has ~no correlation with real-world / human-scale notions of time, and you will never have a need to track changes to it.

Dolt’s DVCS-inspired notions of branching and diffing data sets are definitely cool, but I’m not sure how they really correspond with how businesses typically move data around today, at least not internally within the context of a single business.

In contrast, I think the lack of recognition & adoption of valid time (and also system-time immutability) is a massive source of complexity observable across every business in existence. It is particularly obvious wherever regulatory compliance requirements crop up.
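
To make the valid time vs. system time distinction concrete, here is a minimal Python sketch of a bitemporal lookup. It is not how XTDB is implemented: `BitemporalTable`, `Version` and the price example are hypothetical names made up for illustration, and it assumes the simplest possible rule, namely that the most recently recorded version covering a valid-time instant wins.

```python
from dataclasses import dataclass
from datetime import datetime

FOREVER = datetime.max  # stand-in for "no known end of validity"


@dataclass(frozen=True)
class Version:
    """One immutable fact: a value, when it was true, and when it was recorded."""
    key: str
    value: str
    valid_from: datetime   # when the fact starts being true in the real world
    valid_to: datetime     # when it stops being true (exclusive)
    system_from: datetime  # when the database recorded this version


class BitemporalTable:
    """Append-only table (hypothetical): corrections add versions, nothing is overwritten."""

    def __init__(self):
        self._versions = []

    def put(self, key, value, valid_from, system_time, valid_to=FOREVER):
        self._versions.append(Version(key, value, valid_from, valid_to, system_time))

    def get(self, key, valid_at, system_at):
        """What did we believe at `system_at` about the state at `valid_at`?"""
        candidates = [
            v for v in self._versions
            if v.key == key
            and v.system_from <= system_at
            and v.valid_from <= valid_at < v.valid_to
        ]
        if not candidates:
            return None
        # Assumption: the most recently recorded matching version wins.
        return max(candidates, key=lambda v: v.system_from).value


table = BitemporalTable()
# Recorded on 10 Jan: the price has been 100 since 1 Jan.
table.put("sku-1", "100", valid_from=datetime(2024, 1, 1),
          system_time=datetime(2024, 1, 10))
# Recorded on 5 Feb: a late correction, the price actually changed on 15 Jan.
table.put("sku-1", "120", valid_from=datetime(2024, 1, 15),
          system_time=datetime(2024, 2, 5))

# Same real-world day (20 Jan), asked at two different system times.
before_correction = table.get("sku-1", valid_at=datetime(2024, 1, 20),
                              system_at=datetime(2024, 1, 31))
after_correction = table.get("sku-1", valid_at=datetime(2024, 1, 20),
                             system_at=datetime(2024, 2, 10))
```

Both reads ask about the same real-world day but at different system times, which is roughly the "what did you know, and when did you know it" question that compliance work tends to raise.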

Regardless though I wish the Dolt team every success, and undoubtedly many of the ideas they’ve been pushing forward around multi-master data reconciliation/replication and forking CI builds would be great to add to XT someday.

You might also enjoy Kent Beck’s explanation of bitemporality: Eventual Business Consistency - by Kent Beck

Thanks for the commentary and links!

Dolt’s DVCS-inspired notions of branching and diffing data sets are definitely cool, but I’m not sure how they really correspond with how businesses typically move data around today, at least not internally within the context of a single business.

I was thinking this would be useful for content-focused products like a CMS or a knowledge base. But I’ve yet to actually try those features out.

Whereas XTDB’s handling of bitemporality would be more useful where reports are required, such as ecommerce or finance.

As it happens I am very partial to CMS and knowledge base like use-cases - that’s largely what inspired my passion for databases to begin with :sweat_smile:

Multi-master/p2p replication is an awesome ideal, and local-first UIs for knowledge work are arguably essential - my database rabbit hole started with CouchDB and “CouchApps”(!)

However, bringing those concepts to life within mission critical business systems has generally always been non-trivial, risky and expensive vs. the centralized, non-distributed options.

Perhaps the two biggest practical barriers have been the assumptions that SQL databases have historically baked in around (1) mutability and (2) schema management. Dolt and XTDB both try to address those same barriers in very different ways, in terms of UX and also under-the-hood.

My own personal hope is that the (arguably simpler) data model in XTDB can facilitate creating similarly powerful CMS and knowledge base like systems atop - but right now that would mean implementing all the replication/diffing logic in userspace.

I’ve been thinking about this more, since knowledge/content management is a space I’m interested in.

Being able to use first-class Clojure support with XTDB makes the idea of building such facilities on top worth considering (as opposed to using Dolt’s instead).

I wonder if there’s facilities XTDB lacks (and could provide) that would make that more feasible? like database forking, etc. and leave merging/review to user space. WDYT

Asking this early in case I decide to use it for such a use-case later, ha. :slight_smile:

I wonder if there’s facilities XTDB lacks (and could provide) that would make that more feasible? like database forking, etc.

Definitely forking. Doing it well probably necessitates some sort of labelling/tagging + instance ID + hashing scheme. The biggest technical hurdle is likely around making structural sharing across long-lived forks efficient - although that’s potentially not a major tax / requirement if the diffs are only ever modest (MBs not GBs).

In v1 we have the with-tx API, which we do hope to recreate at some point, but real forking means making that persistent/long-lived. However maybe with-tx could be sufficient by itself for various use-cases (e.g. modest branching in CI for PRs, small diffs).
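
For anyone unfamiliar with the idea, the difference between a speculative, non-persistent fork and "real" forking can be sketched in a few lines of Python. This is only an illustration of the concept, not the with-tx API or XTDB internals: `Snapshot`, `SpeculativeFork` and the page documents are hypothetical names, and the fork here lives purely in memory.

```python
class Snapshot:
    """An immutable view of a database at some point: key -> document."""

    def __init__(self, docs):
        self._docs = dict(docs)

    def get(self, key):
        return self._docs.get(key)


_DELETED = object()  # tombstone marking a key as deleted within the fork


class SpeculativeFork:
    """Hypothetical in-memory fork: writes land in an overlay, the base is untouched."""

    def __init__(self, base):
        self._base = base
        self._overlay = {}

    def put(self, key, doc):
        self._overlay[key] = doc

    def delete(self, key):
        self._overlay[key] = _DELETED

    def get(self, key):
        # Reads prefer the overlay and fall back to the shared base.
        if key in self._overlay:
            doc = self._overlay[key]
            return None if doc is _DELETED else doc
        return self._base.get(key)

    def changed_keys(self):
        # The overlay doubles as the (modest) diff against the base.
        return set(self._overlay)


base = Snapshot({"page-1": {"title": "Home"}, "page-2": {"title": "About"}})
fork = SpeculativeFork(base)
fork.put("page-1", {"title": "Welcome"})
fork.delete("page-2")
```

Because the overlay only holds what changed, the cost stays proportional to the diff, which is why this style could plausibly cover small branches (e.g. per-PR CI runs). Making it long-lived means persisting and identifying that overlay, which is where the labelling and hashing concerns come in.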

leave merging/review to user space

XTDB could help here with FDWs/multiple-DB/federated-DB queries, which would avoid pulling data into userspace to handle diffing and merging.

Incidentally I was just reading through Diffing and patching tabular data - Open Knowledge Labs - prompted by Jonathan Edwards’ latest demo vid: Experiment visualizing structure diffs - Alarming Development

Also, you may want to keep an eye on Brimm: graph backend with a Notion-inspired UI (by Filip Juruković), Tue, Mar 12, 2024, 6:30 PM | Meetup …if it wasn’t on your radar already :slight_smile: (built on v1)
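
As a rough idea of what the userspace version of diffing and patching looks like, here is a minimal Python sketch of a row-level diff keyed by primary key. It is not taken from either of those links and ignores schema changes and row ordering; `diff_rows`, `apply_patch` and the sample tables are hypothetical names for illustration.

```python
def diff_rows(old, new):
    """Compare two tables given as {primary_key: row_dict}."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def apply_patch(table, patch):
    """Return a new table with the patch applied; the input is left untouched."""
    result = dict(table)
    for k in patch["removed"]:
        result.pop(k, None)
    for k, row in patch["added"].items():
        result[k] = row
    for k, (_old_row, new_row) in patch["changed"].items():
        result[k] = new_row
    return result


main = {1: {"title": "Home"}, 2: {"title": "About"}}
branch = {1: {"title": "Welcome"}, 3: {"title": "Contact"}}

patch = diff_rows(main, branch)
merged = apply_patch(main, patch)
```

This only covers the trivial fast-forward case. Anything resembling a three-way merge or conflict review needs a common ancestor and a policy for conflicting edits, which is the part that gets complicated quickly.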

Yeah, I realize I’m underestimating complexity after reading that! Maybe it’s good we have such focused databases for different use-cases. :slight_smile:

Will be interesting to see XTDB evolve, and who knows, maybe one day it will also offer these features.
