Accidental bi-temporal correction

Hello,

I implemented a test where I run 10 threads at the same time.

Each of those threads submits a call to an xt/fn of mine.

This transaction function does an ::xt/put of a document with a constant xt/id (so the same document is ::xt/put 10 times).

Most of the 10 transactions happen in the same millisecond (I figured that out by inspecting the transaction time).

I don’t specify a valid time, so transaction time is used as valid time.

When I open-tx-log I can see my 10 transactions.

When I ask for entity-history I can only see 2 or 3 versions of the document.

When I ask with entity-history :with-corrections? true I can see the 10 versions.

I suspect that, because the transactions happened in the same ms, xtdb handled them as temporal corrections. But that was not my intention as a user of the library.
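To make my reading of it concrete, here is a toy model in Python. It is not XTDB code and the names (`Version`, `entity_history`, the sample timestamps) are made up for illustration; it only assumes that, without corrections, the history keeps the latest transaction for each distinct valid time.

```python
from collections import namedtuple

# One entry per ::xt/put: the valid time (ms), the transaction order, the document.
Version = namedtuple("Version", "valid_time_ms tx_id doc")


def entity_history(versions, with_corrections=False):
    """Toy history: sorted by valid time, then by transaction order."""
    ordered = sorted(versions, key=lambda v: (v.valid_time_ms, v.tx_id))
    if with_corrections:
        return ordered
    latest = {}
    for v in ordered:
        # A later transaction at the same valid time replaces the earlier one,
        # i.e. it is treated as a correction.
        latest[v.valid_time_ms] = v
    return [latest[vt] for vt in sorted(latest)]


# Ten puts of the same document that landed on only three distinct milliseconds.
valid_times = [1000, 1000, 1000, 1000, 1001, 1001, 1001, 1002, 1002, 1002]
versions = [Version(vt, tx, {"xt/id": "doc", "n": tx})
            for tx, vt in enumerate(valid_times)]

visible = entity_history(versions)
everything = entity_history(versions, with_corrections=True)
```

In this model `visible` has 3 entries and `everything` has 10, which matches what I observe.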

Is this the intended behavior? Should I report an issue?

Suggestion:

  • increase the time precision to nanosecond
  • consider using strictly < to check if a date is in the past (equal time must be considered in the future)

I can live without any fix, this is an extreme test case :slight_smile:

Hey @jprudent, thanks for the detailed write-up! I can confirm that this is the expected behaviour, given XT’s native millisecond resolution for valid time (internally represented by 64-bit UTC Dates/Longs).

Suggestion:

  • increase the time precision to nanosecond

We already have a related issue open for this: “Microsecond resolution for valid-time?” (Issue #895, xtdb/xtdb on GitHub), although it covers microseconds, not nanoseconds. The workaround suggested in my response there may apply to your requirements too: you could coerce and re-scale the valid-time dimension as needed in userspace.
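As a rough sketch of what such re-scaling could look like (Python for illustration only; `to_scaled_valid_time` and `from_scaled_valid_time` are hypothetical helpers, not part of any XT API), the idea is to treat the 64-bit valid-time value as a count of microseconds instead of milliseconds and to convert at the application boundary:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_scaled_valid_time(dt):
    """Encode a timezone-aware datetime as whole microseconds since the epoch."""
    delta = dt - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def from_scaled_valid_time(value):
    """Decode a value produced by to_scaled_valid_time back into a datetime."""
    return EPOCH + timedelta(microseconds=value)


# Two instants inside the same millisecond, 500 microseconds apart.
a = datetime(2022, 1, 1, 12, 0, 0, 123_400, tzinfo=timezone.utc)
b = datetime(2022, 1, 1, 12, 0, 0, 123_900, tzinfo=timezone.utc)

same_millisecond = to_scaled_valid_time(a) // 1000 == to_scaled_valid_time(b) // 1000
distinct_when_scaled = to_scaled_valid_time(a) != to_scaled_valid_time(b)
```

The trade-off is that the stored values no longer read as real dates, so every read and write has to go through the same conversion, and an explicit valid time has to be supplied on each put rather than relying on the default.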

  • consider using strictly < to check if a date is in the past (equal time must be considered in the future)

This kind of change would be fairly major, so it is unlikely to be feasible. Beyond that, given that we rely on “equal time” to express corrections, I’m not sure how corrections could be expressed or deduced in this alternative model. Perhaps there could be an internal correction-id (an incrementing integer) and an is-correction? option/arity on the API(s) :thinking:

Hope that helps. In the meantime, I will have a think about how to document the existing behaviour more explicitly! :slight_smile:

Hello @refset, thanks for your response and the links!

A quick clarification, for the sake of the discussion:

Technicalities aside, I perceive time as continuous, not discrete, so two distinct transactions can never happen at the same time unless they are the same transaction, just as two real numbers are distinct from each other unless they are the same. In my test case (which is not backed by a business need), millisecond precision is too coarse to approximate the continuity of time. Microseconds would practically be OK for my case (not much room for “time collisions”), but there will always be someone to point out the weakness of the time model.

I like the HLC solution you propose because in my case it’s not about having an exact time, but rather a notion of “happens-before”. So we get an arbitrary time precision (ms) WITH ordering.
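For readers following along, here is how I picture it, assuming HLC means a hybrid logical clock: a minimal single-process sketch in Python, where the class name and the fake clock readings are mine and nothing here is XTDB code. Each timestamp is a pair (physical ms, counter), and the counter breaks ties when the physical clock has not advanced.

```python
class HybridLogicalClock:
    """Issues (physical_ms, counter) pairs that always increase."""

    def __init__(self, physical_clock_ms):
        self._now = physical_clock_ms  # callable returning the current time in ms
        self.l = 0  # largest physical time seen so far
        self.c = 0  # logical counter within that physical time

    def tick(self):
        pt = self._now()
        if pt > self.l:
            # The wall clock moved forward: adopt it and reset the counter.
            self.l, self.c = pt, 0
        else:
            # Same millisecond (or the clock went backwards): bump the counter.
            self.c += 1
        return (self.l, self.c)


# Fake clock: three events in the same ms, one later, one with a clock that regressed.
readings = iter([1000, 1000, 1000, 1001, 999])
clock = HybridLogicalClock(lambda: next(readings))
stamps = [clock.tick() for _ in range(5)]
```

Comparing the pairs as tuples gives a strict order even for events in the same millisecond, which is all I need for “happens-before”.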

I don’t think the API should expose an is-correction? option. This seems like a weak fix to the time model.

Thanks
