Recommended approach for ordering docs by transaction?

I’m trying to use XT as a ledger / event log and I’m wondering if there’s a recommended way to lazily consume the log in insertion order.

Reading some issues and other threads I have this:

;; See https://github.com/xtdb/xtdb/issues/267#issuecomment-944260371
(require '[xtdb.api :as xt])

(defn get-created-inst
  "Get 'creation time' of a record according to its earliest valid time."
  [db eid]
  (with-open [h (xt/open-entity-history db eid :asc)]
    (-> (iterator-seq h) first ::xt/valid-time)))

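(let [query '{:find [?e ?created-at]
              :where [[?e :xt/id]
                      [(my.xtdb/get-created-inst $ ?e) ?created-at]]
              ;; Note: :order-by typically requires the full result set to be
              ;; realised before the first tuple is returned.
              :order-by [[?created-at :asc]]}]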
  (with-open [cursor (xt/open-q (xt/db node) query)]
    (doseq [tuple (iterator-seq cursor)]
      (println "tuple:" tuple)))) ;; Consume or process each tuple here

Is there a recommended approach for this type of usage? Reading this post made me think there might be a more direct solution when using XT as an immutable ledger.

Hi @ian_sinn, have you seen the open-tx-log API? Depending on what exactly you want to do with it, you might need some low-level transformations though, as per xtdb/xtdb issue #1343 on GitHub ("`open-tx-log` and bus tx events only contain `#crux/id`s").

I wouldn’t recommend attempting to use open-tx-log inside of a query though, since it is not a locally cached lookup (each call hits the backend tx-log directly).

Note that open-entity-history can also be used independently, outside of a query, in case that suits you better.
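
As a rough sketch of that standalone usage, assuming XTDB 1.x with `xtdb.api` on the classpath and a started `node`: the helper name `history-valid-times` and the entity id `:event-1` are made up for illustration.

```clojure
(require '[xtdb.api :as xt])

(defn history-valid-times
  "Returns the valid times of every version of `eid`, oldest first."
  [db eid]
  ;; The cursor holds resources, so realise what is needed before it closes.
  (with-open [h (xt/open-entity-history db eid :asc)]
    (mapv ::xt/valid-time (iterator-seq h))))

(comment
  ;; `node` is assumed to be an already started XTDB node.
  (history-valid-times (xt/db node) :event-1))
```

Because the helper returns a fully realised vector, nothing lazy escapes the `with-open`; for very long histories you would process entries inside the `with-open` body instead.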

Great, thanks for the links @refset!

My hope was to lazily consume a (potentially very long) sequence of documents in the order they were written.

I initially passed over open-tx-log because I only wanted documents of a certain “type” (such as [?e :event/type :write-log]) and I assumed open-tx-log would provide all transactions for all documents. But I’ll read through that issue you sent. Thanks for the help.

So, unless I’m mistaken a viable approach seems to be:

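;; The third argument (with-ops? = true) includes the tx ops in each entry.
;; meet-criteria? is a user-supplied predicate over a single tx op.
(with-open [tx-log (xt/open-tx-log node nil true)]
  (doseq [tx (iterator-seq tx-log)
          :let [ops (::xt/tx-ops tx)]
          :when (some meet-criteria? ops)]
    (println ops)))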

Which would consume the log from start to finish and only act if some of the operations pass a predicate.
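
For illustration, a predicate in the role of `meet-criteria?` could be sketched as below. The names `write-log-put?` and `write-log-docs` are hypothetical, and the sketch assumes each op arrives as a vector such as `[:xtdb.api/put {...doc...}]`, optionally followed by valid-time arguments, which is the shape expected when `with-ops?` is true.

```clojure
;; Sketch only: `write-log-put?` is a hypothetical stand-in for `meet-criteria?`.
;; Fully qualified keywords are used so no namespace alias is required.
(defn write-log-put?
  "True when `op` is a put of a document whose :event/type is :write-log."
  [[op doc]]
  (and (= op :xtdb.api/put)
       (map? doc)
       (= :write-log (:event/type doc))))

(defn write-log-docs
  "Returns the :write-log documents put by a single tx-log entry `tx`."
  [tx]
  (->> (:xtdb.api/tx-ops tx)
       (filter write-log-put?)
       (map second)))

(comment
  ;; Assumes `xt` aliases xtdb.api and `node` is a started node.
  (with-open [tx-log (xt/open-tx-log node nil true)]
    (doseq [tx  (iterator-seq tx-log)
            doc (write-log-docs tx)]
      (println (:xtdb.api/tx-id tx) doc))))
```

Pulling out the matching documents per transaction, rather than printing every op of a matching transaction, keeps unrelated ops from the same transaction out of the output.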

Just to confirm, is the transaction log guaranteed to be ordered from oldest to newest?

A second question: Since this is the tx-log and not the document store, is it possible to run into transactions for documents that have not yet been persisted?

Thanks again for the help.

To clarify (even though I see you already have this figured out based on your follow-up!): that is indeed the case for open-tx-log. It returns every transaction, so you would still need to process & filter it for the documents / attribute-values of interest.

> unless I’m mistaken a viable approach seems to be […]

Yep, that looks about right :+1:

> is the transaction log guaranteed to be ordered from oldest to newest?

Only with regard to transaction time, which is what I gathered you were looking for based on your description of wanting the “log in insertion order”. Valid time ordering is available at a per-entity level (not globally) via entity-history (or open-entity-history).

> is it possible to run into transactions for documents that have not yet been persisted?

Assuming you have a properly configured document store (i.e. which actually writes to disk / something with strong durability guarantees), then no, because writes to the tx-log only occur after all the writes to the doc-store return successfully.

> Thanks again for the help.

:pray: