Sadly, I wasn’t able to see the fiddles you linked to
Weird, I ran into the same issue with those links as well…
Hopefully this SQL fiddle works. I was running into issues using “Save as URL” with XTQL examples; the resulting query kept getting munged.
Therefore, the transactions + queries for Clojure could look like:
(def document
  {:xt/id 1
   :name "Pendleton Electric Bicycle"
   :price 340})

(doseq [d (repeat 3 document)]
  ;; If the document "d" already had the :hash member
  ;; it'd need to be filtered out
  (let [hash (hash-unordered-coll d)]
    (xt/submit-tx node
      [[:assert-not-exists
        '(from :products [{:xt/id $id, :hash $hash}])
        {:id (:xt/id d), :hash hash}]
       [:put-docs :products (assoc d :hash hash)]])))

(xt/q node
  '(-> (from :products
             {:bind [xt/id xt/valid-from name price hash]
              :for-valid-time :all-time})
       (order-by xt/valid-from)))
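To make the "filter out :hash" comment concrete, here is a minimal sketch in plain Clojure (no node required). The `doc-hash` helper is a hypothetical name of my own, not part of XTDB; it assumes the stored hash lives under the `:hash` key, as in the example above.

```clojure
;; Hypothetical helper (not an XTDB function): hash a document while
;; ignoring any previously stored :hash key, so that re-hashing a doc
;; read back from the table gives the same value as hashing the original.
(defn doc-hash [doc]
  (hash-unordered-coll (dissoc doc :hash)))

(def document
  {:xt/id 1
   :name "Pendleton Electric Bicycle"
   :price 340})

;; Key order does not affect the hash of a map.
(= (doc-hash document)
   (doc-hash {:price 340, :name "Pendleton Electric Bicycle", :xt/id 1}))
;; => true

;; A doc that already carries :hash hashes the same as the bare doc.
(= (doc-hash document)
   (doc-hash (assoc document :hash (doc-hash document))))
;; => true
```

With something like this, the `let` in the `doseq` could bind `hash` to `(doc-hash d)` instead of calling `hash-unordered-coll` directly.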
Checking my assumptions: you’re talking about a Clojure transaction function?
Yes, that is what I was intending to point toward.
Not sure if this is what you meant, but I might kick the tires on a function that accepts a full doc, then inserts (doc doesn’t exist), updates (hash-unordered-coll doesn’t match), or returns false (hash-unordered-coll matches).
That seems like it would obviate the need for a dedicated “hash” data member?
I think so! Something along the lines of:
[:put-fn :put-when-distinct
 '(fn [x]
    (let [d (first (q (from :products [{:xt/id $id} *])
                      {:args {:id (:xt/id x)}}))
          d' (or d {})]
      (if-not (= (hash-unordered-coll d') (hash-unordered-coll x))
        [:put-docs :products x]
        false)))]

[:call :put-when-distinct {:xt/id 1, :name "..."}]
Haven’t run the above, but it seems like it gets the example across
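The insert / update / no-op decision itself can be tried out without a node. Below is a rough sketch in plain Clojure; `put-decision` is a hypothetical name, and `existing` stands in for whatever the query inside the transaction function returns (`nil` when nothing is stored under that id). Since `hash-unordered-coll` yields a 32-bit hash, a matching hash is not strict proof of equality, so this sketch treats the hash as a cheap first check and confirms with `=`.

```clojure
;; Hypothetical helper: decide what to do with an incoming doc, given the
;; currently stored doc (nil when nothing is stored under that id).
(defn put-decision [existing incoming]
  (cond
    (nil? existing) :insert
    ;; Cheap hash comparison first; confirm with = to rule out collisions.
    (and (= (hash-unordered-coll existing) (hash-unordered-coll incoming))
         (= existing incoming)) :noop
    :else :update))

(def bike {:xt/id 1, :name "Pendleton Electric Bicycle", :price 340})

(put-decision nil bike)                      ;; => :insert
(put-decision bike (assoc bike :price 300))  ;; => :update
(put-decision bike bike)                     ;; => :noop
```

When both documents are already in hand like this, comparing them with `=` alone would also work; the hash mostly pays off when only the stored hash is fetched rather than the whole document.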
Unless I’m missing something, there are currently no optimizations for such cases, and the full document would be stored every time? (which seems less than ideal!!)
I may be missing something as well, so one of the team members from JUXT may need to hop in here. That said, I am under the assumption that column-oriented formats can be space-efficient, especially in the immutable case. Hypothetically, if the data doesn’t change, my intuition is that only metadata around system/valid time will be created or updated, so no new values need to be stored. I may be wrong though; I have a hunch there is some nuance here given the HTAP messaging. In the case where a new document is created/serialized, computing some sort of hash and updating only when needed seems like a good direction!