Deduplicating open-q queries

ian_sinn · 10 January 2022 02:51

Hey, in reading through the docs I came across the following regarding the use of open-q and was confused:

Note that results are returned as bags, not sets, so you may wish to deduplicate consecutive identical result tuples (e.g. using clojure.core/dedupe or similar).

Under what circumstances will results from open-q be duplicated?
What’s the recommended deduplication strategy?
Why is there a discrepancy between xt/q and xt/open-q beyond how the results are consumed/processed?

refset · 10 January 2022 14:14

Hey @ian_sinn

Under what circumstances will results from open-q be duplicated?

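Whenever several different variable bindings project onto the same :find tuple. For instance, consider this example, where x takes four values but result only two: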

  (require '[xtdb.api :as xt])

  (with-open [n (xt/start-node {})]
    (let [query '{:find [result]
                  :where [[(range 4) [x ...]]
                          [(even? x) result]]}]
      ;; first element: xt/q returns a set; second: open-q yields every tuple
      [(xt/q (xt/db n) query)
       (with-open [i (xt/open-q (xt/db n) query)]
         (into [] (iterator-seq i)))]))
  ;;=> [#{[true] [false]} [[true] [false] [true] [false]]]

What’s the recommended deduplication strategy?

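You can use the usual tool belt of tricks, e.g. (into #{} ...) for simplicity, or perhaps clojure.core/dedupe if you want to work with transducers. Bear in mind that dedupe only removes consecutive duplicates, so for output like the above clojure.core/distinct is the safer choice. The simple version: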

  (with-open [n (xt/start-node {})]
    (let [query '{:find [result]
                  :where [[(range 4) [x ...]]
                          [(even? x) result]]}]
      (with-open [i (xt/open-q (xt/db n) query)]
        (into #{} (iterator-seq i)))))
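
To make the difference between the options concrete, here is a small sketch that needs no XTDB node. The `results` vector is just a stand-in (my assumption) for what `(iterator-seq i)` produced in the example above:

```clojure
;; Stand-in for (iterator-seq i) from the open-q example above
(def results [[true] [false] [true] [false]])

;; dedupe only drops *consecutive* duplicates, so nothing is removed here
(def deduped (into [] (dedupe) results))
;;=> [[true] [false] [true] [false]]

;; distinct drops every repeat and keeps first-seen order
(def distincted (into [] (distinct) results))
;;=> [[true] [false]]

;; pouring into a set also drops every repeat, but gives up ordering
(def as-set (into #{} results))
;; a set containing [true] and [false]
```

So dedupe is only enough when identical tuples are guaranteed to arrive next to each other; otherwise distinct or a set is probably what you want.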

Why is there a discrepancy between xt/q and xt/open-q

The behaviour of xt/q is “correct” in the sense that Datalog always operates (and returns results) in terms of sets. xt/open-q exposes some of the implementation detail of how Datalog queries are executed, but it also enables various opportunities for lazy consumption and implementing complex algorithms efficiently, beyond the context of a single query.
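
As a rough illustration of the lazy-consumption point, here is a sketch that stops reading as soon as it has what it needs. `fake-cursor` is a hypothetical stand-in for the cursor returned by xt/open-q (it is just a java.util.Iterator over lazily computed tuples), so treat this as an analogy rather than XTDB's actual behaviour:

```clojure
;; Hypothetical stand-in for the cursor returned by xt/open-q:
;; an Iterator over a large, lazily computed bag of result tuples.
(defn fake-cursor []
  (.iterator ^Iterable (map (fn [x] [(even? x)]) (range 1000000))))

;; Take the first two distinct tuples; the take transducer ends the
;; reduction early, so the remaining tuples are never realised.
(def first-two
  (into [] (comp (distinct) (take 2)) (iterator-seq (fake-cursor))))
;;=> [[true] [false]]
```

With a real open-q cursor you would wrap the consumption in with-open, as in the examples above, so the cursor is closed once you stop reading.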
