Stack overflow from xtdb.query/compile-sub-query calling itself

I’ve hit a bug that I haven’t yet been able to reduce to a small reproducing example. Some rules, when used in certain queries, cause this stack trace:

(many repetitions of the first two lines omitted)
                 xtdb.query/compile-sub-query      query.clj: 1581
              xtdb.query/compile-sub-query/fn      query.clj: 1582
                 xtdb.query/compile-sub-query      query.clj: 1519
                      xtdb.query/expand-rules      query.clj: 1448
                            clojure.core/into       core.clj: 6962
                       clojure.core/transduce       core.clj: 6947
                  clojure.core.protocols/fn/G  protocols.clj:   13
                    clojure.core.protocols/fn  protocols.clj:   75
            clojure.core.protocols/seq-reduce  protocols.clj:   24
                             clojure.core/seq       core.clj:  139
                                          ...
              xtdb.query/expand-rules/iter/fn      query.clj: 1384
           xtdb.query/expand-rules/iter/fn/fn      query.clj: 1437
                                          ...
   xtdb.query/expand-rules/iter/fn/fn/iter/fn      query.clj: 1425
xtdb.query/expand-rules/iter/fn/fn/iter/fn/fn      query.clj: 1434
                clojure.walk/postwalk-replace       walk.clj:  118
                        clojure.walk/postwalk       walk.clj:   53
                            clojure.walk/walk       walk.clj:   50
                            clojure.core/into       core.clj: 6958
                          clojure.core/reduce       core.clj: 6886
                  clojure.core.protocols/fn/G  protocols.clj:   13
                    clojure.core.protocols/fn  protocols.clj:   75
            clojure.core.protocols/seq-reduce  protocols.clj:   24
                             clojure.core/seq       core.clj:  139
                                          ...
                          clojure.core/map/fn       core.clj: 2770
                      clojure.core/partial/fn       core.clj: 2641
                        clojure.walk/postwalk       walk.clj:   53
                            clojure.walk/walk       walk.clj:   46
                      clojure.core/partial/fn       core.clj: 2641
                        clojure.walk/postwalk       walk.clj:   53
                            clojure.walk/walk       walk.clj:   50
                            clojure.core/into       core.clj: 6958
                          clojure.core/reduce       core.clj: 6886
                  clojure.core.protocols/fn/G  protocols.clj:   13
                    clojure.core.protocols/fn  protocols.clj:   75
            clojure.core.protocols/seq-reduce  protocols.clj:   24
                             clojure.core/seq       core.clj:  139
                                          ...
                          clojure.core/map/fn       core.clj: 2772
                      clojure.core/partial/fn       core.clj: 2641
                        clojure.walk/postwalk       walk.clj:   53
                            clojure.walk/walk       walk.clj:   46
                      clojure.core/partial/fn       core.clj: 2641
                        clojure.walk/postwalk       walk.clj:   53
                            clojure.walk/walk       walk.clj:   50
                            clojure.core/into       core.clj: 6958
                          clojure.core/reduce       core.clj: 6886
                  clojure.core.protocols/fn/G  protocols.clj:   13
                    clojure.core.protocols/fn  protocols.clj:   75
            clojure.core.protocols/seq-reduce  protocols.clj:   24
                             clojure.core/seq       core.clj:  139
                                          ...
                          clojure.core/map/fn       core.clj: 2772
                      clojure.core/partial/fn       core.clj: 2641
                        clojure.walk/postwalk       walk.clj:   53
                            clojure.walk/walk       walk.clj:   46
                      clojure.core/partial/fn       core.clj: 2641
                        clojure.walk/postwalk       walk.clj:   53
                            clojure.walk/walk       walk.clj:   51
             clojure.walk/postwalk-replace/fn       walk.clj:  124
                       clojure.core/contains?       core.clj: 1506
                                          ...
java.lang.StackOverflowError:

I haven’t yet been able to figure out which combination of rule and query triggers the bug: a rule that works in one query fails in a slightly different one. Also, a query that fails against my local RocksDB-backed dev database succeeds against an empty in-memory db (it returns an empty result without this stack trace).

I’m not sure what the best approach for debugging this further is.

Hi, I don’t think we’ve seen this kind of planning-time stack overflow in the wild so far. I expect it will be hard to investigate without an example offending query to hand though, so that’s probably the place to focus efforts first.

Mutually recursive rules should work correctly, e.g. see xtdb/query_test.clj at commit f415b666f48e112271c7eeac3c4224cd7144999a in the xtdb/xtdb repository on GitHub.

Are you able to log the queries somewhere before running them (i.e. before calling q) and correlate them with the failures after the fact?
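
As a rough illustration of that kind of logging, here is a minimal sketch. The names `logged-q` and `query-log` are hypothetical and not part of XTDB. The helper takes the query function as an argument, so it assumes nothing about the XTDB API beyond being handed something callable. It catches Throwable rather than Exception because StackOverflowError is an Error.

```clojure
;; Hypothetical helper, not part of XTDB: records each query before it runs,
;; then marks the entry as :ok or :failed so failures can be correlated later.
(defonce query-log (atom []))

(defn logged-q
  "Appends `query` and `args` to `query-log`, then calls (apply q-fn query args).
  Rethrows anything the query function throws after recording the failure."
  [q-fn query & args]
  (let [entry {:query query :args (vec args) :status :started}
        ;; swap! returns the vector produced by this call, so its last index
        ;; is the position of the entry we just added.
        idx   (dec (count (swap! query-log conj entry)))]
    (try
      (let [result (apply q-fn query args)]
        (swap! query-log assoc-in [idx :status] :ok)
        result)
      ;; StackOverflowError is a java.lang.Error, so catch Throwable.
      (catch Throwable t
        (swap! query-log update idx assoc
               :status :failed
               :error (.getName (class t)))
        (throw t)))))
```

With XTDB this might be called as `(logged-q (partial xt/q db) query)`, assuming `xt` aliases `xtdb.api` and `db` is a database snapshot. Entries in `@query-log` with `:status :failed` would then be the queries (including their `:rules`) to try reducing to a small example.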