How to configure RocksDB

Hi team,

This is my db.edn config:

{:xtdb.jdbc/connection-pool {:dialect {:xtdb/module xtdb.jdbc.psql/->dialect}
                             :db-spec {:jdbcUrl "jdbc:postgresql://localhost:5432/"}}
 :xtdb.rocksdb/block-cache {:xtdb/module xtdb.rocksdb/->lru-block-cache
                            :cache-size 536870912}
 :xtdb/index-store {:kv-store {:xtdb/module xtdb.rocksdb/->kv-store
                               :db-dir "/var/tmp/txs"
                               :block-cache :xtdb.rocksdb/block-cache}}
 :xtdb/document-store {:xtdb/module xtdb.jdbc/->document-store
                       :connection-pool :xtdb.jdbc/connection-pool}
 :xtdb/tx-log {:xtdb/module xtdb.jdbc/->tx-log
               :connection-pool :xtdb.jdbc/connection-pool}}

I see from the docs (RocksDB · XTDB Docs) that I can pass RocksDB options here. Do you have an example?

The reason I’d like to configure RocksDB is that when I rebuild my index, it only uses 2 or 3 threads. I guess this might be configurable through RocksDB?

Thanks,
-BS

Hey @blshao84, the default RocksDB module configuration will use n-CPUs-1 max background jobs (for compaction & flushing) - you can see the option is passed through here. You can override it by creating your own DBOptions object and passing it in as another key-value argument for your edn module (alongside :db-dir - see). However, that means no longer using a plain edn file, because the DBOptions must be constructed programmatically.

In theory though the (.availableProcessors (Runtime/getRuntime)) calculation should be sufficient to consume more than just 2 or 3 threads if you have more cores available to the JVM. Perhaps there’s another JVM setting you need to adjust :thinking: I suppose the questions are: what does (.availableProcessors (Runtime/getRuntime)) (or equivalent in Java) return for you? What is your JVM config and how many cores do you have? :slightly_smiling_face:
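For the Java equivalent asked about above, a minimal JDK-only sketch follows. The class and method names are made up for illustration, and the plain n - 1 arithmetic is an assumption taken from the "n-CPUs-1" description in this thread, not from the XTDB source. Note that the JVM can report fewer processors than the host has, for example under container CPU limits or when -XX:ActiveProcessorCount is set.

```java
final class CpuCheck {
    /** Processors visible to this JVM, which may be fewer than the host's cores. */
    static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** The "n-CPUs-1" figure described in the thread (assumed to be a plain subtraction). */
    static int expectedBackgroundJobs(int processors) {
        return processors - 1;
    }

    public static void main(String[] args) {
        int n = visibleProcessors();
        System.out.println("availableProcessors = " + n);
        System.out.println("n-CPUs-1            = " + expectedBackgroundJobs(n));
    }
}
```

Running this inside the same JVM (same flags, same container) as the XTDB node is what makes the number meaningful.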

Thanks Jeremy~

I have double-checked that my JVM returns the actual number of cores, in my case 32, for availableProcessors.

I also find that it is not only rebuilding the index that is “slow” (in terms of parallelism); I have also hit a bottleneck with queries. For example, in my current setup I have an XTDB node running on a 32-core virtual machine (there’s no other job running). If I send 100 query requests one by one, each takes 0.12 sec on average. However, when I send 100 requests in parallel, each query takes about 2 sec on average. I did see from the log that 32 requests are received and processed in parallel, and it seemed that it was just db().query() that took longer under parallelism.

It’s understandable that db().query() might lead to some IO which slows everything down, but how can I confirm it’s indeed the IO that is the bottleneck?

I see RocksDB has a metrics module, but how do I use it? For example, after I configure the metrics, where am I supposed to look for them?


I have double-checked that my JVM returns the actual number of cores, in my case 32, for availableProcessors.

Interesting, thanks for confirming. Not sure how the configuration calculation could be wrong in that case. I should confirm, which XT version are you using?

If I send 100 query requests one by one, each takes 0.12 sec on average. However, when I send 100 requests in parallel, each query takes about 2 sec on average

I suspect that’s because it’s creating 100 underlying RocksDB snapshots vs ~1 (that gets re-used). Are you attempting to re-use the same db across threads? You will likely need to measure & tune for the appropriate level of parallelism here.
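One rough way to measure this is to run the same batch of requests at several pool sizes and compare the mean per-request latency. The sketch below is JDK-only; `ParallelismProbe` and the sleeping `fakeQuery` are hypothetical stand-ins, and a real test would call the application's own query path instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelismProbe {
    /**
     * Runs `requests` copies of `task` on a pool of `threads` workers and returns
     * the mean time spent inside the task, in milliseconds (queue wait excluded).
     */
    static double meanLatencyMillis(Runnable task, int requests, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                jobs.add(() -> {
                    long start = System.nanoTime();
                    task.run();
                    return System.nanoTime() - start;
                });
            }
            long totalNanos = 0;
            // invokeAll blocks until every job has finished.
            for (Future<Long> done : pool.invokeAll(jobs)) {
                totalNanos += done.get();
            }
            return totalNanos / 1_000_000.0 / requests;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("probe failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a real query call.
        Runnable fakeQuery = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int threads : new int[] {1, 4, 8, 16, 32}) {
            System.out.printf("threads=%d mean=%.2f ms%n",
                    threads, meanLatencyMillis(fakeQuery, 100, threads));
        }
    }
}
```

If the mean latency climbs sharply past some pool size while throughput stops improving, that pool size is a reasonable first guess at the useful level of parallelism.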

I see RocksDB has a metrics module, but how do I use it? For example, after I configure the metrics, where am I supposed to look for them?

They should be available alongside the other metrics provided (by default) by the xtdb-metrics module - see Monitoring · XTDB Docs

I do indeed create a ‘db(vt,tt)’ instance for each query, so in my previous example that’s 100 instances, even though they are all the same.

Is it recommended to reuse a db snapshot as much as possible?

Thanks

-BS

On Tuesday, 23 July 2024 at 01:34, Jeremy Taylor via Discuss XTDB <notifications@xtdb.discoursemail.com> wrote:

I tried re-using an existing db snapshot by caching on vt and tt. However, it doesn’t help much. I find that most of the time is actually spent in the ‘query’ method below:

    override suspend fun query(xtdbQuery: XTDBQuery, vt: ZonedDateTime, tt: ZonedDateTime): List<DalDocument> {
        ...
        val db = db(vt, tt) // db(vt,tt) will return a cached db snapshot, if any
        val t1 = System.currentTimeMillis()
        val dbResults = db.query(Clojure.read(queryString), *xtdbQuery.bindings.toTypedArray()).flatten()
        val t2 = System.currentTimeMillis()
        val docs = dbResults.map {
            XtdbDocument.factory(it as IPersistentMap).toDal()
        }
        val t3 = System.currentTimeMillis()
        logger().info("querying vt=$vt,tt=$tt: ${t2 - t1} ms, converting: ${t3 - t2} ms")
        return docs
    }

Any thoughts?

Within a single thread, yep. And you can use open-db for even more performance (this re-uses even more low level resources across requests) - see Clojure · XTDB Docs

This could well be measuring too coarsely, it’s hard to judge. If you can generate a flamegraph (e.g. with yourkit) you should be able to confirm for sure though.

I ran more experiments; here’s the call graph:

It looks like IO is the bottleneck: most of the time is spent in RocksDB, that time grows under high parallelism, and total throughput drops despite the extra cores.

In this case, is there anything I can configure to tune RocksDB a bit?

Thanks,
-BS

I also tried open-db, but it seems that it can’t be shared across multiple threads at the same time? It crashed my JVM when serving multiple requests. (In my test, I create a single IXtdbSource globally and never close it.)

Thanks,
-BS
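Assuming, as the crash above suggests, that a handle returned by open-db must stay on the thread that opened it, one pattern is to open it per worker task, reuse it for every query in that task, and close it before the task ends. The sketch below shows only that pattern in plain Java. `Handle` is a hypothetical stand-in, not the XTDB API, and its owner-thread check exists purely to make misuse visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class PerThreadHandle {
    /** Hypothetical stand-in for a non-thread-safe, closeable snapshot handle. */
    static final class Handle implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();
        private final Thread owner = Thread.currentThread();

        Handle() {
            OPEN.incrementAndGet();
        }

        String query(String q) {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("handle used off its owning thread");
            }
            return "result of " + q;
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Opens one handle on the calling thread, reuses it for the whole batch, then closes it. */
    static List<String> runBatch(List<String> queries) {
        try (Handle handle = new Handle()) {
            List<String> results = new ArrayList<>();
            for (String q : queries) {
                results.add(handle.query(q));
            }
            return results;
        }
    }

    public static void main(String[] args) {
        System.out.println(runBatch(Arrays.asList("q1", "q2")));
        System.out.println("handles still open: " + Handle.OPEN.get());
    }
}
```

The design choice here is that the handle never escapes `runBatch`, so it cannot be shared between threads by accident and is always closed, unlike a single global handle that is never closed.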

Thanks for the screenshots. I agree Rocks is looking like the bottleneck here. Are you using a local SSD? Is it the fastest SSD you can access? That is the best place to start. Beyond that though there are many configuration options for RocksDB - in theory the defaults should be good enough, but in case you haven’t seen it before: RocksDB Tuning Guide · facebook/rocksdb Wiki · GitHub

The other angle of approach is to understand the query plan - if XT has picked a poor plan then the algorithmic complexity could be the actual bottleneck rather than RocksDB. Can you share an example of a query? There’s also an xtdb.query/query-plan-for function you can call to get debug information about the calculated join order: xtdb/core/src/xtdb/query.clj at 06132bc1fad63944d144d79a90053b05d78a41c8 · xtdb/xtdb · GitHub

I tested with a reasonably fast SSD (on MacBook Pro M2 lol), but saw similar results.

Here’s our query:
```log
2024-07-26 10:56:05.813 [eventLoopGroupProxy-4-1] INFO t.t.d.client.XTDBClient w/interface - entity-query*{"query":{"query":{"sections":[{"@class":"xtdb.api.query.domain.FindSection","clauses":[{"@class":"xtdb.api.query.domain.Pull","symbol":{"@class":"xtdb.api.query.domain.Symbol","name":"GenStructure"},"spec":{"items":[{"@class":"xtdb.api.query.domain.PullSpec$ALL"}]}}]},{"@class":"xtdb.api.query.domain.WhereSection","clauses":[{"@class":"xtdb.api.query.domain.HasKeyEqualTo","document":{"@class":"xtdb.api.query.domain.Symbol","name":"GenStructure"},"key":{"@class":"xtdb.api.query.domain.Keyword","name":"tech.tongyu.data.model.GenStructure.context"},"value":{"@class":"xtdb.api.query.domain.Symbol","name":"context"}},{"@class":"xtdb.api.query.domain.HasKeyEqualTo","document":{"@class":"xtdb.api.query.domain.Symbol","name":"GenStructure"},"key":{"@class":"xtdb.api.query.domain.Keyword","name":"tech.tongyu.data.model.GenStructure.modelData"},"value":{"@class":"xtdb.api.query.domain.Symbol","name":"modelData"}},{"@class":"xtdb.api.query.domain.HasKeyEqualTo","document":{"@class":"xtdb.api.query.domain.Symbol","name":"GenStructure"},"key":{"@class":"xtdb.api.query.domain.Keyword","name":"tech.tongyu.data.model.GenStructure.modelInfo"},"value":{"@class":"xtdb.api.query.domain.Symbol","name":"modelInfo"}},{"@class":"xtdb.api.query.domain.HasKeyEqualTo","document":{"@class":"xtdb.api.query.domain.Symbol","name":"GenStructure"},"key":{"@class":"xtdb.api.query.domain.Keyword","name":"tech.tongyu.data.model.GenStructure.modelName"},"value":{"@class":"xtdb.api.query.domain.Symbol","name":"modelName"}},{"@class":"xtdb.api.query.domain.HasKeyEqualTo","document":{"@class":"xtdb.api.query.domain.Symbol","name":"GenStructure"},"key":{"@class":"xtdb.api.query.domain.Keyword","name":"tech.tongyu.data.model.GenStructure.modelType"},"value":{"@class":"xtdb.api.query.domain.Symbol","name":"modelType"}},{"@class":"xtdb.api.query.domain.HasKeyEqualTo","document":{"@class":"xtdb.api.query.domain.Symbol","name":"GenStructure"},"key":{"@class":"xtdb.api.query.domain.Keyword","name":"tech.tongyu.data.model.GenStructure.underlyer"},"value":{"@class":"xtdb.api.query.domain.Symbol","name":"underlyer"}},{"@class":"xtdb.api.query.domain.HasKeyEqualTo","document":{"@class":"xtdb.api.query.domain.Symbol","name":"GenStructure"},"key":{"@class":"xtdb.api.query.domain.Keyword","name":"xt/id"},"value":{"@class":"xtdb.api.query.domain.Symbol","name":"xt/id"}},{"@class":"xtdb.api.query.domain.Predicate","type":"EQ","i":{"@class":"xtdb.api.query.domain.Symbol","name":"xt/id"},"j":{"@class":"xtdb.api.query.domain.PlainValue","value":"tech.tongyu.data.model.GenStructure.EOD_CONTEXT.TRADER_VOL.VOL_SURFACE.600309.SH"}}]},{"@class":"xtdb.api.query.domain.BindSection","clauses":[{"@class":"xtdb.api.query.domain.CollectionBind","symbol":{"@class":"xtdb.api.query.domain.Symbol","name":"modelName"}}]}]},"bindings":[["FX_VOL","TRADER_VOL","RISK_FREE_CURVE","DIVIDEND_CURVE","CORRELATION_MATRIX"]]},"vt":"2024-06-30T19:00:00+08:00","tt":"2024-06-30T10:12:45.688+08:00"}
2024-07-26 10:56:05.822 [eventLoopGroupProxy-4-1] DEBUG xtdb.query - :query {:find [(pull GenStructure [*])], :where [[GenStructure :tech.tongyu.data.model.GenStructure.context context] [GenStructure :tech.tongyu.data.model.GenStructure.modelData modelData] [GenStructure :tech.tongyu.data.model.GenStructure.modelInfo modelInfo] [GenStructure :tech.tongyu.data.model.GenStructure.modelName modelName] [GenStructure :tech.tongyu.data.model.GenStructure.modelType modelType] [GenStructure :tech.tongyu.data.model.GenStructure.underlyer underlyer] [GenStructure :xt/id xt/id] [(== xt/id "tech.tongyu.data.model.GenStructure.EOD_CONTEXT.TRADER_VOL.VOL_SURFACE.600309.SH")]], :in {:bindings [[:collection [modelName ]]]}}
2024-07-26 10:56:05.826 [eventLoopGroupProxy-4-1] DEBUG xtdb.query - :triple-joins-var->cardinality {modelName 0.0079500817227403, GenStructure 1047.9502002488548, xt/id 29.699224709972576, context 1.7976931348623157E308}
2024-07-26 10:56:05.827 [eventLoopGroupProxy-4-1] DEBUG xtdb.query - :triple-clause-var-order [modelName GenStructure xt/id context]
2024-07-26 10:56:05.829 [eventLoopGroupProxy-4-1] DEBUG xtdb.query - :join-order :ave modelName GenStructure {:e GenStructure, :a :tech.tongyu.data.model.GenStructure.modelName, :v modelName}
2024-07-26 10:56:05.829 [eventLoopGroupProxy-4-1] DEBUG xtdb.query - :join-order :aev GenStructure xt/id {:e GenStructure, :a :crux.db/id, :v xt/id}
2024-07-26 10:56:05.829 [eventLoopGroupProxy-4-1] DEBUG xtdb.query - :join-order :aev GenStructure context {:e GenStructure, :a :tech.tongyu.data.model.GenStructure.context, :v context}
2024-07-26 10:56:06.091 [eventLoopGroupProxy-4-1] DEBUG xtdb.memory - :pool-allocation-stats {:allocated 393216, :deallocated 0, :in-use 393216}
2024-07-26 10:56:07.427 [eventLoopGroupProxy-4-1] DEBUG xtdb.memory - :pool-allocation-stats {:allocated 524288, :deallocated 0, :in-use 524288}
2024-07-26 10:56:07.477 [eventLoopGroupProxy-4-1] DEBUG xtdb.memory - :pool-allocation-stats {:allocated 655360, :deallocated 0, :in-use 655360}
2024-07-26 10:56:07.528 [xtdb.io.cleaner-thread] DEBUG xtdb.memory - :pool-allocation-stats {:allocated 655360, :deallocated 131072, :in-use 524288}
2024-07-26 10:56:07.528 [xtdb.io.cleaner-thread] DEBUG xtdb.memory - :pool-allocation-stats {:allocated 655360, :deallocated 262144, :in-use 393216}
2024-07-26 10:56:09.071 [eventLoopGroupProxy-4-1] INFO ktor.application - 200 OK: POST - /_xtdb/query in 3360ms
```

Thanks for the logs, however if you are able to call xtdb.query/query-plan-for directly before running the query that would give one extra useful piece of debug information (:vars-in-join-order) that is unfortunately not included here in the standard debug logging.

If it’s not too difficult to attempt, please could you try directly splicing the modelName inputs into the query rather than using :in parameters (i.e. not treating it as a single parameterised query that gets cached, but instead running it as many different queries)? I wonder whether that might be causing/contributing to the slowness, e.g. per Improve :in query join ordering · Issue #1447 · xtdb/xtdb · GitHub

Thanks Jeremy ~

Do you mean I should turn ‘modelName’ into an ‘or’ clause like this?

{:find [(pull GenStructure [*])], :where [[GenStructure :tech.tongyu.data.model.GenStructure.context context] [GenStructure :tech.tongyu.data.model.GenStructure.modelData modelData] [GenStructure :tech.tongyu.data.model.GenStructure.modelInfo modelInfo] [GenStructure :tech.tongyu.data.model.GenStructure.modelName modelName] [GenStructure :tech.tongyu.data.model.GenStructure.modelType modelType] [GenStructure :tech.tongyu.data.model.GenStructure.underlyer underlyer] [GenStructure :xt/id xt/id] [(== xt/id "tech.tongyu.data.model.GenStructure.EOD_CONTEXT.TRADER_VOL.VOL_SURFACE.600309.SH")] (or [(== modelName "FX_VOL")] [(== modelName "TRADER_VOL")] [(== modelName "RISK_FREE_CURVE")] [(== modelName "DIVIDEND_CURVE")] [(== modelName "CORRELATION_MATRIX")])], :in []}

However, this is much slower still (even on a single thread).

Am I getting something wrong?

Regarding calling ‘query-plan-for’, it looks like there’s no Java API, but I managed to copy the query into Clojure and use the Clojure API directly for both queries (one using ‘in’ and the other using ‘or’).

  • OR
(def or-query '{
	:find [(pull GenStructure [*])],
	:where [
		[GenStructure :tech.tongyu.data.model.GenStructure.context context] 
		[GenStructure :tech.tongyu.data.model.GenStructure.modelData modelData] 
		[GenStructure :tech.tongyu.data.model.GenStructure.modelInfo modelInfo] 
		[GenStructure :tech.tongyu.data.model.GenStructure.modelName modelName] 
		[GenStructure :tech.tongyu.data.model.GenStructure.modelType modelType] 
		[GenStructure :tech.tongyu.data.model.GenStructure.underlyer underlyer] 
		[GenStructure :xt/id xt/id] 
		[(== xt/id "tech.tongyu.data.model.GenStructure.EOD_CONTEXT.TRADER_VOL.VOL_SURFACE.600309.SH")] 
		(or [(== modelName "FX_VOL")] 
				[(== modelName "TRADER_VOL")] 
				[(== modelName "RISK_FREE_CURVE")] 
				[(== modelName "DIVIDEND_CURVE")] 
				[(== modelName "CORRELATION_MATRIX")]
		)
	],
	:in []
})

 (q/query-plan-for (xt/db node) or-query )
{:depth->constraints
 [nil
  [#object[xtdb.query$built_in_unification_pred$unification_constraint__13426 0x1c3455e5 "xtdb.query$built_in_unification_pred$unification_constraint__13426@1c3455e5"]]
  [#object[xtdb.query$eval13328$fn__13330$pred_get_attr_constraint__13336 0x2ed0f3ec "xtdb.query$eval13328$fn__13330$pred_get_attr_constraint__13336@2ed0f3ec"]
   #object[xtdb.query$eval13328$fn__13330$pred_get_attr_constraint__13336 0x64ea9235 "xtdb.query$eval13328$fn__13330$pred_get_attr_constraint__13336@64ea9235"]
   #object[xtdb.query$eval13328$fn__13330$pred_get_attr_constraint__13336 0x6af12899 "xtdb.query$eval13328$fn__13330$pred_get_attr_constraint__13336@6af12899"]
   #object[xtdb.query$eval13328$fn__13330$pred_get_attr_constraint__13336 0x54d4c35f "xtdb.query$eval13328$fn__13330$pred_get_attr_constraint__13336@54d4c35f"]]
  [#object[xtdb.query$build_or_constraints$iter__13572__13576$fn__13577$fn__13578$or_constraint__13585 0x7db205ba "xtdb.query$build_or_constraints$iter__13572__13576$fn__13577$fn__13578$or_constraint__13585@7db205ba"]]
  nil nil nil nil nil]
 
 :var->range-constraints {}
 
 :var->logic-var-range-constraint-fns {}
 
 :vars-in-join-order
 [xt/id GenStructure modelName modelType underlyer modelInfo modelData context]
 
 :var->joins
 {modelName [{:id triple18885
              :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12943 0x4df95b03 "xtdb.query$__GT_binary_index_fn$fn__12943@4df95b03"]}]
  GenStructure [{:id triple18885
                 :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12943 0x4df95b03 "xtdb.query$__GT_binary_index_fn$fn__12943@4df95b03"]}
                {:id triple18886
                 :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12943 0x2eaf17c5 "xtdb.query$__GT_binary_index_fn$fn__12943@2eaf17c5"]}
                {:id triple18887
                 :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12943 0x5c13e774 "xtdb.query$__GT_binary_index_fn$fn__12943@5c13e774"]}]
  xt/id [{:id triple18886
          :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12943 0x2eaf17c5 "xtdb.query$__GT_binary_index_fn$fn__12943@2eaf17c5"]}]
  context [{:id triple18887
            :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12943 0x5c13e774 "xtdb.query$__GT_binary_index_fn$fn__12943@5c13e774"]}]
  modelData [{:id pred-return18888
              :idx-fn #object[xtdb.query$pred_joins$fn__12987$fn__12992 0x31e1a699 "xtdb.query$pred_joins$fn__12987$fn__12992@31e1a699"]}]
  modelInfo [{:id pred-return18889
              :idx-fn #object[xtdb.query$pred_joins$fn__12987$fn__12992 0x36c7c2e5 "xtdb.query$pred_joins$fn__12987$fn__12992@36c7c2e5"]}]
  modelType [{:id pred-return18890
              :idx-fn #object[xtdb.query$pred_joins$fn__12987$fn__12992 0x6080eac7 "xtdb.query$pred_joins$fn__12987$fn__12992@6080eac7"]}]
  underlyer [{:id pred-return18891
              :idx-fn #object[xtdb.query$pred_joins$fn__12987$fn__12992 0x236b89e4 "xtdb.query$pred_joins$fn__12987$fn__12992@236b89e4"]}]}
 
 :var->bindings
 {xt/id #xtdb.query.VarBinding{:result-index 0}
  GenStructure #xtdb.query.VarBinding{:result-index 1}
  modelName #xtdb.query.VarBinding{:result-index 2}
  modelType #xtdb.query.VarBinding{:result-index 3}
  underlyer #xtdb.query.VarBinding{:result-index 4}
  modelInfo #xtdb.query.VarBinding{:result-index 5}
  modelData #xtdb.query.VarBinding{:result-index 6}
  context #xtdb.query.VarBinding{:result-index 7}}
 
 :in-bindings []}
  • IN
(def in-query '{
	:find [(pull GenStructure [*])],
	:where [
		[GenStructure :tech.tongyu.data.model.GenStructure.context context] 
		[GenStructure :tech.tongyu.data.model.GenStructure.modelData modelData] 
		[GenStructure :tech.tongyu.data.model.GenStructure.modelInfo modelInfo] 
		[GenStructure :tech.tongyu.data.model.GenStructure.modelName modelName] 
		[GenStructure :tech.tongyu.data.model.GenStructure.modelType modelType] 
		[GenStructure :tech.tongyu.data.model.GenStructure.underlyer underlyer] 
		[GenStructure :xt/id xt/id] 
		[(== xt/id "tech.tongyu.data.model.GenStructure.EOD_CONTEXT.TRADER_VOL.VOL_SURFACE.600309.SH")]
	],
	:in [[modelName ...]]
})

(def model-names ["FX_VOL" "TRADER_VOL" "RISK_FREE_CURVE" "DIVIDEND_CURVE" "CORRELATION_MATRIX"])

 (q/query-plan-for (xt/db node) in-query model-names)

However it gives me an error:

user=>  (q/query-plan-for db in-query model-names)
Execution error (AssertionError) at xtdb.query/->approx-in-var-cardinalities (query.clj:629).
Assert failed: [1 5]
(= (count in-bindings) (count in-args))

If I wrap model-names in an extra [], so that the number of in-args matches the single :in binding, query-plan-for returns a result:

{:depth->constraints 
 [nil 
  nil 
  [#object[xtdb.query$eval13322$fn__13324$pred_get_attr_constraint__13330 0x5ef4161a "xtdb.query$eval13322$fn__13324$pred_get_attr_constraint__13330@5ef4161a"]
   #object[xtdb.query$eval13322$fn__13324$pred_get_attr_constraint__13330 0x1325a2ba "xtdb.query$eval13322$fn__13324$pred_get_attr_constraint__13330@1325a2ba"]
   #object[xtdb.query$eval13322$fn__13324$pred_get_attr_constraint__13330 0x2980f2f1 "xtdb.query$eval13322$fn__13324$pred_get_attr_constraint__13330@2980f2f1"]
   #object[xtdb.query$eval13322$fn__13324$pred_get_attr_constraint__13330 0x1f9d6cb3 "xtdb.query$eval13322$fn__13324$pred_get_attr_constraint__13330@1f9d6cb3"]]
  [#object[xtdb.query$built_in_unification_pred$unification_constraint__13420 0x61a7ed9c "xtdb.query$built_in_unification_pred$unification_constraint__13420@61a7ed9c"]]
  nil nil nil nil nil]
 
 :var->range-constraints {}
 
 :var->logic-var-range-constraint-fns {}
 
 :vars-in-join-order 
 [modelName GenStructure xt/id modelType underlyer modelInfo modelData context]
 
 :var->joins 
 {modelName [{:id triple18896
              :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12937 0x5fa9a360 "xtdb.query$__GT_binary_index_fn$fn__12937@5fa9a360"]}
             {:id in18899
              :idx-fn #object[xtdb.query$in_joins$fn__12969$fn__12973 0x7110c51e "xtdb.query$in_joins$fn__12969$fn__12973@7110c51e"]}]
  GenStructure [{:id triple18896
                 :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12937 0x5fa9a360 "xtdb.query$__GT_binary_index_fn$fn__12937@5fa9a360"]}
                {:id triple18897
                 :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12937 0x51cd77b "xtdb.query$__GT_binary_index_fn$fn__12937@51cd77b"]}
                {:id triple18898
                 :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12937 0x5b891149 "xtdb.query$__GT_binary_index_fn$fn__12937@5b891149"]}]
  xt/id [{:id triple18897
          :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12937 0x51cd77b "xtdb.query$__GT_binary_index_fn$fn__12937@51cd77b"]}]
  context [{:id triple18898
            :idx-fn #object[xtdb.query$__GT_binary_index_fn$fn__12937 0x5b891149 "xtdb.query$__GT_binary_index_fn$fn__12937@5b891149"]}]
  modelData [{:id pred-return18900
              :idx-fn #object[xtdb.query$pred_joins$fn__12981$fn__12986 0x49290bfb "xtdb.query$pred_joins$fn__12981$fn__12986@49290bfb"]}]
  modelInfo [{:id pred-return18901
              :idx-fn #object[xtdb.query$pred_joins$fn__12981$fn__12986 0x3e660ff5 "xtdb.query$pred_joins$fn__12981$fn__12986@3e660ff5"]}]
  modelType [{:id pred-return18902
              :idx-fn #object[xtdb.query$pred_joins$fn__12981$fn__12986 0x5d22a04d "xtdb.query$pred_joins$fn__12981$fn__12986@5d22a04d"]}]
  underlyer [{:id pred-return18903
              :idx-fn #object[xtdb.query$pred_joins$fn__12981$fn__12986 0x4e32e1f9 "xtdb.query$pred_joins$fn__12981$fn__12986@4e32e1f9"]}]}
 
 :var->bindings 
 {modelName #xtdb.query.VarBinding{:result-index 0}
  GenStructure #xtdb.query.VarBinding{:result-index 1}
  xt/id #xtdb.query.VarBinding{:result-index 2}
  modelType #xtdb.query.VarBinding{:result-index 3}
  underlyer #xtdb.query.VarBinding{:result-index 4}
  modelInfo #xtdb.query.VarBinding{:result-index 5}
  modelData #xtdb.query.VarBinding{:result-index 6}
  context #xtdb.query.VarBinding{:result-index 7}}
 
 :in-bindings 
 [{:idx-id in18899
   :bind-type :collection
   :tuple-idxs-in-join-order [0]}]}

Thanks for the help

Thanks in turn for persevering with getting query-plan-for working - it does I think confirm what I suspected, that modelName is being resolved ahead of xt/id even though modelName is in reality much less selective (or at least I’m guessing so, it’s hard to be 100% sure without seeing the data :slightly_smiling_face:).

The workaround using or should in theory also be something the engine could detect and optimise, but currently that will be naively executed as a fully materialised subquery (+ nested loop join) - so I am not surprised that is slower. Instead though you can try to have a single clause using == with a set literal like:

{:find [(pull GenStructure [*])],
 :where [[GenStructure :tech.tongyu.data.model.GenStructure.context context]
         [GenStructure :tech.tongyu.data.model.GenStructure.modelData modelData]
         [GenStructure :tech.tongyu.data.model.GenStructure.modelInfo modelInfo]
         [GenStructure :tech.tongyu.data.model.GenStructure.modelName modelName]
         [GenStructure :tech.tongyu.data.model.GenStructure.modelType modelType]
         [GenStructure :tech.tongyu.data.model.GenStructure.underlyer underlyer]
         [GenStructure :xt/id xt/id]
         [(== xt/id "tech.tongyu.data.model.GenStructure.EOD_CONTEXT.TRADER_VOL.VOL_SURFACE.600309.SH")]
         [(== modelName #{"FX_VOL" "TRADER_VOL" "RISK_FREE_CURVE" "DIVIDEND_CURVE" "CORRELATION_MATRIX"})]],
 :in []}