Weird query behavior for a document containing a property with an empty list as its value

Hi XTDB team, I have been using XTDB for quite a long time but just found a weird query behavior regarding a document containing a property with an empty list as its value.

  1. I have a simple document as below:
data = mapOf(
            "modelInfo" to listOf<Map<String, Any>>(),
            "underlyer" to "SSR",
            "modelName" to "RISK_FREE_CURVE",
            "id" to "RISK_FREE_CURVE.SSR"
)

‘modelInfo’ has an empty list as its value.

  2. Then I save it in XTDB:
 val doc = XtdbDocument.builder(id).putAll(data).build()
 val vt = DateTimeConverter.ZDT2Date(ZonedDateTime.now())
 val transaction = Transaction.builder().put(doc, vt).build()
 val tt = db.submitTx(transaction)
  3. I verified that the document is actually persisted:
val en = db.db(vt).entity(id)
println("entity does saved in xtdb")
println(en.toMap())
  4. Then I ran a few queries. The first one is simple and works:
val name = "RISK_FREE_CURVE"
val q1 = "{:find [(pull entity [*])], :where [[entity :modelName \"$name\"]]}"
db.db(vt).query(Clojure.read(q1)).flatten().map { it as IPersistentMap }.forEach { println(it) }
  5. However, the second query doesn’t work:
val q2 = "{:find [(pull entity [*])], :where [[entity :modelInfo modelInfo] [entity :modelName modelName] [entity :underlyer underlyer] [(== modelName \"${name}\")]], :in []}"
db.db(vt).query(Clojure.read(q2)).flatten().map { it as IPersistentMap }.forEach { println(it) }

This time, the query tries to find a document that has the properties ‘modelInfo’, ‘modelName’ and ‘underlyer’, where ‘modelName’ is ‘RISK_FREE_CURVE’. I expected the saved document to be returned, but the result is empty.

Then there are two weird things:

  1. If I remove ‘[entity :modelInfo modelInfo]’ from the query, it actually works:
val q3 = "{:find [(pull entity [*])], :where [[entity :modelName modelName] [entity :underlyer underlyer] [(== modelName \"${name}\")]], :in []}"
db.db(vt).query(Clojure.read(q3)).flatten().map { it as IPersistentMap }.forEach { println(it) }

This time I am able to obtain the saved document.

  2. If I change the original document a bit by mapping ‘modelInfo’ to a non-empty list, the query in step 5 works as expected:
val data2 = mapOf(
            "modelInfo" to listOf<Map<String, Any>>(emptyMap()), // the list is not empty anymore
            "underlyer" to "SSR",
            "modelName" to "RISK_FREE_CURVE2",
            "id" to "RISK_FREE_CURVE.SSR2"
    )
val q2 = "{:find [(pull entity [*])], :where [[entity :modelInfo modelInfo] [entity :modelName modelName] [entity :underlyer underlyer] [(== modelName \"${name}\")]], :in []}"
db.db(vt).query(Clojure.read(q2)).flatten().map { it as IPersistentMap }.forEach { println(it) }

This time query q2 returns the saved document.

I am testing against a pretty early version of XTDB (1.20.0), and here’s the full repro:

  1. start a docker xtdb
docker run -p 3000:3000 juxt/xtdb-in-memory:1.20.0
  2. full repro script, although it’s Kotlin:
package xtdb.api

import clojure.java.api.Clojure
import clojure.lang.IPersistentMap
import tech.tongyu.util.DateTimeConverter
import xtdb.api.tx.Transaction
import java.time.Duration
import java.time.ZonedDateTime
import kotlin.system.exitProcess

@Suppress("UNCHECKED_CAST")
fun main() {
    val db = IXtdb.newApiClient("http://localhost:3000")
    val data = mapOf(
            "modelInfo" to listOf<Map<String, Any>>(),
            "underlyer" to "SSR",
            "modelName" to "RISK_FREE_CURVE",
            "id" to "RISK_FREE_CURVE.SSR"
    )
    val data2 = mapOf(
            "modelInfo" to listOf<Map<String, Any>>(emptyMap()),
            "underlyer" to "SSR",
            "modelName" to "RISK_FREE_CURVE2",
            "id" to "RISK_FREE_CURVE.SSR2"
    )
    test_query(db, data, "RISK_FREE_CURVE")
    test_query(db, data2, "RISK_FREE_CURVE2")
    exitProcess(0)
}

fun test_query(db: IXtdb, data: Map<String, Any>, name:String) {
    val id = data["id"] as String
    val doc = XtdbDocument.builder(id).putAll(data).build()
    val vt = DateTimeConverter.ZDT2Date(ZonedDateTime.now())
    val transaction = Transaction.builder().put(doc, vt).build()
    val tt = db.submitTx(transaction)
    db.awaitTx(tt, Duration.ofSeconds(30))
    val en = db.db(vt).entity(id)
    println("entity does saved in xtdb")
    println(en.toMap())
    val q1 = "{:find [(pull entity [*])], :where [[entity :modelName \"$name\"]]}"
    val q2 = "{:find [(pull entity [*])], :where [[entity :modelInfo modelInfo] [entity :modelName modelName] [entity :underlyer underlyer] [(== modelName \"${name}\")]], :in []}"
    val q3 = "{:find [(pull entity [*])], :where [[entity :modelName modelName] [entity :underlyer underlyer] [(== modelName \"${name}\")]], :in []}"
    println("simple where clause that works")
    db.db(vt).query(Clojure.read(q1)).flatten().map { it as IPersistentMap }.forEach { println(it) }
    println("binding doesn't work")
    db.db(vt).query(Clojure.read(q2)).flatten().map { it as IPersistentMap }.forEach { println(it) }
    println("binding works  by removing modelInfo")
    db.db(vt).query(Clojure.read(q3)).flatten().map { it as IPersistentMap }.forEach { println(it) }
}

Please let me know if there’s anything else I can provide, and thanks in advance for the help.

Thanks,
-BS

Hey @blshao84 - I believe this is the expected behaviour.

Clojure vectors and sets are always ‘decomposed’ into triples before being indexed (see Datalog Transactions · XTDB Docs), and I expect the java.util.ArrayList is being interpreted as a vector. This means an empty list will not have any corresponding entry in the index and therefore your clause won’t be able to match against anything.
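
To make the decomposition point concrete, here is a toy Kotlin model of it. This is only a sketch of the idea, not XTDB's actual index code: `IndexEntry`, `decompose` and `hasAttribute` are hypothetical names, and the assumption is simply that a list or set contributes one entry per element while any other value contributes a single entry.

```kotlin
// Toy model of attribute-value indexing. NOT XTDB's real implementation;
// all names here are hypothetical and exist only for illustration.
data class IndexEntry(val entity: String, val attribute: String, val value: Any?)

// Lists and sets contribute one entry per element,
// every other value contributes exactly one entry.
fun decompose(id: String, doc: Map<String, Any?>): List<IndexEntry> =
    doc.flatMap { (attribute, value) ->
        when (value) {
            is List<*> -> value.map { IndexEntry(id, attribute, it) }
            is Set<*> -> value.map { IndexEntry(id, attribute, it) }
            else -> listOf(IndexEntry(id, attribute, value))
        }
    }

// A clause such as [entity :modelInfo modelInfo] can only match
// if at least one entry exists for that entity and attribute.
fun hasAttribute(index: List<IndexEntry>, id: String, attribute: String): Boolean =
    index.any { it.entity == id && it.attribute == attribute }

fun main() {
    val emptyDoc = mapOf(
        "modelInfo" to listOf<Map<String, Any>>(),
        "modelName" to "RISK_FREE_CURVE"
    )
    val nonEmptyDoc = mapOf(
        "modelInfo" to listOf(emptyMap<String, Any>()),
        "modelName" to "RISK_FREE_CURVE2"
    )
    val index = decompose("RISK_FREE_CURVE.SSR", emptyDoc) +
        decompose("RISK_FREE_CURVE.SSR2", nonEmptyDoc)

    println(hasAttribute(index, "RISK_FREE_CURVE.SSR", "modelInfo"))  // prints false
    println(hasAttribute(index, "RISK_FREE_CURVE.SSR2", "modelInfo")) // prints true
}
```

Under this model the empty list leaves no entry for ‘modelInfo’, which mirrors why q2 returns nothing for the first document while q3, which never mentions ‘modelInfo’, still matches.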

To work around this I think you could:

  1. store an explicit nil instead of an empty list
  2. use a different collection type (e.g. clojure.lang.PersistentList$EmptyList, a Clojure list, which doesn’t get decomposed)
  3. use a second attribute to indicate that an empty list was submitted (e.g. hasEmptyModelInfo = true)
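
Workarounds 1 and 3 can be applied to the plain map before it is handed to the document builder. The following is a minimal Kotlin sketch under stated assumptions: `emptyListsToNull` and `flagEmptyLists` are hypothetical helper names, the flag name follows the `hasEmptyModelInfo` example above, and whether `XtdbDocument.builder(...).putAll(...)` accepts null values in a given XTDB version is something to verify rather than take from this sketch.

```kotlin
// Hypothetical helper for workaround 1:
// replace every empty list with an explicit null, keeping the key in the map.
fun emptyListsToNull(data: Map<String, Any?>): Map<String, Any?> =
    data.mapValues { (_, value) ->
        if (value is List<*> && value.isEmpty()) null else value
    }

// Hypothetical helper for workaround 3:
// keep the empty list, and add a flag attribute such as "hasEmptyModelInfo" = true.
fun flagEmptyLists(data: Map<String, Any?>): Map<String, Any?> {
    val flags = data
        .filterValues { it is List<*> && it.isEmpty() }
        .keys
        .associate { key -> "hasEmpty" + key.replaceFirstChar { it.uppercaseChar() } to true }
    return data + flags
}

fun main() {
    val data = mapOf(
        "modelInfo" to listOf<Map<String, Any>>(),
        "underlyer" to "SSR",
        "modelName" to "RISK_FREE_CURVE",
        "id" to "RISK_FREE_CURVE.SSR"
    )
    println(emptyListsToNull(data))
    println(flagEmptyLists(data))
}
```

With the first helper, code that reads the document back has to treat a null ‘modelInfo’ as "no entries". With the second, the query would bind the flag attribute instead of ‘modelInfo’ for documents whose list is empty.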

Hope that helps,

Jeremy

Thanks Jeremy! It’s good to know what the expected behavior is, and I can go with workaround 1.
