Some notes on XTQL from December

g’day

i have looked at the December edition of v2. You have done a good job on XTQL, documentation and all; it is much further along towards being a proper language.

i could not try it for real over HTTP with transit+json from outside (Python): both /query and /tx returned server errors, and that was after quite some reverse engineering of what exactly to send.

But i did study XTQL… and tried to wrap/represent it in grammar-like structures in Python. i like the composability achieved, although there are some quirks to tune.

The biggest problem for me is that quite a lot of the Clojure implementation still leaks into the language: mostly the “keyword” notion, the usage of mappings and sequences, and positional vs (the lack of) named parameters.

For example, Unnest and With both have (or should have) a grammar of a simple kind:

operation ::= ( NameSpec+ ) ;
NameSpec ::= Name | Name_Expr_Pair ;
Name_Expr_Pair ::= { Name : Expr } ;

Whether those Names are treated semantically, in a particular usage, as variable names, column names or other names is irrelevant at the grammar level. And there should be no difference syntactically, especially if the two meanings are never used within the same operation, so there is no need to distinguish them at this level at all.

So there should be no difference between {:somename 123}, {somename 123}, {'somename' 123} or even ['somename' 123]: each is a pair of key_name + value_expr. It is the overarching operation (unify or pipeline) that should decide how to treat those names. (Another, less important, question: is a lone Name essentially a shortcut for a pair like {Name Name}? But that’s hair-splitting.)
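
To make the point concrete, here is a minimal Python sketch of how a client could fold all of those spellings into one (name, expr) form. Everything in it is an assumption for illustration: the function names are made up, keywords are modelled as strings with a leading ':', and a lone Name is treated as a shortcut for {Name Name}, which the docs may or may not intend.

```python
from collections.abc import Mapping

def normalise_name(name):
    """Treat ':somename' and 'somename' as the same name (assumption)."""
    return name[1:] if name.startswith(":") else name

def normalise_spec(spec):
    """Turn one NameSpec into a list of (name, expr) pairs."""
    if isinstance(spec, str):
        # Lone Name: assumed here to be shorthand for {Name Name}.
        name = normalise_name(spec)
        return [(name, name)]
    if isinstance(spec, Mapping):
        return [(normalise_name(k), v) for k, v in spec.items()]
    if isinstance(spec, (list, tuple)) and len(spec) == 2:
        # A two-element sequence is read as a single key/value pair.
        return [(normalise_name(spec[0]), spec[1])]
    raise TypeError(f"not a NameSpec: {spec!r}")

# All three spellings collapse to the same pair.
print(normalise_spec({":somename": 123}))
print(normalise_spec({"somename": 123}))
print(normalise_spec(["somename", 123]))
```

The enclosing operation would then decide whether "somename" means a variable or a column.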

On mappings and sequences… and parameters in general. The fact that Clojure (and its intermediate data formats, EDN and transit+json) treats the comma as whitespace makes reading maps somewhat awkward. All items look the same and the key-value pairs cannot easily be told apart: one has to count items, odd ones being keys and even ones being values. When the keys are :keywords that might be fine, unless the values are also :keywords. And when both keys and values are symbols, everything looks the same.

Similarly, but the other way round, in the case/cond functions the value:result pairs are a mapping but are not represented as such. They could be separated somehow, and the default-expr should be named somehow. Right now there is one long sequence of positional arguments, and it is not clear which is which. (case a b c d e f) and (cond a b c d e) are both valid; go figure which is what.
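
A small Python sketch of the guessing a reader has to do. It assumes the Clojure-like reading, where case takes a test expression first and a trailing unpaired argument is the default; the function names and the dict layout are hypothetical, not anything XTQL defines.

```python
def split_clauses(args):
    """Split a flat argument list into (pairs, has_default, default)."""
    rest = list(args)
    has_default = len(rest) % 2 == 1          # an odd leftover is the default
    default = rest.pop() if has_default else None
    pairs = list(zip(rest[0::2], rest[1::2]))  # (value-or-test, result) pairs
    return pairs, has_default, default

def name_case(args):
    """(case test v1 r1 v2 r2 ... [default]) with every part named."""
    if not args:
        raise ValueError("case needs at least a test expression")
    pairs, has_default, default = split_clauses(args[1:])
    return {"test": args[0], "clauses": pairs,
            "has_default": has_default, "default": default}

def name_cond(args):
    """(cond p1 r1 p2 r2 ... [default]) with every part named."""
    pairs, has_default, default = split_clauses(args)
    return {"clauses": pairs, "has_default": has_default, "default": default}

print(name_case(["a", "b", "c", "d", "e", "f"]))
print(name_cond(["a", "b", "c", "d", "e"]))
```

Under that reading, (case a b c d e f) has test a, clauses b:c and d:e, and default f, while (cond a b c d e) has clauses a:b and c:d and default e. None of that is visible from the flat list itself.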

One of the things that SQL does achieve is being verbose enough to name things, so what each item/variable is can easily be deduced from the operator names around it. It may be much more verbose than needed, and maybe with weird “perspectives” here and there, but still… it does not rely much on positions, colons, bracket kinds and similar stuff to convey (guessed) unnamed-but-important things.

It would be good if there were more named semantics in XTQL, and less “assumed-from-punctuation/position” guessing,
e.g. use “pipeline” as a word instead of “->”.

my 2 cents
ciao
svil

Hey @svil thanks for all the feedback! We have a new JSON-LD (i.e. not transit) endpoint in the works for the end of the month which should fully bypass most of these concerns, and will be a much stronger basis for working with XTQL from Python :slight_smile:

Unlike the EDN (Clojure) representation, the JSON is much more verbose/explicit/obvious-to-non-Clojurists. That said, using a proper SDK with a language-native builder API will undoubtedly be preferable to writing JSON directly; JS and Java SDKs are coming first. Stay tuned!

i don’t mind transit+json; i do mind that the structures the server expects are not described anywhere :slight_smile:

More on the language “ergonomics” and predictability:

  • as i said, use full words instead of abbreviations and punctuation shortcuts, or make those aliases, e.g. “pipeline” instead of ->, getattr or getfield or similar instead of (. x y), “query” instead of q, etc. Avoid those “?” in function or variable names… Maybe only + - * / > < = are okay, since everyone understands them the same way
  • if a thing is plural, give it a plural name; e.g. :args is named okay, but “:bind” is not and should be :binds
  • make it clear where maps must be singular {:a 1}, and where they can have multiple entries {:a 1, :b 2, …}
  • make it very clear where nil or empty containers can be used (and what they mean there). Right now Expr includes nil directly, which probably isn’t expected/allowed everywhere; and return, where, with, aggregate etc. may have zero items
  • probably many other small things that cut into everyday experience :slight_smile:

Just imagine: one day XTQL may be implemented on another server, in another server-side language, maybe not bitemporal at all… just a nicer alternative to SQL/mongo/whatever.
To me the current Clojure form is kind of a transport/representation as well, transporting the intent.

have fun

A few more musings:

Which multiples-of-things are at-least one, and which can be zero?
e.g.

  • are (with) or (without) valid / do they make sense? Probably yes, they change nothing
  • is (return) valid / does it make sense? Probably not
  • is (aggregate) valid / does it make sense? No idea
  • is (where) valid / does it make sense? Would it filter out everything, or let everything through? Different languages/libraries have different ideas of what Boolean_And_of_empty_sequence_of_booleans is… Best is to avoid this ambiguous case, i.e. forbid a “where” without any predicates and make it (where Pred+)
  • (rel something), i.e. without binds? There is an example of that in the docs, a literal, but the docs say [bind+], i.e. at least one

Another thing…
If a pipeline containing just a source is equivalent to that source, then the root of the grammar/language can always be a pipeline (rather than an implicit pipeline-or-source), with a required source and zero or more tails. It might be named “query” instead, as used all over the docs and inside subquery or join.
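
Roughly what i mean, as a Python sketch. The Query class, its method names and the tuple encoding of forms are all made up for illustration; only the idea (one required source, zero or more tails) comes from the text above.

```python
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class Query:
    source: Any                      # required
    tails: Tuple[Any, ...] = ()      # zero or more pipeline operators

    def then(self, op):
        """Return a new Query with one more tail appended."""
        return Query(self.source, self.tails + (op,))

    def to_form(self):
        """Emit the bare source when there are no tails, else a '->' form."""
        if not self.tails:
            return self.source
        return ("->", self.source, *self.tails)

q = Query(("from", "docs"))
print(q.to_form())
print(q.then(("limit", 10)).to_form())
```

The root is always a Query; whether it serialises as a bare source or as a pipeline becomes an encoding detail rather than a grammar alternative.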

ciao

Hey @svil :wave:

Which multiples-of-things are at-least one, and which can be zero?

The intent is that everything that can accept zero arguments, does - this is because, when you’re generating a query (say, if your user can specify a list of filters, and they choose not to specify any), you don’t want to have to handle the edge case specifically.

(Incidentally, the lack of this behaviour is one of the reasons why generating SQL query strings can be a pain!)
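
A rough Python illustration of that edge case, as a sketch only: the table name docs, the filter strings and the tuple encoding of the query form are all hypothetical, and the second function is not a real client API.

```python
def sql_query(filters):
    """Build a SQL string; the zero-filter case needs its own branch."""
    base = "SELECT * FROM docs"
    if not filters:
        return base                  # 'WHERE' followed by nothing is a syntax error
    return base + " WHERE " + " AND ".join(filters)

def form_query(filters):
    """Build a pipeline-shaped form; zero filters need no special case."""
    return ("->", ("from", "docs"), ("where", *filters))

print(sql_query([]))
print(sql_query(["a > 1", "b = 2"]))
print(form_query([]))
print(form_query(["a > 1", "b = 2"]))
```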

e.g.

  • are (with) or (without) valid / do they make sense? Probably yes, they change nothing

Yes, these are purely additive, so zero operands makes sense here.

  • is (return) valid / does it make sense? Probably not

It is valid from an algebraic point-of-view - a projection of zero columns returns a result with the same number of rows as its input, but no columns. e.g. [{}, {}, {}, {}]. I doubt this’ll be used often, but it’s important to be algebraically consistent here.
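
As a plain-Python analogy (not XTQL itself; the project helper and the sample rows are invented for the example), projecting a list of row maps onto zero columns behaves like this:

```python
def project(rows, columns):
    """Keep only the named columns of each row."""
    return [{c: row[c] for c in columns if c in row} for row in rows]

rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}, {"a": 7, "b": 8}]

print(project(rows, ["a"]))   # one column per row
print(project(rows, []))      # same number of rows, no columns
```

The second call gives four empty maps, one per input row.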

  • is (aggregate) valid / does it make sense? No idea

This one is less straightforward. Adding another operand to an aggregate either adds a column to group by, or a function to aggregate - so removing a grouping column means less grouping, removing a function means fewer columns in the output. Going to the extreme, removing all the grouping columns would mean no grouping; removing all the functions would mean no columns - so I’d say (similarly to (return)), (aggregate) should return one {} for every row in the input.

  • is (where) valid / does it make sense? Would it filter out everything, or let everything through? Different languages/libraries have different ideas of what Boolean_And_of_empty_sequence_of_booleans is… Best is to avoid this ambiguous case, i.e. forbid a “where” without any predicates and make it (where Pred+)

where is a boolean conjunction (and), the identity element of which is true, so with zero predicates it lets everything through. This certainly isn’t mathematically ambiguous, at least :slight_smile: (I haven’t personally come across any languages/libraries where ‘and’ with 0-args returns anything else?)
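
Python’s built-in all() follows the same convention, so a sketch of the semantics is short. The where helper and the sample rows are made up for illustration and are not part of any XTQL client.

```python
def where(rows, predicates):
    """Keep the rows for which every predicate holds (a conjunction)."""
    return [row for row in rows if all(p(row) for p in predicates)]

rows = [{"n": 1}, {"n": 2}, {"n": 3}]

print(all([]))                              # conjunction of nothing
print(where(rows, []))                      # no predicates: every row passes
print(where(rows, [lambda r: r["n"] > 1]))  # one predicate filters as usual
```

all([]) is True in Python, so the zero-predicate call returns all three rows unchanged.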

  • (rel something), i.e. without binds? There is an example of that in the docs, a literal, but the docs say [bind+], i.e. at least one

This could arguably be [bind*] - again, would behave like an empty projection.

Cheers, HTH, and thanks for all your feedback and hacking on a Python client :slight_smile:

James