[ANN] XTDB 2.0.0-beta5

Afternoon folks - it’s another release day :rocket: 2.0.0-beta5 this time - highlights as follows:

  • “Template-friendly SQL” - if you’ve ever had to battle with the amount of commas or ANDs in your dynamically-generated SQL query, this one’s for you.
    • We’ve taken inspiration from Clojure’s “commas-as-whitespace” and, WHERE possible (I’m sorry ), you can separate your predicates and projections (in the SELECT clause) with as many or as few commas as you like (yes, commas in the WHERE clause too). If you’re using YeSQL or HugSQL, for example, this means you can end all of your dynamically-included predicates with a comma (even the first/last one) without worrying about syntax errors.
    • We’ve also brought some of the pipelining from XTQL into our SQL dialect - while you can obviously still use the top-level structure of standard SQL (i.e. your tooling all still works), we’ve also added the ability to pipeline (e.g.) aggregations. Start your query with FROM instead, then write as many WHERE, SELECT, GROUP BY etc as you like.
      • While we still support HAVING, it’s now just a WHERE that just runs after the aggregation. Some may say it always was … but you can now write FROM foo SELECT a, COUNT(*) AS a_count WHERE a_count > 10
      • calculate frequency distribution:
      -- standard SQL - subqueries
      SELECT order_count, COUNT(*) AS freq
      FROM (SELECT customer, COUNT(*) AS order_count
            FROM orders
            GROUP BY customer) counts
      GROUP BY order_count
      ORDER BY order_count DESC
      
      -- XT SQL
      FROM orders
      SELECT customer, COUNT(*) AS order_count
      SELECT order_count, COUNT(*) AS freq
      ORDER BY order_count DESC
      
      • filtering on the results of window functions:
      -- standard SQL - subqueries again
      SELECT _id
      FROM (SELECT _id, ROW_NUMBER () OVER (PARTITION BY v ORDER BY _id) 
      row_num
            FROM foo) subq
      WHERE row_num = 0
      
      -- XT
      FROM foo
      SELECT _id, ROW_NUMBER () OVER (PARTITION BY v ORDER BY _id) row_num
      WHERE row_num = 0 -- N.B. WHERE applied after SELECT
      SELECT _id
      
      • This removes a lot of the necessity for sub-queries in common cases - if you need multiple aggregations (e.g. for a frequency distribution), or you need to filter by the result of a window function (not possible in standard SQL), just include another SELECT clause.
  • ~4x faster ingestion through the Postgres wire-server. While we’ve always (and still do) recommend batching up your inserts where possible, we’ve made it (much) faster in the cases where you can’t, by reducing our per-transaction overhead.

As always, full details in the release notes, and let us know what you think in the usual channels :pray:

Cheers,

James & the XT Team

1 Like

If anyone wants to see the templating and pipelining live in action, I’ll be running through how it all works in an Office Hours session this Thursday - please come along!