Several questions on 2.0

I’ve been reading the 2.x materials on the website and had some initial thoughts/questions :slight_smile:

  • Do you have a rough idea on when Pull and recursive rules will be available?

  • From the “Roadmap” section:

    Database VIEWs: gradual schema, writeable views, nested views, materialized views

    This would be incredible. Could you elaborate on this and whether it is more of a ‘wish’ or ‘coming, just a ways off’ kind of a thing? :slight_smile:

  • Does this enable a better disk usage story for transacting mostly the same doc over and over?

    Column-orientation allows for advanced on-disk compression

  • Just a vague musing at this point. Tables and the query match clause take care of the common “document type” concern. But outside of transactions and queries, the type/table of a doc will still be left as a user-space concern? Somehow this might connect to ‘lookup refs’ (like [<table> <id>]), which I think Jon mentioned you are thinking about, but…

  • Which leads to: Have you nailed down which data types will be valid for :xt/id?

And a comment: the new match syntax (for specifying doc fields) is neat.


Hey @zeitstein - first of all, thanks for taking the time to play with the EA and send us your feedback!

Not as yet. Once the Conj folks are back in the UK we’ll have a debrief, collate all the feedback we’ve received and plan out the running order in more detail for the next few months. We’ve had a fair few people ask us about pull, fewer about recursive rules - could you tell us more about how the latter would be important for your use cases?

Very much a ‘wish’ at the moment, we’ll likely be prioritising performance and reliability in the immediate short term so that we can move towards a GA release - although these things are naturally very exciting to us, and with the ‘inside-out’ architecture we feel we’ve set ourselves up well :slight_smile:

Not specifically for transacting the same doc over and over (although it may well still be helpful) - the main benefits here come from Arrow’s dictionary and run-length encodings, but also from the fact that storing values from the same column together, without other unrelated columns’ data, is likely to be more amenable to standard compression algorithms. We’ve not done extensive work/testing on this yet though :slight_smile:
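For readers unfamiliar with the two encodings mentioned, here is a minimal sketch in plain Python of what dictionary encoding and run-length encoding do to a column of repetitive values. This is purely illustrative - Arrow's actual implementations are far more sophisticated.

```python
# Illustrative sketch only (not Arrow's implementation):
# dictionary encoding replaces repeated values with small integer codes;
# run-length encoding collapses consecutive repeats into (value, count) pairs.

def dictionary_encode(column):
    """Map each distinct value to a small integer code."""
    dictionary = {}
    codes = []
    for value in column:
        if value not in dictionary:
            dictionary[value] = len(dictionary)
        codes.append(dictionary[value])
    return dictionary, codes

def run_length_encode(column):
    """Collapse consecutive repeats into [value, run_length] pairs."""
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs

statuses = ["open", "open", "open", "closed", "closed", "open"]
print(dictionary_encode(statuses))   # ({'open': 0, 'closed': 1}, [0, 0, 0, 1, 1, 0])
print(run_length_encode(statuses))   # [['open', 3], ['closed', 2], ['open', 1]]
```

Both encodings pay off precisely when a column holds many repeated values side by side, which is why columnar layout (values from one column stored contiguously) compresses so much better than row-oriented storage.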

Yes, they’ll largely be a user-space concern. We are considering lookup refs - we have a similar feeling that there’s a good connection here - but we’d need to figure out exactly how they’d best fit with our table-oriented storage/queries.

These are any values that we can store in Arrow, plus a few extensions (keywords, UUIDs, URIs and sets so far).

That’s great to hear, thanks!

James

Thanks, James!

[…] recursive rules - could you tell us more about how the latter would be important for your use cases?

Working with recursive trees/graphs. One example is queries using the location in the graph as one of the conditions.
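To make the use case concrete, the kind of query a recursive rule would express - "all nodes located below a given node" - can be sketched as a plain traversal. The data and names here are hypothetical, for illustration only; with recursive rules this would be a declarative query instead of application code.

```python
# Hedged sketch: what a recursive 'descendant-of' rule would compute,
# written as an explicit traversal over parent -> children edges.
# The tree data and function name are hypothetical.

def descendants(tree, root):
    """All nodes reachable below `root` (transitive closure of child edges)."""
    result = set()
    stack = list(tree.get(root, []))
    while stack:
        node = stack.pop()
        if node not in result:
            result.add(node)
            stack.extend(tree.get(node, []))
    return result

tree = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
print(descendants(tree, "a"))  # {'b', 'c', 'd'}
```

A query using "location in the graph" as a condition would then amount to filtering entities by membership in such a descendant set, pushed down into the query engine rather than computed client-side.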

although these things are naturally very exciting to us, and with the ‘inside-out’ architecture we feel we’ve set ourselves up well :slight_smile:

Glad to hear that!

Yes, they’ll largely be a user-space concern. We are considering lookup refs - we have a similar feeling that there’s a good connection here - but we’d need to figure out exactly how they’d best fit with our table-oriented storage/queries.

Would encoding the table in the id bring performance improvements? I imagine it might help with reads. For my use case, I’d be totally for using [<table> <id>] as ids. As mentioned, methods of encoding the entity type have been a frequent topic of discussion among users. It’s frequently useful to know the type of an entity without requiring access to it. I can also see the API being simplified (e.g. not having to specify tables in :put or match).
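As a rough illustration of the read-side point above: if the table is part of the id itself, the “type” of an entity is known from the reference alone, with no document fetch. This is a hypothetical sketch of the idea, not XTDB’s API - the `EntityId` name and fields are made up for illustration.

```python
# Hypothetical sketch: a [<table> <id>] style composite id carries its
# table, so an entity's type is known without loading the document.
from typing import NamedTuple

class EntityId(NamedTuple):
    table: str  # e.g. "users" - the hypothetical table component
    id: str     # the entity's id within that table

ref = EntityId("users", "u-123")

# The type is available from the reference alone - no lookup required:
print(ref.table)  # users
```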

But there are certainly counter-arguments, even from my limited perspective.

Looking forward to seeing how development unfolds. Thanks for all the hard work. Cheers!
