While transforming the plan into vectors is interesting, I wish they'd gone into more detail about how the ML model prunes candidate plans and selects the best one. It is also not clear what attributes of a plan the corresponding vector encodes.
I do not know much about Databloom, but it looks like this "Learning-Based Query Optimizer" is built for specific use cases in a data engineering/analytics setting (like K-means, as cited in the article). It might not be a replacement for optimizers in traditional databases.
> not clear what attributes of a plan the corresponding vector encodes
Fig 5, page 4 from [1]:
> Topology Features
> Operator Features
> Data Movement Features
> Dataset Features
That is for a single logical plan, meaning the vector's length will differ for another query. (This is the part I don't get: do you learn a new model per query? Can you learn with a variable feature length?)
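For what it's worth, one generic way to learn from plans of different sizes is to pool the per-operator features into a fixed-length vector before it reaches the model. I don't know whether [1] does anything like this. The sketch below is only an illustration of that general idea, and the operator vocabulary, the `rows` field, and `encode_plan` are all made up for the example:

```python
from typing import Dict, List

# Hypothetical operator vocabulary; not taken from the paper.
OPERATORS = ["scan", "filter", "map", "join", "reduce"]


def encode_plan(plan: List[Dict]) -> List[float]:
    """Pool a variable-length list of operators into a fixed-length vector.

    Layout (assumed for this sketch): one count per operator kind,
    then the number of operators, then the summed row estimates.
    """
    counts = [0.0] * len(OPERATORS)
    total_rows = 0.0
    for op in plan:
        counts[OPERATORS.index(op["kind"])] += 1.0
        total_rows += op.get("rows", 0.0)
    return counts + [float(len(plan)), total_rows]


# Two plans with different numbers of operators.
short_plan = [
    {"kind": "scan", "rows": 1000.0},
    {"kind": "filter", "rows": 100.0},
]
long_plan = [
    {"kind": "scan", "rows": 1000.0},
    {"kind": "scan", "rows": 500.0},
    {"kind": "join", "rows": 2000.0},
    {"kind": "map", "rows": 2000.0},
    {"kind": "reduce", "rows": 10.0},
]

# Both encodings have the same length, so one model can consume either.
print(len(encode_plan(short_plan)), len(encode_plan(long_plan)))
```

Pooling like this throws away the plan's topology, so a real encoding would need something extra to keep it. The point is only that variable-size plans don't by themselves force a model per query.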