I built a recommender like this at a prior job. We carefully tested it against the original “naive” algorithm, which used direct user-behavior clustering.
What was interesting is that the naive algorithm got better over time and the incremental benefit of our new code got smaller.
Why?
Because the training data for the naive algo included user behavior from the new one. As we created better recommendations, users clicked on them and that fed into the old algo!
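To make that feedback loop concrete, here is a toy sketch (not what we actually ran). Everything in it is an assumption for illustration: the item names, the click-through rates, the starting click log, and the idea that the "naive" algorithm simply shows the most-clicked item. It uses expected clicks rather than random draws so the result is deterministic.

```python
# Hypothetical toy model: item names and click-through rates are made up.
TRUE_CTR = {"a": 0.10, "b": 0.20, "c": 0.40}


def naive_pick(click_log):
    """Naive recommender: show whichever item has the most logged clicks."""
    return max(click_log, key=click_log.get)


def new_pick():
    """Stand-in for the better model: assume it already finds the best item."""
    return max(TRUE_CTR, key=TRUE_CTR.get)


def simulate(rounds, shared_log, impressions=100):
    """Return the per-round lift (new model CTR minus naive CTR)."""
    # Assumed history: past clicks favour a mediocre item.
    click_log = {"a": 50.0, "b": 10.0, "c": 0.0}
    lifts = []
    for _ in range(rounds):
        naive_item, new_item = naive_pick(click_log), new_pick()
        lifts.append(TRUE_CTR[new_item] - TRUE_CTR[naive_item])
        # Expected clicks from the naive arm always land in its training log.
        click_log[naive_item] += impressions * TRUE_CTR[naive_item]
        if shared_log:
            # Contamination: clicks earned by the new model land there too.
            click_log[new_item] += impressions * TRUE_CTR[new_item]
    return lifts


contaminated = simulate(4, shared_log=True)   # lift shrinks to zero
isolated = simulate(4, shared_log=False)      # lift stays constant
```

With a shared log, the naive arm starts recommending the best item after a couple of rounds and the measured lift vanishes; with separate logs it never does. The real effect was gradual, not a step, but the mechanism is the same.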
Coming to your product: what is to prevent a customer from using it for a few weeks, copying down the results, and then using those recommendations forever?
They’ll get most of the benefit for a very small cost.
Hey! While we're mainly looking at the product metadata, our algorithm also learns user-specific patterns for each store. That makes the recommendations hard to copy, since two people could look at the same product but get different recommendations. Also, what to recommend for a particular product can depend on more than just the person and the product: seasonality and trends also play a role (to varying degrees).
All of these things are true. But their benefit over simple recommendations that use zero knowledge about the user needs to be large enough to warrant the ongoing cost.
For the space I was in, recommending hotels, the incremental value of the user behavior was minimal.
On the other hand, getting the zero-knowledge version working really well was worth an extra 11.6%.