The main advantage of a blackbox ML solution is shorter development time to a useful performance level. Creating a transparent, explainable solution typically takes more time, more work, and a higher level of expertise to get to the same performance level. If the problem is complicated and the cost of a mistake is low, then your best approach today is likely to be blackbox. If the cost of a mistake is high, you should not even be considering a blackbox approach (at least not without transparent safeguards). There is a large gray area between these two extremes that requires good engineering judgement to choose well.
I think we definitely need more R&D dedicated to creating easier-to-use, lower-cost approaches to transparent, explainable ML. There is way too much effort devoted to blackbox R&D today. Ultimately, transparent, explainable ML should almost always beat blackbox ML due to a better ability to find and resolve hidden problems, such as dataset biases, that may be holding back performance.
I would also add that blackbox ML solutions are deceptively "simple" approaches. Like a mirage in the desert for software developers: no domain expertise necessary. Or the counter-narrative may even creep in: "domain expertise holds you back - you need to be an outsider with a new method", whereas many problems require domain expertise for a realistic understanding of the complexity of the problem and its solution.
Other narratives that can creep in: "once we get enough data...", which for some problems may as well be never.
Also, for some systems, perturbing the system changes the problem, meaning your dataset and model may not reflect changes in a dynamic / complex system.
Also, blackbox ML approaches struggle to combine data from multiple modalities, which is relevant to many real-world problems (at least with the majority of algorithms that are realistically implementable off the shelf).
> The main advantage of a blackbox ML solution is shorter development time to a useful performance level.
I think that's kinda true but kinda false. It's true that deep learning often makes feature engineering moot. However, a lot of deep learning projects take machine learning engineers and applied scientists along with a host of other support engineers, plus hardware costs (I've seen people at work say: "I only used a 16-GPU instance over a couple weeks of training").
Meanwhile, I consider gradient boosting fairly interpretable, and it can get pretty close results with a lot less tweaking and training time. If you want to go fully non-blackbox, logistic regression with L1 penalization and only a little bit of feature engineering often does really well, probably at a lot less development time / cost compared to those high-cost PhD research scientists.
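To make the comparison concrete, here's a minimal sketch of the kind of L1-penalized logistic regression baseline described above, using scikit-learn; the stock dataset and hyperparameters are placeholders I picked for illustration, not anything from this thread:

```python
# Minimal sketch: L1-penalized logistic regression as an explainable baseline.
# Dataset and regularization strength are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The L1 penalty drives most coefficients to exactly zero, so the surviving
# features and the signs of their weights *are* the explanation.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))

coefs = model.named_steps["logisticregression"].coef_[0]
for name, weight in zip(X.columns, coefs):
    if weight != 0:
        print(f"{name}: {weight:+.3f}")
```

The whole thing trains in seconds on a laptop, which is the cost contrast being drawn with multi-GPU deep learning runs.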
> It's true that deep learning often makes feature engineering moot. However, a lot of deep learning projects take machine learning engineers and applied scientists along with a host of other support engineers, plus hardware costs
DL's reduction of the required feature/model engineering is a big deal for difficult problem domains where the cost of a mistake is low. That doesn't mean you don't still benefit from adding development resources, it's just that the development cost/performance tradeoff is still typically better than with a similar-performing explainable solution. I hope this will change in the future.
> ..., logistic regression with L1 penalization and only a little bit of feature engineering often does really well, ...
While I agree that Lasso is far more explainable than DL, its explainability degrades rapidly as the useful feature dimension increases, and it requires significant feature engineering for good performance on difficult problems.
I would disagree with that statement. There are many ways of getting at what the model is doing: feature importances, SHAP, etc. It's not as clear as logistic regression or a single decision tree, but it's not a black box either.
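For what it's worth, here's a rough sketch of the two inspection tools named above applied to a gradient-boosted model. It assumes the xgboost and shap packages are available and uses a stock dataset purely as a stand-in:

```python
# Minimal sketch: global feature importances plus SHAP attributions for a
# gradient-boosted classifier. Dataset and model settings are placeholders.
import shap
import xgboost
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgboost.XGBClassifier(n_estimators=200, max_depth=3).fit(X, y)

# Built-in importances: a coarse, global ranking of features.
top = sorted(zip(X.columns, model.feature_importances_), key=lambda t: -t[1])
print(top[:5])

# SHAP values: per-prediction attributions that sum to the model's output,
# so you can see which features drove any individual decision.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)
```

Whether that counts as "interpretable" is exactly the disagreement in this thread, but it's a long way from having no visibility at all.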
This isn't how it works, at all. You're going to get better results with 1,000 parameters than with 100 explainable ones. There's a limit to how much humans can understand. We use machine learning to surpass that limit.
> You're going to get better results with 1,000 parameters than with 100 explainable ones.
This is not always true.
Many problems are modeled very well with fewer than 100 parameters, and adding more is of little-to-no benefit.
Many problems are naturally hierarchical such that simple models can be combined to yield a large number of explainable parameters. If done well, this can result in a high-performing solution. Admittedly, this is usually harder than just applying a blackbox.
In critical applications, an explainable model with benign failure modes (even if it has worse overall performance) can be far preferable to a blackbox with wildly unpredictable failure modes. From a utility standpoint, the explainable results are better.
> There's a limit to how much humans can understand. We use machine learning to surpass that limit
We can also work to improve our ability to discover and understand. I think that holds far more promise than improving our ability to do things we don't understand.