I think this is way too much for a pure CS person. It's unlikely they will make a big contribution on the math side without being a mathematician first, e.g. an applied mathematician who moved into CS.
For ML, the OP already has linear algebra, which is sufficient. Deep neural networks are trained with backprop, which is basically high-school math (the chain rule; see the sketch below). You could have mentioned ODEs and sensitivity analysis, which I think are more relevant than convex optimization. For NNs we don't even care about identifiability, from either the statistics or the dynamical-systems point of view. NNs blow away SVMs and almost everything else except random forests in some domains. Both of these have the interesting property that, for the most part, nobody understands them except as black boxes. Boosting is another example. It really is stranger than fiction.
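To make the "high-school math" point concrete, here's a minimal sketch of backprop for a one-hidden-layer regression net in NumPy. The data, shapes, and learning rate are all made up for illustration; the point is that every gradient is just the chain rule applied layer by layer.

```python
# Minimal backprop sketch: one hidden layer, MSE loss, plain gradient descent.
# All values here are toy/illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 4))        # 32 samples, 4 features (toy data)
y = rng.standard_normal((32, 1))        # toy regression targets

W1 = rng.standard_normal((4, 8)) * 0.1  # input -> hidden weights
W2 = rng.standard_normal((8, 1)) * 0.1  # hidden -> output weights
lr = 0.1

for step in range(100):
    # Forward pass
    h = np.tanh(X @ W1)                 # hidden activations
    y_hat = h @ W2                      # predictions
    loss = np.mean((y_hat - y) ** 2)    # MSE loss

    # Backward pass: the chain rule, one layer at a time
    d_yhat = 2 * (y_hat - y) / len(y)   # dL/dy_hat
    dW2 = h.T @ d_yhat                  # dL/dW2
    d_h = d_yhat @ W2.T                 # dL/dh
    d_pre = d_h * (1 - h ** 2)          # through tanh: d tanh(u)/du = 1 - tanh(u)^2
    dW1 = X.T @ d_pre                   # dL/dW1

    # Gradient step
    W1 -= lr * dW1
    W2 -= lr * dW2
```

Everything past this point (why it generalizes so well, why SGD finds good optima) is where the math gets hard; the mechanics above are not.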
That being said, I think statistics/probability theory and Bayesian stats/networks are useful for any scientist to know.
I would talk to your advisor: they'll be able to tell you what's important and what to learn and focus on.
It took a while, but there's been a lot of work over the last 5 years or so explaining neural nets' performance, from papers showing PAC learnability for specific architectures (https://arxiv.org/abs/1710.10174), to work arguing that most local optima are close to global optima (http://www.offconvex.org/2016/03/22/saddlepoints/), to work arguing that the optimization error incurred (as distinct from approximation and estimation error) serves as a form of regularization for deep neural networks.
And understanding how these things work helps improve and speed up these methods and models: it's hybrid algorithms that are enabling performance on time-series data and more complex tasks. The future will almost certainly use neural networks as components of many algorithms, but I doubt the full machinery will be simple feed-forward nets of ever-increasing size.