I really do like the idea of an MNIST alternative for quickly verifying ideas. However, I have a few nitpicks:
1. 10 classes is far too few to make meaningful estimates of how well a model will do on a "proper" image dataset such as COCO or ImageNet. Even a vastly more complicated dataset like CIFAR-10 does not hold up.
2. I feel like CIFAR-100 is already widely used as the kind of dataset you envision MNIST-1D to be. Personally, I found that some training methods work very well on CIFAR-100 but not so well on ImageNet, so TinyImageNet is now my go-to "verify new ideas" dataset.
Genuine question: are there real-world image recognition tasks that require training on more than, say, 10 or even 100 classes? I'm personally aware of only one that might come close, and that's because it's an image-based species detection module whose whole purpose is to recognize a large number of very specific subgroups. Most of the others I can think of have maybe a couple dozen classes, and sometimes as few as 4 or 5, and in those cases the accuracy within the classes matters far more than the sheer number of possibilities.
I guess I'm just asking if COCO or ImageNet-trained networks are actually noticeably superior for most real-world tasks, or if it's just a metric that's used because the performance differences only show up in the long tail of the distribution.
> I guess I'm just asking if COCO or ImageNet-trained networks are actually noticeably superior for most real-world tasks, or if it's just a metric that's used because the performance differences only show up in the long tail of the distribution.
Given that for any real-world vision task you start from a model pretrained on those datasets, they will in fact be noticeably superior on the real-world task after finetuning, simply because the quality of the features extracted by the backbone is better.
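For anyone unfamiliar with the workflow being described: a minimal sketch of that pretrain-then-finetune pattern, assuming PyTorch/torchvision and its current weights API (the backbone, class count, and learning rate here are placeholders, not anything from the original comments):

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a backbone pretrained on ImageNet; the claim above is that these
# features transfer well to most real-world vision tasks.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Optionally freeze the backbone so only the new head trains at first.
for param in model.parameters():
    param.requires_grad = False

# Swap the 1000-class ImageNet head for one matching the downstream task,
# e.g. a hypothetical 5-class problem like those mentioned above.
num_classes = 5
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Finetune just the new head to start; unfreeze the backbone later
# (with a lower learning rate) for full finetuning if needed.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

The point is that even a 5-class downstream task inherits the feature quality of the 1000-class pretraining, which is why the large-dataset benchmarks still matter for small-class real-world problems.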