
> assuming a 5ghz processor, that gives you 500 cycles per image if you do ten million a second

Modern CPUs don't quite work this way. Many instructions can be retired per clock cycle.

> Second of all L1 cache is at most in the hundreds of kilobytes, so the faces aren't in L1 but must be retrieved from elsewhere...??

Yea, from L2 cache. It's caches all the way down. That's how we make it go really fast. The prefetcher can make this look like magic if the access patterns are predictable (linear).



The keyword is CAN. There can also be huge penalties (random main-memory accesses typically cost over a hundred cycles). The parent was probably considering a regular image transform/comparison, and 20 pixels per cycle, even for low-resolution 100x100 images, is way above what we do today.
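For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch using only the numbers quoted in this thread (5 GHz, ten million images a second, 100x100 images). The function names are made up for illustration, and it ignores that a core can retire several instructions per cycle.

```python
# Back-of-the-envelope budget using the numbers quoted in this thread.
CLOCK_HZ = 5_000_000_000        # 5 GHz, from the quoted comment
IMAGES_PER_SECOND = 10_000_000  # ten million a second, from the quoted comment
WIDTH, HEIGHT = 100, 100        # low-resolution image, from the comment above


def cycles_per_image(clock_hz, images_per_second):
    """Clock cycles available per image on a single core."""
    return clock_hz // images_per_second


def pixels_per_cycle(width, height, cycles):
    """Pixels that must be handled each cycle to touch every pixel once."""
    return (width * height) / cycles


cycles = cycles_per_image(CLOCK_HZ, IMAGES_PER_SECOND)
print(cycles)                                   # 500
print(pixels_per_cycle(WIDTH, HEIGHT, cycles))  # 20.0
```

So a full per-pixel comparison would need to sustain about 20 pixels every cycle, before counting any cache misses.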

As others have mentioned, they're probably doing some kind of embedding-like search primarily, and then 500 cycles per face makes more sense, but it's not a full comparison.
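To make the embedding idea concrete, here is a minimal sketch, not a claim about what they actually run. It assumes each face has already been reduced to a short vector and that matching is a cosine-similarity scan. The 128-dimension figure, the toy vectors, and the function names are all hypothetical.

```python
import math

EMBEDDING_DIM = 128  # hypothetical; the thread does not state a dimension
CYCLE_BUDGET = 500   # cycles per face, from the thread


def dot(a, b):
    """Dot product: one multiply-add per dimension."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length so dot() gives cosine similarity."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def best_match(query, gallery):
    """Index of the gallery embedding most similar to the query."""
    q = normalize(query)
    scores = [dot(q, normalize(g)) for g in gallery]
    return max(range(len(gallery)), key=scores.__getitem__)


# Toy data: three 4-dimensional "faces" standing in for real embeddings.
gallery = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
query = [0.1, 0.9, 0.0, 0.1]
print(best_match(query, gallery))  # 1

# With pre-normalized embeddings, one comparison is a single dot product.
multiply_adds_per_face = EMBEDDING_DIM
print(multiply_adds_per_face <= CYCLE_BUDGET)  # True
```

Under those assumptions a comparison is on the order of a hundred multiply-adds rather than ten thousand pixel operations, which is why the 500-cycle figure looks plausible for a search over embeddings but not for a full image comparison.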



