Yeah, luckily, you can unit test these and fix them. They are not concurrency bugs (again, luckily).
BTW, numeric differentiation can only be tested in a very limited way (due to the algorithmic complexity when you're doing big matrices). It is much easier / more effective to test against multiple implementations.
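To make the limitation concrete, here's a minimal sketch of the spot-check approach (the toy function and tolerances are mine, just for illustration): compare an analytic gradient against central finite differences, which costs one pair of forward passes per element and so only scales to small inputs.

    import numpy as np

    def f(x):
        return np.sum(x ** 3)        # toy scalar function

    def analytic_grad(x):
        return 3 * x ** 2            # its known gradient

    def numeric_grad(f, x, eps=1e-6):
        # central differences: two forward passes per element, which is
        # exactly why this doesn't scale to big matrices
        g = np.zeros_like(x)
        for i in range(x.size):
            d = np.zeros_like(x)
            d.flat[i] = eps
            g.flat[i] = (f(x + d) - f(x - d)) / (2 * eps)
        return g

    x = np.random.randn(4, 4)
    assert np.allclose(analytic_grad(x), numeric_grad(f, x), atol=1e-4)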
And it has always felt to me that it has lineage from the neural Turing machine line of work as a prior. The transformative part was to: 1. find a good task (machine translation) and a reasonable way to stack (the encoder-decoder architecture); 2. run the experiment; 3. ditch the external KV store idea and just use self-projected KV.
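A minimal sketch of what "self-projected KV" means in practice (the weight names and shapes are made up for illustration): the keys and values are linear projections of the input sequence itself, rather than reads from an external memory.

    import numpy as np

    def self_attention(x, Wq, Wk, Wv):
        # K and V are projected from x itself -- no external KV store
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(k.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ v

    seq_len, d = 5, 8
    x = np.random.randn(seq_len, d)
    Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
    out = self_attention(x, Wq, Wk, Wv)   # shape: (seq_len, d)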
mmap is a good crutch when you 1. don't have busy-polling / async IO APIs available and want to do some quick & dirty preloading tricks; 2. don't want to manage the complexity of an in-memory cache, especially a cross-process one.
Obviously if you have kernel-backed async IO APIs (io_uring) and are willing to dig into the deeper end (for a better managed cache), you can get better performance than mmap. But in many cases, mmap is "good enough".
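The quick & dirty preloading trick looks roughly like this (a sketch, not a recommendation: the file name is made up, prot= is Unix-only, and madvise needs Python 3.8+ on a platform that supports it):

    import mmap

    with open("weights.bin", "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
        mm.madvise(mmap.MADV_WILLNEED)   # hint the kernel to start readahead
        for off in range(0, len(mm), 4096):
            _ = mm[off]                  # touch one byte per page to fault it in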
Faster compute helps, for things like vision language models that require a bigger context to be filled. My understanding is that the ANE is still optimized for convolution loads and compute efficiency, while the new neural accelerators are optimized for flexibility and performance.
I am not an expert on the ANE, but I think it is related to the size of the register files and how that is smaller than what we need for GEMM on modern transformers (especially these fat ones with MoE).
AIUI the ANE makes use of data in unified memory, not in the register file. So this wouldn't be an inherent limitation. (OTOH, that's why it wastes memory bandwidth for most newer transformer models, which use heavily quantized data - the ANE will have to read padded/unquantized values and the fraction of memory bandwidth that's used for that padding is pure waste.)
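Back-of-the-envelope for that waste, assuming 4-bit quantized weights that have to be read back as padded 16-bit values (the bit widths are illustrative, not a claim about any specific ANE model):

    bits_useful, bits_read = 4, 16
    waste = 1 - bits_useful / bits_read
    print(f"{waste:.0%} of weight bandwidth goes to padding")   # 75%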
Feels like a side-effect of the forever-0.x versioning symptom (which I am guilty of as well). Even though semver says 0.x can do whatever, people don't associate enough disruptive change with it; if 0.3.x had been 1.x, it would be much clearer that this 0.4.x is really a 2.x release.
All things considered, this is probably just a tiny footnote in this software's life.
And 300ms for a DB call is slow, in any case. We really shouldn't accept that as a normal cost of doing business. 300ms is only acceptable if we are doing scrypt-type work.
In some cases. Are you looking up a single indexed row in a small K-V table? Yep, that's slow. Are you generating reports on the last 6 years of sales, grouped by division within larger companies? That might be pretty fast.
I'm not sure why you'd even generalize that so broadly.
That's all true, so long as you completely ignore doing any processing on the data, like evaluating the rows and selectively appending some of them into a data structure, then sorting and serializing the results. Let alone optimizing the query plan for the state of the system at that moment, deciding whether it makes more sense to hit the indexes or just slurp in the whole table given that N other queries are also executing right now, mapping a series of IO requests to their exact addresses on the underlying disks, and performing the parity checks as you read the data off the RAID and combine it into a single, coherent stream of non-block-aligned tuples.
There's a metric boatload of abstractions between sending a UTF-8 query string over a packet-switched network and receiving back a list of results. 300ms suddenly starts looking like a smaller window than it first appears.
There is nothing more for us to take away from this discussion, so let me be the first to tone it down. All I want to say is: don't take that 300ms as a given. It sits in an uncomfortable region: too short to treat as an async op, too long to go unnoticed (anything between 50ms and 2s fits this bill). Most likely the query is doing something suspicious, and it would benefit the most from a closer look.
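As a concrete starting point, a minimal sketch of the "take a closer look" step (the 300ms threshold and the sqlite usage are illustrative): wrap execution with a timer and log anything past the threshold, so suspicious queries surface instead of quietly becoming the cost of doing business.

    import sqlite3, time

    SLOW_MS = 300   # the arbitrary line in the sand under discussion

    def timed_query(conn, sql, params=()):
        start = time.perf_counter()
        rows = conn.execute(sql, params).fetchall()
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms > SLOW_MS:
            print(f"slow query ({elapsed_ms:.0f}ms): {sql}")
        return rows

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    rows = timed_query(conn, "SELECT v FROM kv WHERE k = ?", ("answer",))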
I was totally with you until that last sentence, then you lost me again.
Saying a DB query is too long by giving an arbitrary number is like saying a rope is too long. That’s solely dependent on what you’re doing with it. It’s literally impossible to say that X is too long unless you know what it’s used for.
They had been an acquisition target since 2017 (per the OpenAI internal emails). So the lack of an acquisition is not from a lack of interest. It makes you wonder what happened in those due-diligence processes.
However, personalization (teleporting yourself into a video scene) is boring to me. At its core, it doesn't generate a new experience for me. My experience is not defined by the photos / videos I took on a trip.