
There are a lot of points not covered. For example:

- async/await runs in the context of one thread, so there is no need for locks or synchronization. Unless one runs async/await across multiple threads to actually utilize CPU cores - then locks and synchronization are necessary again. This complexity may be hidden in some external code. For example, instead of synchronizing access to a single database connection, it is much easier to open one database connection per async task. However, such an approach may hurt performance, especially with sqlite and postgres.

- error propagation in async/await is not obvious, especially when one tries to group async tasks. Happy Eyeballs is the classic example.

- since network I/O was mentioned, backpressure should also be mentioned. The CPython implementation of async/await notoriously lacks network backpressure, which causes real problems; for contrast, the sketch below shows what explicit backpressure handling looks like.
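A minimal Node sketch of the idea (copyFile is hypothetical; the write()/'drain' protocol is the documented stream API):

    const fs = require('fs');

    // write() returns false once the destination's buffer is full;
    // awaiting 'drain' pauses the producer until the consumer catches
    // up - that is backpressure propagating through the pipeline.
    async function copyFile(src, dst) {
      const out = fs.createWriteStream(dst);
      for await (const chunk of fs.createReadStream(src)) {
        if (!out.write(chunk)) {
          await new Promise(resolve => out.once('drain', resolve));
        }
      }
      out.end();
    }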



I have lots of issues with async/await, but this is my primary beef:

Remember the Gang of Four book "Design Patterns"? It was basically a cookbook on how to work around the deficiencies of (mostly) C++. Yet everybody applied those patterns inside languages that didn't have those deficiencies.

Rust can run multiple threads just fine--it's not Javascript. As such, it didn't have to use async/await. It could have tried any of a bunch of different solutions. Rust is a systems language, after all.

However, async/await was necessary in order to shove Rust down the throats of the Javascript programmers who didn't know anything else. Quoting without.boats:

https://without.boats/blog/why-async-rust/

> I drove at async/await with the diligent fervor of the assumption that Rust’s survival depended on this feature.

Whether async/await was even a good fit for Rust technically was of no consequence. Javascript programmers were used to async/await so Rust was going to have async/await so Rust could be jammed down the throats of the Javascript network services programmers--technical consequences be damned.


Async/await was invented for C#, another multithreaded language. It was not designed to work around a lack of true parallelism. It is instead designed to make it easier to interact with async IO without having to resort to manually managed thread pools. It basically codifies at the language level a very common pattern for writing concurrent code.

It is true though that async/await has a significant advantage over fibers related to single-threaded code: it makes it very easy to add good concurrency support on a single thread, especially in languages which support both. In C#, it was particularly useful for executing concurrent operations from the single GUI thread of WPF or WinForms, or from parts of the app which interact with COM. This works through the SynchronizationContext, which schedules continuations back onto the current thread, so it's safe to run GUI updates or COM interactions from a Task while also using any other async/await code, since awaited continuations capture the current context by default.


Yeah, Microsoft looked at callback hell, realized that they had seen this one before, dipped into the design docs for F# and lifted out the syntactic sugar of monads. And it worked fine. But really, async/await is literally callbacks. The keyword await just wraps the rest of the function in a lambda and stuffs it in a callback. It's fully just syntactic sugar. It's a great way of simplifying how callback hell is written, but it's still callback hell in the end. Where having everything run in callbacks makes sense, it makes sense. Where it doesn't, it doesn't. At some point you will start using threads, because your use case calls for threads instead of callbacks.
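To make that concrete, here's roughly the desugaring (a sketch; fetchUser/fetchPosts/render are hypothetical placeholders):

    // hypothetical placeholders so the sketch runs
    const fetchUser = () => Promise.resolve({ id: 1 });
    const fetchPosts = user => Promise.resolve(['post of ' + user.id]);
    const render = posts => console.log(posts);

    // What you write:
    async function handler() {
      const user = await fetchUser();
      const posts = await fetchPosts(user);
      render(posts);
    }

    // Roughly what it means: everything after each await is a
    // continuation, i.e. a callback attached to the promise.
    function handlerDesugared() {
      return fetchUser().then(user =>
        fetchPosts(user).then(posts => {
          render(posts);
        }));
    }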


Most compilers don't just wrap the rest of the function into a lambda but build a finite state machine, with each await point being a state transition. It's a little bit more than just "syntactic sugar" for "callbacks". In most compilers it is most directly like the "generator" approach to building iterators (function*/yield is ancient async/await).
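For illustration, a hand-rolled driver in the old style (a sketch of what libraries like co did before async/await existed):

    // Drive a generator: each yield is an await point / state transition.
    function run(genFn) {
      const gen = genFn();
      return new Promise((resolve, reject) => {
        function step(method, arg) {
          let result;
          try {
            result = gen[method](arg); // advance the state machine
          } catch (err) {
            return reject(err);
          }
          if (result.done) return resolve(result.value);
          Promise.resolve(result.value).then(
            value => step('next', value), // resume on success
            err => step('throw', err));   // resume by throwing into the generator
        }
        step('next', undefined);
      });
    }

    const sleep = ms => new Promise(r => setTimeout(r, ms));

    // Reads like async/await, but it's function*/yield:
    run(function* () {
      yield sleep(100);
      console.log('resumed after first yield');
      yield sleep(100);
      console.log('resumed after second yield');
    });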

I think the iterator pattern in general is a really useful reference to keep in mind. Of course async/await doesn't replace threads, just like iterators don't replace lists/arrays. There are some algorithms you can more efficiently write as iterators, and some you can more efficiently write as direct list/array manipulation, avoiding the overhead of spinning up iterator state machines. Iterator methods are generally deeply composable, while direct list/array manipulation requires a lot more coordination to compose. All of those things work together to build the whole data pipeline you need for your app.

So too, async/await makes it really easy to write some algorithms in a complex concurrent environment. That async/await still runs on threads and alongside threads; it doesn't eliminate all thinking about threads. async/await is generally deeply composable, while direct thread manipulation needs more work to coordinate. In large systems you probably still need to think both about how you compose your async/await "pipelines" and about how your threads are coordinated.

The benefits of composition (race, await-all, schedulers, and more) are generally worth the extra complexity and overhead (mental and computational, in space and time), which is why the pattern has become so common so quickly. Just like you can win big with nicely composed stacks of iterator functions. (Or RegEx, or Observables, or any of the many other cases where designing complex state machines both complicates how the system works and ultimately simplifies the developer experience through added composability.)


Eh, that's true, and that's a convenient way of doing intermediate representation, since it's very machine-friendly. But really, finite state machines are just callbacks, just as generators can be treated as just callbacks. There is no real logical difference, and that's their historical origin - even for generators, which are just a neat syntax for what could have been done back in the day with a more explicit OO solution.

It does provide a more conceptual way of thinking about what those old callbacks would have meant, though, which opens up thinking about scheduling them. Still, it's not something I'd rather do: if I need an asynchronous iterator I'll write one, but if I need to start scheduling tasks then I'm using threads and leaving it to someone smarter than me.


I generally don't agree with the direction withoutboats went with asynchronicity, but you are reading a whole lot more into that sentence than is really there. It is very clear (based on his writing, in this and other articles) that he went with the solution because he thinks it is the right one, on a technical level.

I don't agree, but making it sound like it was about marketing the language to JavaScript people is just wrong.


> was about marketing the language to JavaScript people is just wrong.

No, it seems very right to me. Rust, despite being a "systems language", was not satisfied with the market size of systems programming, and it really needed all those millions of JS programmers to make the language a big success.


This is a lie. Async/await was developed to support systems that need to use non-blocking IO for performance reasons, not to appeal to JS programmers.


Threads have a cost. Context switching between them at the kernel level has a cost. There are some workloads that gain performance by multiplexing requests on a thread. Java virtual threads, golang goroutines, and dotnet async/await (which is multi threaded like Rust+tokio) all moved this way for _performance_ reasons not for ergonomic or political ones.

It's also worth pointing out that async/await was not originally a JavaScript thing. It's in many languages now but was first introduced in C#. So by your logic Rust introduced it so it could be "jammed down the throats" of all the dotnet devs..


> all moved this way for _performance_ reasons

They did NOT.

Async performance is quite often (I would even go so far as to say "generally") worse than single threaded performance in both latency AND throughput under most loads that programmers ever see.

Most of the complications of async are much like C#:

1) Async allows a more ergonomic way to deal with a prima donna GUI that must be the main thread and that you must not block. This has nothing to do with "performance"--it is a limitation of the GUI toolkit/Javascript VM/etc..

2) Async adds unavoidable latency overhead and everybody hits this issue.

3) Async nominally allows throughput scaling. Most programmers never gain enough throughput to offset the lost latency performance.


1) it offers a more ergonomic way to do concurrency in general. `await Task.WhenAll(tasks);` is (in my opinion) more ergonomic than spinning up a thread pool in any language that supports both.

2) yes, there is a small performance overhead for continuations. Everything is a tradeoff. Nobody is advocating for using async/await for HFT, or in low-level languages like C or Zig. We're talking nanoseconds here; for a typical web API request in the tens of milliseconds, that's a drop in the ocean.

3) I wouldn't say it's nominal! I'd argue most non-trivial web workloads would benefit from this increase in throughput. Pre-fork webservers like gunicorn can consume considerably more resources to serve the same traffic than an async stack such as uvicorn+FastAPI (to use Python as an example).

> Most of the complications of async are much like C#

Not sure where you're going with this analogy but as someone who's written back-end web services in basically every language (other than lisp, no hate though), C#/dotnet core is a pretty great stack. If you haven't tried it in a while you should give it a shot.


Eh. Async, and to a lesser extent green threads, are the only solutions to slowloris HTTP attacks. I suppose your other option is to use a thread pool in your server - but then you need to hide your web server behind nginx to keep it safe. (And nginx is safe because it internally uses async IO.)

Async is also usually wildly faster for networked services than blocking IO + thread pools. Look at some of the winners of the techempower benchmarks. All of the top results use some form of non-blocking IO. (Though a few honourable mentions use Go - presumably with a green thread per request):

https://www.techempower.com/benchmarks/

I’ve also never seen Python or Ruby get anywhere near the performance of nodejs (or C#) as a web server. A lot of the difference is probably how well tuned v8 and .net are, but I’m sure the async-everywhere nature of javascript makes a huge difference.


Async's perfect use case is proxies, though: get a request, go through a small decision tree, dispatch the I/O to the kernel. You don't want proxies doing complex logic or computation - that's the stuff that creates bottlenecks in cooperative multitasking.


Most APIs (REST, GraphQL or otherwise) are effectively proxies. Like you say, if you don't have complex logic and you're essentially mapping an HTTP request to a query, then your API code is just juggling incoming and outgoing responses, and this evented/cooperative approach is very effective.
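To make the shape concrete, a hypothetical sketch (routeToQuery and db are stand-ins, not a real driver):

    const http = require('http');

    // Stand-ins for a router and a DB client
    const routeToQuery = url => `select * from items where path = ${JSON.stringify(url)}`;
    const db = { query: async sql => [{ sql }] }; // pretend driver

    // The whole "API as proxy" shape: parse, dispatch I/O, relay.
    // No heavy computation, so one event loop can juggle many requests.
    http.createServer(async (req, res) => {
      const rows = await db.query(routeToQuery(req.url));
      res.setHeader('content-type', 'application/json');
      res.end(JSON.stringify(rows));
    }).listen(8080);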


Where does the unavoidable latency overhead come from?

Do you have some benchmarks available?


The comment you are responding to is not wrong about higher async overhead, but it is wrong about everything else, either out of lack of experience with the language or out of confusion about what it is that Task<T> and ValueTask<T> solve.

All asynchronous methods (as in, the ones with the async keyword prefixed to them) are turned into state machines, where the method's variables that live across an await need to be lifted into a state machine struct, which then often (but not always) needs to be boxed, i.e. heap-allocated. All this makes the cost of what would otherwise have been just a couple of method calls much more significant - a single await like this can cost 50ns vs the 2ns spent on plain method calls.

There is also the matter of heap allocations for state machine boxes - C# is generally good at avoiding them for (value)tasks that complete synchronously, and at pooling them on hot async paths that complete asynchronously, but badly written code can incur unwanted overhead by spamming async methods with await points where it could have just forwarded a task instead. Years of bad practices arising from low-skill enterprise dev fields do not help either, with only the switch to OSS and a more recent culture shift, aided by better out-of-box analyzers, somewhat turning the tide.

This, however, does not stop C#'s task system from being extremely useful for the lowest-ceremony concurrency across all programming languages (yes, it is less effort than whatever Go or Elixir zealots would have you believe), where you can interleave, compose and aggregate task-returning methods to trivially parallelize/fork/join parts of existing logic, leading to massive productivity improvements. Want to fire off a request and do something else? Call .GetStringAsync but don't await it, and come back to it later with await when you do need the result - the request will likely be done by then. Instant parallelism.
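The same trick in JS terms (a rough sketch; fetchString is a stand-in for .GetStringAsync):

    const sleep = ms => new Promise(r => setTimeout(r, ms));
    const fetchString = async url => { await sleep(100); return `body of ${url}`; }; // stand-in

    async function main() {
      const pending = fetchString('https://example.com/data'); // fire it off, no await
      await sleep(150);           // do something else while the request is in flight
      const body = await pending; // come back for the result - likely done already
      console.log(body);
    }
    main();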

With that said, Rust's approach to futures and async is a bit different: whereas in C# each async method is its own task, in Rust the entire call graph is a single task with many nested futures, where the size of the sum of all stack frames is known statically. Hence you can't perform recursive calls within async there - you can only create a new (usually heap-allocated) future, which gives you what effectively looks like a linked list of task nodes, since there is no infinite recursion in calculating their sizes. This generally has lower overhead and works extremely well even in no-std no-alloc scenarios where cooperative multi-tasking is realized through a single bare-metal executor, which is a massive user experience upgrade in embedded land. .NET OTOH is working on its own project to massively reduce async overhead, and once the finished experiment sees integration in dotnet/runtime itself, you can expect more posts on this orange site about it.


> .NET OTOH is working on its own project to massively reduce async overhead

Where can I read more about that?


Initial experiment issue: https://github.com/dotnet/runtime/issues/94620

Experiment results write-up: https://github.com/dotnet/runtimelab/blob/e69dda51c7d796b812...

TLDR: The green threads experiment was a failure: it found the (expected and obvious) issues that Java applications are now getting to enjoy, joining their Go colleagues, while also requiring breaking changes and offering few advantages over the existing model. It did, however, inspire a subsequent re-examination of the current async/await implementation and whether it can be improved by moving state machine generation and execution away from IL entirely and into the runtime. That was a massive success, as evidenced by the preliminary overhead estimations in the results.


The tl;dr that I got when I read these a few months ago was that C# relies too much on FFI, which makes implementing green threads hard, and on top of that it would require a huge effort to rewrite a lot of stuff to fit the green-thread model. Java and Go don't have these challenges, since Go shipped with a huge standard library and Java's ecosystem is all written in Java, since it never had good FFI until recently.


Surely you're not claiming that .NET's standard library is not extensive and not written in C#.

If you do, consider giving .NET a try and reading the linked content if you're interested - it might sway your opinion towards more positive outlook :)


> Surely you're not claiming that .NET's standard library is not extensive and not written in C#.

I’m claiming that MSFT seems to care a lot about P/Invoke and FFI performance, and that it was one of the leading reasons for them not to choose green threads. So there has to be something in .NET or C# or WinForms or whatever that is influencing the decision.

I’m also claiming that this isn’t a concern for Java: 99.9% of the time you don’t go over FFI, and that’s what led the OpenJDK team to choose virtual threads.

> If you do, consider giving .NET a try

I’d love to, but dealing with async/await is a pain :)


You’ve never used it, so how can you know?


How do you know I've never used it? Do you have a crystal ball?


> So by your logic Rust introduced it so it could be "jammed down the throats" of all the dotnet devs..

You're missing his point. His point is that the most popular language, the one with the largest number of programmers, forced the hand of the Rust devs.

His point is not that JS was the first language to have this feature; it's that the most programmers used this feature, and that was because the most popular programming language had it.


That Rust needed async/await to be palatable to JS devs would only be a problem if we think async/await is not needed in Rust, because it is only useful to work around limitations of JS (single-threaded execution, in this case). If instead async/await is a good feature in its own right (even if not critical), then JS forcing Rust's hand would be at best an annoyance.

And the idea that async/await was only added to JS to work around its limitations is simply wrong. So the OP is overall wrong: async/await is not an example of someone taking something that only makes sense in one language and using it another language for familiarity.


> So the OP is overall wrong: async/await is not an example of someone taking something that only makes sense in one language and using it another language for familiarity.

I don't really understand the counter argument here.

My reading of the argument[1] is that "popularity amongst developers forced the Rust devs' hands in adding async". If that is the argument, then a counter-argument of "it never (or only) made sense in the popular language (either)" is a non sequitur.

IOW, if it wasn't added for technical reasons (which is the original argument, IIRC), then explaining the technical reasons for/against isn't a counter-argument.

[1] i.e. Maybe I am reading it wrong?


You are not reading the claim wrong, but the claim is a lie. We did not add async/await to Rust because it was popular but because it was the right solution for Rust. If you actually read my post that this liar linked to, you will find a detailed explanation of the technical history behind the decision.


You are not reading it wrong, and your statements are accurate.

My broader point is that the possibility of there being a "technically better" construct was simply not in scope for Rust. In order for Rust to capture Javascript programmers, async/await was the only construct that could possibly be considered.

And, to be fair, it worked. Rust's growth has been almost completely on the back of network services programming.


This comment is a lie.


That is his claim, but he is lying.


I would damn this, if Async/Await wasn't a good enough (TM) solution for certain problems where Threads are NOT good enough.

Remember: there is a reason why Async/Await was created B E F O R E JavaScript was used for more than sprinkling a few fancy effects on some otherwise static webpages


Strong disagree.

> Rust can run multiple threads just fine--it's not Javascript. As such, it didn't have to use async/await. It could have tried any of a bunch of different solutions. Rust is a systems language, after all.

it allows you to have semantic concurrency where there are no threads available. Like, you know, on microcontrollers without an (RT)OS, where such a systems programming language is a godsend.

seriously, using async/await on embedded makes so much sense.


> Rust can run multiple threads just fine

Rust is also used in environments which don't support threads. Embedded, bare metal, etc.


async/await is just a different concurrency paradigm with different strengths and weaknesses than threads. Rust has support for threaded concurrency as well though the ecosystem for it is a lot less mature.


Every word you've written is false, slanderous and idiotic. You are quoting a post in which I explain at length why async/await was the right fit for Rust technically. You are either illiterate or malignant.

Despite your evident ignorance, there are many network services that are not written in JavaScript. In fact, there are many that are written in C or C++. This is the addressable market of async Rust. Appealing to JavaScript users was not in any way a motivating factor for the development of async/await in Rust. Not at all!


Threads are much much slower than async/await.


Async/await, just like threads, is a concurrency mechanism, and it also always requires locks when accessing shared memory. Where does your statement come from?


If you perform single threaded async in Rust, you can drop down to the cheap single threaded RefCell rather than the expensive multithreaded Mutex/RwLock


That's one example of a lock you might eliminate, but there are plenty of other cases where it's impossible to eliminate locks even while single threaded.

Consider, for example, something like this (not real rust, I'm rusty there - think an async-aware lock such as tokio's Mutex)

    let guard = state.lock().await;
    let a = foo();
    let b = io(a).await;
    let c = bar(b);
    drop(guard);
Eliminating this lock is unsafe because a, b, and c are expected to be updated in tandem. If you remove the lock, then by the time you reach c, the state behind a and b may have changed under your feet in an unexpected way because of that await.


Yeah but this problem goes away entirely if you just don’t await within a critical region like that.

I’ve been using nodejs for a decade or so now. Nodejs can also suffer from exactly this problem. In all that time, I think I’ve only reached for a JS locking primitive once.


There is no problem here with the critical region. The problem would be removing the critical region because "there's just one thread".

This is incorrect code

      a = foo();
      b = io(a).await;
      c = bar(b);
Without the lock, `a` can mutate before `b` is done executing which can mess with whether or not `c` is correct. The problem is if you have 2 independent variables that need to be updated in tandem.

Where this might show up. Imagine you have 2 elements on the screen, a span which indicates the contents and a div with the contents.

If your code looks like this

    mySpan.innerText = `Loading ${foo}`;
    myDiv.innerText = await load(foo);
    mySpan.innerText = "";
You now have incorrect code if 2 concurrent loads happen. It could be the original foo, it could be a second foo. There's no way to correctly determine what the content of `myDiv` is from an end user perspective as it depends entirely on what finished last and when. You don't even know if loading is still happening.


I absolutely agree that that code looks buggy. Of course it is - if you just blindly mix view and model logic like that, you’re going to have a bad day. How many different states can the system be in? If multiple concurrent loads can be in progress at the same time, the answer is lots.

But personally I wouldn’t solve it with a lock. I’d solve it by making the state machine more explicit and giving it a little bit of distance from the view logic. If you don’t want multiple loads to happen at once, add an is_loading variable or something to track the loading state. When in the loading state, ignore subsequent load operations.
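Something like this, reusing mySpan/myDiv/load from the example above (a sketch):

    let isLoading = false; // explicit loading state; effectively a try-lock

    async function loadAndShow(foo) {
      if (isLoading) return; // ignore subsequent load operations while loading
      isLoading = true;
      mySpan.innerText = `Loading ${foo}`;
      try {
        myDiv.innerText = await load(foo);
      } finally {
        mySpan.innerText = '';
        isLoading = false;
      }
    }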


> add an is_loading variable or something to track the loading state.

Which is definitionally a mutex, AKA a lock. However, it's not a lock you block on, but rather one that you try and then leave.

I know it doesn't look like a traditional lock, but in a language like javascript or python it's a valid locking mechanism. For javascript that's because, under the single-threaded execution model, a boolean variable is guaranteed to be consistently set across multiple concurrent actions.

That is to say, you are thinking about concurrency issues, you just aren't thinking about them in concurrency terms.

Here's the Java equivalent to that concept

https://docs.oracle.com/javase/8/docs/api/java/util/concurre...


Yeah I agree. The one time I wrote a lock in javascript it worked like you were hinting at: you could await the lock's release, and if multiple bits of code were all waiting for the lock, they would acquire it in turn.

But again, I really think in UI code it makes a lot more sense to be clear about what the state is, model it explicitly and make the view a "pure" expression of that state. In the code above:

- The state is 0 or more promises loading data.

- The state is implicit. Ie, the code doesn't list the set of loading promises which are being awaited at any point in time. It's not obvious that there is a collection going on.

- The state is probably wrong. The developer probably wants either 0 or 1 loading states. (Or maybe a queue of them.) Because the state hasn't been modelled explicitly, it probably hasn't been considered enough.

- The view is updated incorrectly based on the state. If 2 loads happen at the same time, then 1 finishes, the UI removes the "loading..." indicator from the UI. Correct view logic should ensure that the UI is deterministic based on the internal state. 1 in-progress load should result in the UI saying "loading...".

It's a great example. With code like this I think you should always carefully and explicitly consider all of the states of your system, and how the state should change based on user action. Then all UI code can flow naturally from that.

A lock might be a good tool. But without thinking about how you want the program to behave, we have no way to tell. And once you know how you want your program to behave, I find locks to be usually unnecessary.


I think a lot of this type of problem goes away with immutable data and being more careful with side effects (for example, firing them all at once at the end rather than dispersed through the calculation)


> Where does your statement come from?

This is how async/await works in Node (which is single-threaded) so most developers think this is how it works in every technology.


Even in Node, if you perform asynchronous operations on a shared resource, you need synchronization mechanisms to prevent interleaving of async functions.

There has been more than one occasion when I "fixed" a system in NodeJS just by wrapping some complex async function up in a mutex.


This lacks quite a bit of nuance. In Node you are guaranteed that synchronous code between two awaits will run to completion before another task (one that could access your state) from the event loop gets a turn; with multi-threaded concurrency you could be preempted between any two machine instructions. So while you _do_ have to serialize access to shared IO resources, you do _not_ have to serialize access to memory (just add the connection to the hashset, no locks).

What you usually see with JS for concurrency of shared IO resources in practice is that they are "owned" by the closure of a flow of async execution and rarely available to other flows. This architecture often obviates the need to lock the shared resource at all, as the natural serialization orchestrated by the string of state machines already accomplishes it. This pattern was quite common even in the CPS style before async/await.

For example, one of the first things an app needs to do before talking to a DB is to get a connection, which is often retrieved by pulling from a pool; acquiring the reservation requires no lock, and by virtue of the connection being exclusively closed over in the async query code, it also needs no locking. When the query is done, the connection can be returned to the pool sans locking.
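A rough sketch of that flow (openConnection is a stand-in, not a real driver):

    const pool = []; // idle connections; one event loop, so plain array ops need no lock

    async function withConnection(fn) {
      // Acquire: a plain pop - the connection is now exclusively
      // owned by this async flow's closure.
      const conn = pool.pop() ?? await openConnection();
      try {
        return await fn(conn); // only this flow touches conn in here
      } finally {
        pool.push(conn); // release: a plain push, again no lock
      }
    }

    const openConnection = async () => ({ query: async sql => `rows for: ${sql}` }); // stand-in

    // usage
    withConnection(conn => conn.query('select 1')).then(console.log);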

The place where I found synchronization most useful was in acquiring resources that are unavailable. Interestingly, an async flow waiting on a signal for a shared resource resembles a channel in golang in how it shifts the state and execution to the other flow when a pooled resource is available.

All this to say: yeah, I'm one of the huge fans of node that finds rust's take on default concurrency painfully overcomplicated. I really wish there was an event-loop async/await that could eschew most of the Send, Sync, lifetime insanity. While I am very comfortable with locks-required multithreaded concurrency as well, I honestly find little use for it and would much prefer to scale by process than by thread, to preserve the simplicity of single-threaded IO-bound concurrency.


> So while you _do_ have to serialize access to shared IO resources, you do _not_ have to serialize access to memory(just add the connection to the hashset, no locks).

No, this can still be required. Nothing stops a developer from setting up a partially completed data structure and then suspending in the middle, allowing arbitrary re-entrancy that will then see the half-finished change exposed on the heap.

This sort of bug is especially nasty exactly because developers often think it can't happen and don't plan ahead for it. Then one day someone comes along and decides they need to make an async call in the middle of code that was previously entirely synchronous, adds it, and suddenly you've lost data integrity guarantees without realizing it. Race conditions appear, and devs don't understand them because they've been taught that races can't happen without threads!
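A tiny example of the failure mode (the sleep stands in for that newly added async call):

    const sleep = ms => new Promise(r => setTimeout(r, ms));
    const users = new Map();

    async function addUser(id) {
      users.set(id, { id });      // step 1 of a two-step update
      await sleep(100);           // the await someone added later
      users.get(id).ready = true; // step 2
    }

    // Anything that runs during the await sees the half-finished record:
    addUser(1);
    console.log(users.get(1)); // { id: 1 } - no `ready` flag yet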


> So while you _do_ have to serialize access to shared IO resources, you do _not_ have to serialize access to memory

Yes, in Node you don't get the usual data races like in C++, but data-structure races can be just as dangerous. E.g. modifying the same array/object from two interleaved async functions was a common source of bugs in the systems I've referred to.

Of course, you can always rely on your code being synchronous and thus not needing a lock, but if you're doing anything asynchronous and you want a guarantee that your data will not be mutated from another async function, you need a lock, just like in ordinary threads.

One thing I deeply dislike about Node is how it convinces programmers that async/await is special, different from threading, and doesn't need any synchronisation mechanisms because of some Node-specific implementation details. This is fundamentally wrong and teaches wrong practices when it comes to concurrency.


But single-threaded async/await _is_ special and different from multi-threaded concurrency. Placing it in the same basket and prescribing the same method of use is fundamentally wrong and fails to teach the magic of idiomatic lock free async javascript.

I'm honestly having a difficult time creating a steel-man JS sample that exhibits data races, unless I write weird C-like constructs and ignore closures and async flows, passing and mutating multi-element variables by reference deep into the call stack. This just isn't how JS is written.

When you think about async/await in terms of shepherding data flows it becomes pretty easy to do lock free async/await with guaranteed serialization sans locks.


> I'm honestly having a difficult time creating a steel man js sample that exhibits data races

I can give you a real-life example I've encountered:

    const CACHE_EXPIRY = 1000; // Cache expiry time in milliseconds

    let cache = {}; // Shared cache object

    function getFromCache(key) {
      const cachedData = cache[key];
      if (cachedData && Date.now() - cachedData.timestamp < CACHE_EXPIRY) {
        return cachedData.data;
      }
      return null; // Cache entry expired or not found
    }

    function updateCache(key, data) {
      cache[key] = {
        data,
        timestamp: Date.now(),
      };
    }

    var mockFetchCount = 0;

    // simulate web request shorter than cache time
    async function mockFetch(url) {
      await new Promise(resolve => setTimeout(resolve, 100));
      mockFetchCount += 1;
      return `result from ${url}`;
    }

    async function fetchDataAndUpdateCache(key) {
      const cachedData = getFromCache(key);
      if (cachedData) {
        return cachedData;
      }

      // Simulate fetching data from an external source
      const newData = await mockFetch(`https://example.com/data/${key}`); // Placeholder fetch

      updateCache(key, newData);
      return newData;
    }

    // Race condition:
    (async () => {
      const key = 'myData';

      // Fetch data twice in a sequence - OK
      await fetchDataAndUpdateCache(key);
      await fetchDataAndUpdateCache(key);
      console.log('mockFetchCount should be 1:', mockFetchCount);

      // Reset counter and wait cache expiry
      mockFetchCount = 0;
      await new Promise(resolve => setTimeout(resolve, CACHE_EXPIRY));

      // Fetch data twice concurrently - we executed fetch twice!
      await Promise.all([fetchDataAndUpdateCache(key), fetchDataAndUpdateCache(key)]);
      console.log('mockFetchCount should be 1:', mockFetchCount);
    })();

This is what happens when you convince programmers that concurrency is not a problem in JavaScript. Even though this cache works for sequential fetching and will pass trivial testing, as soon as you have concurrent fetching the program will execute multiple fetches in parallel. If the server implements some rate-limiting, or is simply not capable of handling too many parallel connections, you're going to have a really bad time.

Now, out of curiosity, how would you implement this kind of cache in idiomatic, lock-free javascript?


> how would you implement this kind of cache in idiomatic, lock-free javascript?

The simplest way is to cache the Promise<data> instead of waiting until you have the data:

    -async function fetchDataAndUpdateCache(key: string) {
    +function fetchDataAndUpdateCache(key: string) {
       const cachedData = getFromCache(key);
       if (cachedData) {
         return cachedData;
       }

       // Simulate fetching data from an external source
     -const newData = await mockFetch(`https://example.com/data/${key}`); // Placeholder fetch
     +const newData = mockFetch(`https://example.com/data/${key}`); // Placeholder fetch

       updateCache(key, newData);
       return newData;
     }
From this the correct behavior flows naturally; the API of fetchDataAndUpdateCache() is exactly the same (it still returns a Promise<result>), but it’s not itself async so you can tell at a glance that its internal operation is atomic. (This does mildly change the behavior in that the expiry is now from the start of the request instead of the end; if this is critical to you you can put some code in `updateCache()` like `data.then(() => cache[key].timestamp = Date.now()).catch(() => delete cache[key])` or whatever the exact behavior you want is.)

I‘m not even sure what it would mean to “add a lock” to this code; I guess you could add another map of promises that you’ll resolve when the data is fetched and await on those before updating the cache, but unless you’re really exposing the guts of the cache to your callers that’d achieve exactly the same effect but with a lot more code.


Ok, that's pretty neat. Using Promises themselves in the cache instead of values to share the source of data itself.

While that approach has a limitation that you cannot read the data from inside the fetchDataAndUpdateCache (e.g. to perform caching by some property of the data), that goes beyond the scope of my example.

> I‘m not even sure what it would mean to “add a lock” to this code

It means the same as in any other language, just with a different implementation:

    class Mutex {
        locked = false
        next = []

        async lock() {
            if (this.locked) {
                await new Promise(resolve => this.next.push(resolve));
            } else {
                this.locked = true;
            }
        }

        unlock() {
            if (this.next.length > 0) {
                this.next.shift()();
            } else {
                this.locked = false;
            }
        }
    }
I'd have a separate map of keys-to-locks that I'd use to lock the whole fetchDataAndUpdateCache function on each particular key.
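Roughly like this (a sketch, using the Mutex class above):

    const keyLocks = new Map(); // key -> Mutex

    async function withKeyLock(key, fn) {
      if (!keyLocks.has(key)) keyLocks.set(key, new Mutex());
      const mutex = keyLocks.get(key);
      await mutex.lock();
      try {
        return await fn();
      } finally {
        mutex.unlock();
      }
    }

    // usage: serialize fetchDataAndUpdateCache per key
    const fetchLocked = key => withKeyLock(key, () => fetchDataAndUpdateCache(key));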


Don't forget to dedupe futures that are fungible for the same key.

ETA: I appreciate the time you took to make the example, also I changed the extension to `mjs` so the async IIFE isn't needed.

  const CACHE_EXPIRY = 1000; // Cache expiry time in milliseconds
  
  let cache = {}; // Shared cache object
  let futurecache = {}; // Shared cache of future values
  
  function getFromCache(key) {
    const cachedData = cache[key];
    if (cachedData && Date.now() - cachedData.timestamp < CACHE_EXPIRY) {
      return cachedData.data;
    }
    return null; // Cache entry expired or not found
  }
  
  function updateCache(key, data) {
    cache[key] = {
      data,
      timestamp: Date.now(),
    };
  }
  
  var mockFetchCount = 0;
  
  // simulate web request shorter than cache time
  async function mockFetch(url) {
    await new Promise(resolve => setTimeout(resolve, 100));
    mockFetchCount += 1;
    return `result from ${url}`;
  }
  
  async function fetchDataAndUpdateCache(key) {
    // maybe its value is cached already
    const cachedData = getFromCache(key);
    if (cachedData) {
      return cachedData;
    }
  
    // maybe its value is already being fetched
    const future = futurecache[key];
    if(future) {
      return future;
    }
  
    // Simulate fetching data from an external source
    const futureData = mockFetch(`https://example.com/data/${key}`); // Placeholder fetch
    futurecache[key] = futureData;
  
    const newData = await futureData;
    delete futurecache[key];
  
    updateCache(key, newData);
    return newData;
  }
  
  const key = 'myData';
  
  // Fetch data twice in a sequence - OK
  await fetchDataAndUpdateCache(key);
  await fetchDataAndUpdateCache(key);
  console.log('mockFetchCount should be 1:', mockFetchCount);
  
  // Reset counter and wait cache expiry
  mockFetchCount = 0;
  await new Promise(resolve => setTimeout(resolve, CACHE_EXPIRY));
  
  // Fetch 100 times concurrently - fetch now executes only once
  await Promise.all([...Array(100)].map(() => fetchDataAndUpdateCache(key)));
  console.log('mockFetchCount should be 1:', mockFetchCount);


I see, this piece of code seems to be crucial:

    // maybe its value is already being fetched
    const future = futurecache[key];
    if(future) {
      return future;
    }
It indeed fixes the problem in a JS lock-free way.

Note that, as wolfgang42 has shown in a sibling comment, the original cache map isn't necessary if you're using a future map, since the futures already contain the result:

    async function fetchDataAndUpdateCache(key) {
        // maybe its value is cached already
        const cachedData = getFromCache(key);
        if (cachedData) {
          return cachedData;
        }

        // Simulate fetching data from an external source
        const newDataFuture = mockFetch(`https://example.com/data/${key}`); // Placeholder fetch

        updateCache(key, newDataFuture);
        return newDataFuture;
    }
---

But note that this kind of problem is much easier to fix than to actually diagnose.

My hypothesis is that the lax attitude of Node programmers towards concurrency is what causes subtle bugs like these to happen in the first place.

Python, for example, also has single-threaded async concurrency like Node, but unlike Node it also has all the standard synchronization primitives also implemented in asyncio: https://docs.python.org/3/library/asyncio-sync.html


Wolfgang's optimization is very nice. I also found interesting his use of a non-async function that returns a promise as a signal that its operation is "atomic". I don't particularly like typed JS, so it would be less visible to me.

Absolutely agree on the observability of such things. One area I think shows some promise, though the tooling lags a bit, is in async context[0] flow analysis.

One area I have actually used it so far is in tracking down code that is starving the event loop with too much sync work, but I think some visualization/diagnostics around this data would be awesome.

If we view Promises/Futures as just ends of a string of continued computation, whose resumption is gated by some piece of information, then the points where you can weave these ends together are where the async context tracking happens, letting you follow a whole "thread" of state machines that make up the flow.

Thinking of it this way, I think, also makes it more obvious how data between these flows is partitioned in a way that it can be manipulated without locking.

As for the node dev's lax attitude, I would probably be more aggressive and say it's an overall lack of formal knowledge of how computing and data flow work. As an SE in DevOps, a lot of my job is making software work for people who don't know how computers, let alone platforms, work.

[0]: https://nodejs.org/api/async_context.html


async can be scarier for locks since a block of code might depend on having exclusive access, and since there wasn't an await, it got it. Once you add an await in the middle, the code breaks. Threading at least makes you codify what actually needs exclusive access.
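A classic sketch of that failure mode (the sleep stands in for the newly added await, which turns a safe check-then-act into a race):

    const sleep = ms => new Promise(r => setTimeout(r, ms));
    let balance = 100;

    async function withdraw(amount) {
      if (balance >= amount) { // check: exclusive access was implicit...
        await sleep(10);       // ...until someone added this await
        balance -= amount;     // act: another withdraw() passed the check meanwhile
      }
    }

    // Two concurrent withdrawals of 100 drive the balance to -100:
    Promise.all([withdraw(100), withdraw(100)])
      .then(() => console.log(balance));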

async also signs you up for managing your own thread scheduling. If you have a lot of IO and short CPU-bound code, this can be OK. If you have (or occasionally have) CPU-bound code, you'll find yourself playing scheduler.


Yeah once your app gets to be sufficiently complex you will find yourself needing mutexes after all. Async/await makes the easy parts of concurrency easy but the hard parts are still hard.


> backpressure should also be mentioned

I ran into this when I joined a team using nodejs. Misc services would just ABEND. Coming from Java, I was surprised by this oversight. It was tough explaining my fix to the team. (They had other great skills, which I didn't have.)

> error propagation in async/await is not obvious

I'll never use async/await by choice. Solo project, ...maybe. But working with others, using libraries, trying to get everyone on the same page? No way.

--

I haven't used (language-level) structured concurrency in anger yet, but I'm placing my bets on Java's Project Loom. Best as I can tell, it'll moot the debate.


> async/await runs in context of one thread,

Not in Rust.


There is a single thread executor crate you can use for that case if it’s what you desire, FWIW.


Yes of course, but the async/await semantics are not designed only to be single threaded. Typically promises can be resumed on any executor thread, and the language is designed to reflect that.


This is completely wrong. You gotta learn about Send and Sync in Rust before you speak.

Rust makes no assumptions and is explicitly designed to support both single and multi threaded executors. You can have non-Send Futures.


I'm fully aware of this, thanks @iknowstuff.

>>>>> Typically promises are designed...

I'm merely saying Rust async is not restricted to single threaded like many other languages design their async to be, because most people coming from Node are going to assume async is always single threaded.

Most people who write their promise implementations make them Send so they work with Tokio or Async-Std.

Relax, my guy. The shitty tone isn't necessary.

EDIT: Ah, your entire history is you just arguing with people. Got it.



