bruth · 2025-10-25T19:38:59

Unfortunately the LLM is leading you astray :)

bruth · 2025-10-25T19:35:15

(disclaimer: I am the VP of Prod/Eng at Synadia)

NATS - An application connectivity technology (L7). It was originally designed for low-latency M:N messaging, and that is still true today. In 2018, native multi-tenancy, clustering options, etc. got introduced. The persistence subsystem (JetStream) was introduced in 2021. It has a completely different design than Kafka, but with overlapping use cases. For better or worse, we get compared to Kafka a lot and virtually everyone who engages realizes the advantages and opportunities. NATS is much more flexible for application use cases, for example, it provides KeyValue and ObjectStore abstractions on top of the core persistent stream. There are a plethora of other details, but that is the basic gist. Overall, it has a lot of batteries included for building everything from embedded applications to large scale cloud-to-edge systems.

Synadia - The founder (Derek) created NATS. We are the primary maintainers of the project. We build products and services on top of NATS, including a global SaaS where you can sign up and use "hosted NATS" with additional features. We offer a BYOC model in two forms: one that we manage for you, and a Kubernetes-based self-service one that you deploy yourself. We also support fully self-hosted deployments for customers that need to run in their own data centers or at the edge.

Regarding the comment re: the website, there are improvements we have in the works. Happy to engage and help clarify anything that is confusing.

lxe · 2025-10-26T01:10:19

It would be great to have the questions of “what is it? What is it for?” answered quickly and succinctly above the fold on the marketing site.

You probably have an influx of traffic that you could convert to customers through a “what is this? -> oh cool I could use this!” pipeline if the marketing website enabled that.

What is NATS, how does it compare to other similar software, and why use a hosted solution… all this should be easily found.

And if I see an “enter name and email to download a resource” form, it just immediately turns me off from even engaging with the site.

bruth · on June 14, 2024

I don't know Simon personally, but I do know he has been blogging for a very long time and many of his posts are for his own benefit of recall, not necessarily intended for an external audience. Also, he did not post this; a different person did.

bruth · on March 9, 2024

(Disclaimer: I am a NATS maintainer and work for Synadia)

The parent comment may have been referring to the fact that NATS has support for durable (and replicated) work queue streams, so those could be used directly for queuing tasks and having a set of workers dequeuing concurrently. This holds regardless of whether you want to use Nex or not. Nex is indeed fairly new, but the team is iterating on it quickly and we are dog-fooding it internally to keep stabilizing it.

Another benefit of NATS is the built-in multi-tenancy, which allows distinct applications/teams/contexts to have an isolated set of streams and messaging. It acts as a secure namespace.

NATS supports clustering within a region or across regions. For example, Synadia hosts a supercluster in many different regions across the globe and across the three major cloud providers. As it applies to distributed work queues, you can place work queue streams in a cluster within a region/provider closest to the users/apps enqueuing the work, and then deploy workers in the same region for optimizing latency of dequeuing and processing.

Could be worth a deeper look on how much you could leverage for this use case.

bruth · on Oct 9, 2023

This post is about a product for NATS.

I presume you are talking about the roadmap after the 2.10 release?

zinclozenge · on Oct 10, 2023

Yea, after reading the article I went and read the 2.10 release blog post and then came back and commented as if the article posted was the 2.10 post. That was my mistake.

bruth · on Oct 9, 2023

Auth callout is post-NATS client authentication, so it would not solve the "auth web flow" for authentication. Instead, the resulting token from that flow would be set as a cookie that would then be passed into the nats.ws client connection. The auth callout service would use that token to map to the concrete NATS user. The mechanism for doing that is up to the implementation. One option is to manage NATS claims in the OIDC provider (for the user authenticating); the auth service would then decode that source JWT, extract the NATS claims, and generate the NATS user JWT in the response.

wtatum · on Oct 9, 2023

Thanks, that observation is extremely helpful. If I have it down, then the intended flow looks something like this?

- Web client is directed to some token vending service. This service implements authn in a manner of its choosing (e.g. OAuth), then sets a NATS client JWT in the cookie per https://docs.nats.io/running-a-nats-service/configuration/se...

- The nats.ws client connection provides the cookie during connection to perform client auth.

- If further authz/fine-grained control is needed, the auth callout mechanism can be used. This would have access to the provided cookie/token, so any claims needed for access control could be stapled on during step one and used at this point?

For GPs original question -- I'm running a fairly old Keycloak version (v8) but it does appear to set a JWT in KEYCLOAK_IDENTITY and KEYCLOAK_IDENTITY_LEGACY.

Am I right in understanding that IFF the token is signed with Ed25519 and both sub and iss are an NKEY value this is sufficient for NATS to accept that cookie as a credential?

bruth · on Oct 9, 2023

Yes, that reads correct. The `sub` would be a NATS user public nkey, and the `iss` would be the NATS account public nkey (either the issuer nkey in config-mode or the existing nkey in decentralized auth).

As long as it can verify the chain of trust for the user JWT that is returned, it should work.

The three schema types are shown here: https://docs.nats.io/running-a-nats-service/configuration/se...

auth request comes in -> generate user jwt, sign + encode -> respond with auth response.

As long as the necessary bits of the response and user JWT conform, it will work.

bruth · on Oct 9, 2023

> Seems like I need to write a service utilizing NATS which talks to the OIDC server.

Auth callout was designed to be a generic extension point to delegate authentication and dynamically generate a user JWT that NATS understands (permissions, limits, etc). It enables integration with an arbitrary backend and is not tied specifically to OIDC. But indeed, this requires implementing a service that does this integration.

> NATS auth story is a complicated one, and now with auth callout it's even more complicated.

There is a spectrum of auth options, starting with simple config-based, token or user/pass leading up to decentralized auth for use cases that need it. Auth callout is an opt-in thing, so it should only be adopted if it is truly necessary.

> you have to have dedicated functions which you would call in the request handler [...] instead of being able to define a chain of middleware functions

I don't quite understand this statement. Wrapping a NATS handler is the same approach as wrapping an HTTP handler (within the same client app). The function would take a handler and return a handler. There can be inspection of the message within that function and the choice of calling the next handler, responding early, doing some external call, or doing nothing.
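To make the "take a handler, return a handler" idea concrete, here is a small Go sketch. `Msg`, `Handler`, `Middleware`, and the two example wrappers are hypothetical stand-ins, not types from any NATS client library; the point is only the shape of the pattern.

```go
package main

// Msg and Handler are hypothetical stand-ins for a client library's types.
type Msg struct {
	Subject string
	Data    []byte
}

// Handler processes a message and returns the reply payload.
type Handler func(m Msg) string

// Middleware wraps one handler in another.
type Middleware func(next Handler) Handler

// requireBody responds early, without calling next, when the payload is empty.
func requireBody(next Handler) Handler {
	return func(m Msg) string {
		if len(m.Data) == 0 {
			return "error: empty body"
		}
		return next(m)
	}
}

// tagSubject inspects the message and decorates the inner handler's reply.
func tagSubject(next Handler) Handler {
	return func(m Msg) string {
		return m.Subject + ": " + next(m)
	}
}

// chain applies the middleware so that the first one listed runs outermost.
func chain(h Handler, mws ...Middleware) Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}
```

With this shape, `chain(echo, requireBody, tagSubject)` yields a single handler that can be registered wherever a plain handler is expected.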

If you want to distribute this middleware, then you would need to have separate, explicit subjects that each middleware would subscribe to and then publish to for the next element in the chain (for a choreography approach).

There is also the "message slip pattern", where the ingest component sets the path as metadata (e.g. headers) that each middleware component uses to know (at request time) which subject to publish the result to next.
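A minimal sketch of the slip idea in Go, assuming the route is carried as a comma-separated list in a single header. The header name `X-Route-Slip` and the subjects in the usage note are made up for illustration, and a plain map stands in for message headers.

```go
package main

import "strings"

// slipHeader is a hypothetical header name carrying the remaining route.
const slipHeader = "X-Route-Slip"

// nextHop pops the first subject off the slip and returns it together with
// a copy of the headers to forward. ok is false when the slip is exhausted.
func nextHop(headers map[string]string) (subject string, forward map[string]string, ok bool) {
	slip := strings.TrimSpace(headers[slipHeader])
	if slip == "" {
		return "", nil, false
	}
	first, rest, _ := strings.Cut(slip, ",")
	// Copy the headers so the incoming message is left untouched.
	forward = make(map[string]string, len(headers))
	for k, v := range headers {
		forward[k] = v
	}
	forward[slipHeader] = rest
	return strings.TrimSpace(first), forward, true
}
```

Each component would call `nextHop` on the incoming headers, publish its result to the returned subject with the forwarded headers, and stop when `ok` is false. For example, a slip of `mw.validate,mw.enrich,orders.store` is consumed one subject at a time.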

lakomen · on Oct 9, 2023

Regarding middleware, last time I tried there was a lot of information to process. Now I tried again and

  // loggingMiddleware dumps each request payload, then calls the wrapped handler.
  func loggingMiddleware(next micro.Handler) micro.Handler {
      return micro.HandlerFunc(func(req micro.Request) {
          spew.Dump(req.Data())
          next.Handle(req)
      })
  }

I'll remove the middleware part. EDIT: Seems I can't edit the post anymore.

Regarding auth, yes, spectrum of options, but for "production" use it's very complicated and you don't want a shared key, you need the whole operator, account, user thing, and external auth aka auth callout. The documentation is lacking. nkey, xkey what? It's confusing and complicated. Managing auth is a pain, more than once I've been locked out of my test server because of how complicated auth is and had to wipe it. It's so complicated because you have isolated "namespaces" or operator spaces? Naming an organization operator, or a solution or service group doesn't help. user is synonymous with client. All that you don't get from the documentation, you have to watch Youtube videos to understand it. The documentation is bad.

How do I consume a stream from the deno/ws client without the whole client app blocking? No idea. The code snippets don't help. All there is, is a pay for company that offers to host NATS clusters, which you can pay to ask. Slack is not a good medium.

I'm guessing that the documentation is bad so one has to pay for support. And that's a no-go for me. I get that you put time and effort into it, I'm assuming you're one of the NATS people, but what good is it when I don't know how to effectively use the software? The natsbyexample page is also hard to digest.

You're in your own little bubble and expect people to be mind readers.

If it's too complicated people won't adopt it. I know I won't because it's too complicated.

I would've loved to build upon it, because the core functionality is nice and works, easy to grasp.

I've not seen a single Jetstream stream and consumer tutorial with the new API and nats.ws and Go. Old API sure, STAN sure. New API no. Some real world stuff, not just code demo snippets. Why not? Because it's complicated.

Auth callout, that was the big thing everyone was waiting for? I would work on actually solving problems instead of building components so I can use this extra piece of software whose documentation is so bad.

You know how many Kafka tutorials there are? A LOT.

I want to like it, it seems very promising, but it suffers from the "coder no likey documentation" illness.

Today I saw a Twitch stream of some Netflix dev, talking about DX, developer experience.

NATS' DX has a lot of room for improvement, to put it friendly.

bruth · on Oct 9, 2023

> You're in your own little bubble and expect people to be mind readers.

Out of curiosity, have you asked questions in Slack or Github? If so, and you had a bad experience with the interaction, I get the sentiment and would offer help. But this comment is not constructive without context.

> I'm guessing that the documentation is bad so one has to pay for support.

The NATS project has been open source for around 12 years, and part of the CNCF since 2018. This is an incorrect statement and a very poor assumption to make because the documentation doesn't make sense for you.

I 100% agree it can be improved and we are working on a new docs site, but it is not quite ready.

In case it's helpful, there is an increasing collection of examples on https://natsbyexample.com with new JetStream client examples among others. If you have specific requests, feel free to open issues in the corresponding repo: https://github.com/ConnectEverything/nats-by-example

Kinrany · on Oct 10, 2023

It really doesn't help that Slack requires an account, isn't searchable and doesn't preserve history :)

bruth · on Oct 10, 2023

If a learning resource is lacking or confusing, Slack or GitHub issues/discussions are ways to engage and provide feedback so it can be improved. If the docs are confusing, there are other channels to get help and unblock folks. The outcome of that interaction would lead to improvements in the docs.

Kinrany · on Oct 10, 2023

Fair! And I have to say that the docs are pretty good considering the huge surface they cover! Same underlying API, but so many languages.

bruth · on Feb 4, 2023

Nice, have you come across NATS? https://nats.io. The server natively supports WebSockets. There are many clients including Deno, Node, WebSockets, Rust, Go, C, Python, etc.

In addition to stateless messaging, it supports durable streams, and optimized API layers on top like key-value, and object storage.

The server also natively supports MQTT 3.1.1.

SpaghettiX · on Feb 4, 2023

Nats is not something I see as a competitor for external clients (browsers, mobile apps), primarily because it doesn't handle reconnections / message delivery / quality-of-service / at-least-once or exactly-once delivery (except for MQTT).

> When the connection is lost, your application would have to re-create it and all subscriptions if any. https://github.com/nats-io/stan.go#connection-status

Therefore, I don't see what it adds here. It seems designed for service communication, not client-server. They also don't list browsers as a use case https://docs.nats.io/nats-concepts/overview#use-cases. (though it is of course possible, it's just not ideal IMHO.)

They still have a js/browser client library though if you want to use them: https://github.com/nats-io/nats.ws. And yes, their servers "have websocket support".

bruth · on Feb 4, 2023

In fact it does all of these things now properly! STAN (NATS Streaming) was deprecated two years ago in favor of a new embedded subsystem called JetStream: https://docs.nats.io/nats-concepts/jetstream released in March 2021.

SpaghettiX · on Feb 4, 2023

Even with NATS jetstream, NATS has a focus on service communication.

"It supports websockets" and "qos" does not mean it will work robustly with web apps if nobody uses NATS for that use case. See https://github.com/nats-io/nats.ws/issues/172 for an example issue. If NATS is not used for websockets in browsers, it will have a minefield of issues to fix. And what about all the other clients (mobile, mobile web)? Sure there may be a NATS client library for it, but it won't handle user connectivity issues, because again it's aimed at service communication where the network is great.

People are using NATS in kubernetes, not web browsers.

bruth · on Feb 4, 2023

> Even with NATS jetstream, NATS has a focus on service communication.

It indeed excels at service communication as well. However, a core use case for NATS is the edge, be it your definition (browsers and mobile), but also in cars, factories, tractors, low-orbit satellites, etc, whether it is running on Kubernetes, k3s, or bare metal.

The issue you called out is a Firefox-specific issue, but it will be addressed and is not indicative of an inherent limitation of NATS.

Check out this playlist of a live event I organized last fall with a variety of live demos: https://youtube.com/playlist?list=PLgqCaaYodvKY6xRbvB6ffON0_...

SpaghettiX · on Feb 4, 2023

Currently, the only people I see talking about NATS for edge are at Synadia - they're also not very specific. In theory/documentation, "edge" is a core NATS use case, but in practice why does NATS compare itself to Kafka and microservices? Most of that playlist is not edge-focused. Can you explain what concepts and problems you have to solve to support the "edge"? None of that is on your website.

> The issue you called out is a Firefox-specific issue, but it will be addressed and is not indicative of an inherent limitation of NATS.

My point is NATS is not being used in browsers, mobile apps or edge use cases. It doesn't even explore the concepts. It looks like it doesn't care about Firefox. For IoT, what does NATS bring on top of MQTT? NATS ends up being an MQTT broker so it will have to compete with all of them.

Why don't you start comparing yourself to products and technology that serve the edge (other edge-focused companies (ably, pusher, pubnub), and other MQTT brokers)?

Side note: Would appreciate it if you disclosed your affiliation with Synadia and NATS before advertising it.

bruth · on Feb 4, 2023

> Would appreciate it if you disclosed your affiliation with Synadia and NATS before advertising it.

Fair point, but I did not mention Synadia. FWIW, I have been a NATS user for seven+ years prior to joining Synadia so I was speaking on behalf of myself and experience with the tech.

Also fair point that the nats.io website does not highlight this strongly. The NATS maintainers are aware (nearly all employed by Synadia) and we are working on it.

I disagree that simply because it is not advertised as an "edge" technology that it is not one of the best-in-class techs for edge. It simply means we are doing a poor job at awareness.

> Why don't you start comparing yourself to products and technology that serve the edge

The vast majority of people and customers compare NATS to Kafka and the variety of variants out there. Once the push on edge occurs, I suspect comparisons to these other technologies will occur.

To be clear, I am not looking for a "winner" in this discussion, rather my original comment was to correct a gap in understanding of what NATS is capable of.

SpaghettiX · on Feb 4, 2023

Thanks

It's nice to hear that NATS is tackling this problem space. I'll give it a try.

> I disagree that simply because it is not advertised as an "edge" technology that it is not one of the best-in-class techs for edge. It simply means we are doing a poor job at awareness.

It's very easy to say "NATS is for everything", which seems to be the case here. It would be great if that was backed up with evidence.

bruth · on Feb 4, 2023

It is a top priority. I do appreciate the constructive criticism and conversation. Feel free to reach out to me (byron at synadia dot com) or on the NATS Slack (https://slack.nats.io).

autobeam · on Feb 4, 2023

Actually, NATS can be used in the browser in a way that eliminates the need for REST and offers both client/server communication and realtime functionality. A big benefit of that is that full-stack developers will be able to communicate between web app and backend services in the same way that services talk to each other, in a secure and reliable way. I wrote a blog post about that here: https://www.ahmed.wiki/blog/nats-more-productivity-client-de...

I'm not saying that DriftDB is not cool, it is a nice tool. The point is that NATS has a great way of streamlining the communication between all components of a system, including client apps (React + Flutter in my case).

SpaghettiX · on Feb 4, 2023

> A big benefit of that is that full-stack developers will be able to communicate between web app and backend services in the same way that services talk to each other, in a secure and reliable way.

In practice, I have not seen that manifest as a benefit. Services have a dramatically different environment than edge devices. Tools built for services (NATS, Kafka, gRPC) do not translate well to the edge. The latter is used by a group of people who don't care about or understand the edge-edge-cases: when a user drives through a tunnel and is disconnected, or when they restart their device, or are throttled by an OS, etc. One issue I found with grpc-web (the alternative to grpc that supports browsers) is that it's severely limited by the browser's connection count, making streams completely useless. Also, grpc-web is neglected by Google.

NATS does not look ready for, and is not designed for, use in browsers or mobile apps. It's not a use case that Synadia/NATS care about enough to even mention on their website.

> (react + flutter) in my case

How are you using NATS in Flutter? Using the client library that hasn't been updated in 20 months, with no link to the repository and 4 upvotes? Or writing your own custom library to connect using websockets? If you use websockets directly, you'll be writing extra code to handle disconnections, retries, qos, etc.

paulgb · on Feb 5, 2023

How do you prevent a malicious user from taking the wss endpoint you’re using, subscribing to a wildcard, and seeing messages intended for other clients?

SpaghettiX · on Feb 4, 2023

They also don't list themselves as competitors to products that do target this market: ably, firebase messaging, pubnub, pusher. Instead, they compare themselves to Kafka, RabbitMQ, Pulsar, and gRPC, none of which work well on browsers. (Yes grpc-web "works" on the browser, but I suggest everyone avoids it.)

https://docs.nats.io/nats-concepts/overview/compare-nats

paulgb · on Feb 4, 2023

NATS is great, we use it in another project, Plane (https://plane.dev). The reasons I didn’t use it instead of making DriftDB:

- In NATS, unless you set up authentication, any user can subscribe to “>” and get a firehose of every message, even if they don’t know the room IDs.

- NATS Jetstream supports rollups, but they roll up the entire stream, rather than up to a certain sequence number. This would break our ability to do leaderless compaction.
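For readers unfamiliar with the wildcard in the first point, the following Go sketch models NATS subject matching: `*` matches exactly one token and `>` matches one or more trailing tokens. It is a simplified illustration, not the server's implementation; it assumes well-formed, dot-separated subjects, and the `rooms.…` subjects in the usage note are invented.

```go
package main

import "strings"

// subjectMatches reports whether a published subject matches a subscription
// pattern. "*" matches exactly one token; ">" matches one or more trailing tokens.
func subjectMatches(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// ">" is only valid as the last token and needs at least one token left.
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) {
			return false
		}
		if tok != "*" && tok != s[i] {
			return false
		}
	}
	return len(p) == len(s)
}
```

A bare `>` subscription matches `rooms.abc123.events` and every other subject, which is the firehose concern; a pattern such as `rooms.abc123.>` only matches subjects under that one room, which is why scoping subscribe permissions per client matters.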

bruth · on Feb 4, 2023

Was not aware of Plane, nice! Regarding the two points:

- A unique room ID/subject is a form of authentication. Essentially anyone having that unique identifier can join, akin to a token. This is straightforward to set up in NATS, avoiding the ">" for all problem (which I may now need to write a blog post about ;-)

- Rollups are supported on a per-subject basis. Each room could be modeled as a subject and individually rolled up.

paulgb · on Feb 5, 2023

Re. #1, can you elaborate? The tracking issue for this is still open[1]. As far as I can tell this is a hard blocker for any use case where a random user on the web can connect to NATS, since it means that user can wiretap any room without knowing the room ID.

Re #2, the problem is that a rollup of a subject in NATS rolls up the whole subject, so there’s a race condition if you try to use it the way DriftDB uses it. If one client is computing a compaction while another client sends a message, that message will be erased by the compaction.

This works if a single producer is writing to a stream, because that producer can stop emitting messages during the compaction. But in our case, each client can produce messages at any time.

DriftDB solves this by sending a sequence number alongside the rollup of the last message included in the rollup, and the server preserves messages after that sequence number.
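The behaviour described above can be modelled with a small in-memory log in Go. This is a hypothetical sketch of the idea (compact up to a given sequence, keep everything newer), not DriftDB's or NATS's actual code; the `Log`, `Entry`, and `CompactUpTo` names are invented for illustration.

```go
package main

// Entry is one message in a hypothetical in-memory log.
type Entry struct {
	Seq  uint64
	Data string
}

// Log keeps entries in ascending sequence order.
type Log struct {
	entries []Entry
	lastSeq uint64
}

// Append stores data under the next sequence number and returns it.
func (l *Log) Append(data string) uint64 {
	l.lastSeq++
	l.entries = append(l.entries, Entry{Seq: l.lastSeq, Data: data})
	return l.lastSeq
}

// CompactUpTo replaces every entry with Seq <= upTo by a single snapshot
// entry and keeps all later entries untouched.
func (l *Log) CompactUpTo(upTo uint64, snapshot string) {
	kept := []Entry{{Seq: upTo, Data: snapshot}}
	for _, e := range l.entries {
		if e.Seq > upTo {
			kept = append(kept, e)
		}
	}
	l.entries = kept
}
```

If one client snapshots at sequence 2 while another appends sequence 3, calling `CompactUpTo(2, …)` afterwards leaves the snapshot followed by entry 3, so the concurrent message survives the compaction.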

[1] https://github.com/nats-io/nats-server/issues/2667

bruth · on Feb 6, 2023

#1: This isn't a hard requirement to achieve the desired permissions. For the DriftDB use case, my understanding is that all members in the room have full pub/sub k/v permissions, so that could be achieved by declaring a new permission pinned to the room when the room is created or joined (this can be done dynamically without a config file reload).

#2: Publishes do support optimistic concurrency control using the `Nats-Expected-Last-Sequence` (stream-level) or `Nats-Expected-Last-Subject-Sequence` (subject-level) header. This ensures that concurrent publishes are serialized and all but one are rejected with a "conflict wrong sequence" error. For example, see the headers in Rust[0] and WS[1].

[0]: https://docs.rs/async-nats/latest/async_nats/header/index.ht...
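The expected-last-sequence check can be illustrated with a tiny in-memory model in Go. This is a sketch of the semantics only: `Stream`, `PublishExpecting`, and `ErrWrongLastSequence` are hypothetical names, and a mutex stands in for the server serializing publishes.

```go
package main

import (
	"errors"
	"sync"
)

// ErrWrongLastSequence is a hypothetical error for a rejected publish.
var ErrWrongLastSequence = errors.New("wrong last sequence")

// Stream is a hypothetical in-memory stand-in for a single-subject stream.
type Stream struct {
	mu   sync.Mutex
	msgs []string
}

// LastSeq returns the sequence of the newest message (0 when empty).
func (s *Stream) LastSeq() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return uint64(len(s.msgs))
}

// PublishExpecting appends data only if the stream's last sequence still
// equals expectedLast; otherwise the publish is rejected.
func (s *Stream) PublishExpecting(data string, expectedLast uint64) (uint64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if uint64(len(s.msgs)) != expectedLast {
		return 0, ErrWrongLastSequence
	}
	s.msgs = append(s.msgs, data)
	return uint64(len(s.msgs)), nil
}
```

Two writers that both observed the same last sequence cannot both succeed: the first publish advances the sequence, and the second is rejected and has to re-read the stream and retry.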

paulgb · on Feb 6, 2023

#1 your understanding of the DriftDB permission model is correct, but I’m not sure what declaring a new permission at runtime entails? Would I be creating a new bearer token for each room, and attaching the room’s permissions to it?

#2 the optimistic concurrency headers don’t solve the problem here, e.g.:

- I increment a counter (seq: 1)

- I increment a counter again (seq: 2, expected last sequence: 1)

- I begin computing a snapshot, resulting in a counter value of 2

- You increment the counter (seq: 3, expected last sequence: 2)

- I complete the snapshot and publish it with a Nats-Rollup header

- Your event has been lost, the counter value is now 2

bruth · on Feb 6, 2023

#1: Correct, I am putting together an example this week to show what this looks like. Pretty straightforward.

#2: You can combine rollup and expected last sequence header to prevent this, unless I am missing another subtle detail?

(I am enjoying this thread FWIW :)

paulgb · on Feb 6, 2023

Cool, looking forward to the example :)

I actually didn’t realize that the expected last sequence could be combined with nats-rollup. In the example above (as I understand it) if we added that to the snapshot, NATS would throw out the snapshot, so the log would still be correct, but the work of snapshotting it and sending it over the wire would be lost. If messages were frequent enough, and/or snapshots took a while to compute/transfer, you might never have a roll-up succeed.

Our approach is that a roll-up will always succeed, we just preserve any messages in the stream with a sequence number greater than the one provided.

(I’m enjoying it too, and am a user of NATS so I’m happy to learn things I didn’t know about it :)

bruth · on Feb 6, 2023

> If messages were frequent enough, and/or snapshots took a while to compute/transfer, you might never have a roll-up succeed.

Yes, good point. The "snapshotting up to a lagging sequence" could be achieved with two separate subjects to reduce contention, but is a bit more work.

It sounds like, in Drift's case, the snapshot effectively brings up the tail (snapshot), but the head can still be appended to with new events.

paulgb · on Feb 6, 2023

> It sounds like, in Drift's case, the snapshot effectively brings up the tail (snapshot), but the head can still be appended to with new events.

Yep, exactly. It’s a subtle feature but it makes it possible for multiple clients to attempt to compact without worrying about races.

bruth · on Oct 8, 2022

Check out SimpleIoT: https://github.com/simpleiot/simpleiot

bruth · on March 23, 2022

If you have a low-volume, non-performant, non-critical database, k8s is fine. If you need it to perform and/or need built-in ops (managed backups or replication), use a managed service.

k8s _can_ do stateful, but if a managed service exists for this workload, use it. It is not a question of _can_ I run it on k8s; it is a _should_ question: is it worth the cumulative effort required to achieve the same degree of quality?