Orcaman is a very straightforward implementation (just sharded RW locks and backing maps), but it limits the number of shards to a fixed 32. I wonder what the benchmarks would look like if the shard count were increased to 64, 128, etc.
It might still make a difference due to reduced contention: with more shards, the chance of two or more goroutines hitting the same shard is lower. In my mind the only downside to having more shards is the upfront cost, so it might slow down only the smallest example.
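To make the question concrete, here is a minimal sketch of a sharded map where the shard count is a constructor argument. This is not Orcaman's code: `ShardedMap`, `NewShardedMap`, and the FNV hash are my own assumptions for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// shard is one RW lock plus the map it guards.
type shard struct {
	mu sync.RWMutex
	m  map[string]int
}

// ShardedMap is a hypothetical sharded map with a configurable shard count.
type ShardedMap struct {
	shards []*shard
}

// NewShardedMap allocates all n shards up front; this is the fixed cost
// that grows with the shard count.
func NewShardedMap(n int) *ShardedMap {
	if n < 1 {
		n = 1
	}
	s := &ShardedMap{shards: make([]*shard, n)}
	for i := range s.shards {
		s.shards[i] = &shard{m: make(map[string]int)}
	}
	return s
}

// shardFor hashes the key and picks a shard by modulo.
func (s *ShardedMap) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return s.shards[h.Sum32()%uint32(len(s.shards))]
}

// Set takes the write lock of a single shard only.
func (s *ShardedMap) Set(key string, v int) {
	sh := s.shardFor(key)
	sh.mu.Lock()
	sh.m[key] = v
	sh.mu.Unlock()
}

// Get takes the read lock of a single shard only.
func (s *ShardedMap) Get(key string) (int, bool) {
	sh := s.shardFor(key)
	sh.mu.RLock()
	defer sh.mu.RUnlock()
	v, ok := sh.m[key]
	return v, ok
}

func main() {
	for _, n := range []int{32, 64, 128} {
		m := NewShardedMap(n)
		var wg sync.WaitGroup
		for g := 0; g < 8; g++ {
			wg.Add(1)
			go func(g int) {
				defer wg.Done()
				for i := 0; i < 1000; i++ {
					m.Set(fmt.Sprintf("g%d-k%d", g, i), i)
				}
			}(g)
		}
		wg.Wait()
		v, ok := m.Get("g3-k42")
		fmt.Println(n, v, ok)
	}
}
```

Wrapping the loop in `main` with a `testing.B` benchmark per shard count would show whether the contention win outweighs the allocation cost.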
Were those 16 million sessions used only for alignment, chat format, reasoning, etc.? Or is it possible to train a base model on them too? If a single session is at least 32k tokens, then that's already about 0.5 trillion tokens to train on. Interesting.
It's something we debated in our team: if there's an API that returns data based on filters, what's the better behavior if no filters are provided - return everything or return nothing?
The consensus was that returning everything is rarely what's desired, for two reasons: first, if the system grows, allowing API users to return everything at once can be a problem both for our server (lots of data in RAM when fetching from the DB => OOM, and additional stress on the DB) and for the user (the same problem on their side). Second, it's easy to forget to specify filters, especially in cases like "let's delete something based on some filters."
So the standard practice now is to return nothing if no filters are provided, and we pay attention to it during code reviews. If the user does really want all the data, you can add pagination to your API. With pagination, it's very unlikely for the user to accidentally fetch everything because they must explicitly work with pagination tokens, etc.
Another option, if you don't want pagination, is to have a separate method named accordingly, like ListAllObjects, without any filters.
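A rough sketch of those three behaviors in Go, over an in-memory slice instead of a real DB. `Object`, `Filter`, `store`, and the integer page token are all made up for illustration; only `ListAllObjects` is a name from the comment above.

```go
package main

import "fmt"

// Object and Filter are hypothetical types for this sketch.
type Object struct {
	ID    int
	Owner string
}

type Filter struct {
	Owner string
}

// Empty reports whether the caller supplied no filter at all.
func (f Filter) Empty() bool { return f.Owner == "" }

// store stands in for a database table.
var store = []Object{
	{1, "alice"}, {2, "bob"}, {3, "alice"}, {4, "carol"}, {5, "bob"},
}

// ListObjects returns nothing when no filter is provided,
// so a forgotten filter never turns into "everything".
func ListObjects(f Filter) []Object {
	if f.Empty() {
		return nil
	}
	var out []Object
	for _, o := range store {
		if o.Owner == f.Owner {
			out = append(out, o)
		}
	}
	return out
}

// ListAllObjects is the explicit, paginated way to read everything.
// pageToken is an offset into store; the returned token is -1
// when there are no more pages.
func ListAllObjects(pageToken, pageSize int) ([]Object, int) {
	if pageToken < 0 || pageToken >= len(store) || pageSize <= 0 {
		return nil, -1
	}
	end := pageToken + pageSize
	if end >= len(store) {
		return store[pageToken:], -1
	}
	return store[pageToken:end], end
}

func main() {
	fmt.Println(len(ListObjects(Filter{})))               // no filter
	fmt.Println(len(ListObjects(Filter{Owner: "alice"}))) // filtered

	// Fetching everything requires an explicit loop over page tokens.
	for tok := 0; tok != -1; {
		var page []Object
		page, tok = ListAllObjects(tok, 2)
		fmt.Println(page)
	}
}
```

In a real API the page token would usually be opaque (an encoded cursor) rather than a raw offset.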
Returning an empty result in that case may cause a more subtle failure. I would think returning an error would be a bit better, as it would clearly communicate that the caller called the API endpoint incorrectly. If it’s HTTP, a 400 Bad Request status code would seem appropriate.
>allowing API users to return everything at once can be a problem both for our server (lots of data in RAM when fetching from the DB => OOM, and additional stress on the DB)
You can limit stress on RAM by streaming the data. You should ideally stream rows for any large dataset. Otherwise, like you say you are loading the entire thing into RAM.
Buffering up the entire data set before encoding it to JSON and sending it is one of the biggest sources of latency in API based software. Streaming can get latencies down to tens of microseconds!
I like your thought process around the ‘empty’ case. While the opposite of a filter is no filter, to your point, that is probably not really the desire when it comes to data retrieval. We might have to revisit that ourselves.
How about returning an error? A 400 is the generic “client sent something wrong” bucket. Missing a required filter param is unambiguously a client mistake according to your own docs/contract → client error → 4xx family → 400 is the safest/default member of that family.
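For what it's worth, the 400 variant is only a few lines in Go. A minimal sketch, assuming a hypothetical `/objects` endpoint with a required `owner` query parameter (both names are made up):

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// listHandler is a hypothetical endpoint that requires an "owner" filter.
func listHandler(w http.ResponseWriter, r *http.Request) {
	owner := r.URL.Query().Get("owner")
	if owner == "" {
		// Missing required filter: tell the client it made a mistake.
		http.Error(w, "missing required filter: owner", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"owner": owner})
}

// status runs the handler in-process and reports the response status code.
func status(target string) int {
	rec := httptest.NewRecorder()
	listHandler(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	fmt.Println(status("/objects"))             // no filter
	fmt.Println(status("/objects?owner=alice")) // with filter
}
```

The error body names the missing parameter, which makes the failure much less subtle than an empty list.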
I run Qwen3-32b locally without any tools (just llama.cpp) and it can do basic arithmetic for smaller numbers (like 134566), but I didn't check it for much larger numbers. I'm not at the PC right now, but trying to do it via OpenRouter on much larger numbers overflows the context and it stops without giving a result :)
Not a security researcher, but I once found an open Redis port without auth on a large portal. Redis was used to cache all views, so one could technically modify any post and add malicious links, etc. I found the portal admin's email, emailed them directly, and got a response within an hour: "Thanks, I closed the port." I didn't need a bounty or anything, so sometimes it may be easier and safer to just skip all those management layers and communicate with an actual fellow engineer directly.
Dunno, in my Go+HTMX project, it was pretty trivial to add SSE streaming. When you open a new chat tab, we load existing data from the DB and then HTMX initiates SSE streaming with a single tag. When the server receives an SSE request from HTMX, it registers a goroutine and a new Go channel for this tab. The goroutine blocks and waits for new events in the channel. When something triggers a new message, there's a dispatcher which saves the event to the DB and then iterates over the registered Go channels and sends the event to each of them. On a new event in the tab's channel, the tab's goroutine unblocks and passes the event from the channel to the SSE stream. HTMX handles inserting new data into the DOM. When a tab closes, the goroutine receives the notification via the request's context (another Go primitive), deregisters the channel and exits. If the server restarts, HTMX automatically reopens the SSE stream. It took probably one evening to implement.
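Roughly, the server side of that looks like the sketch below. This is not the actual project code: `Hub`, `Register`, `Deregister`, and `Dispatch` are names I picked for illustration, the DB save is left out, and events are assumed to be single-line strings.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// Hub tracks one channel per open tab.
type Hub struct {
	mu   sync.Mutex
	tabs map[chan string]struct{}
}

func NewHub() *Hub { return &Hub{tabs: make(map[chan string]struct{})} }

// Register creates and records a buffered channel for a new tab.
func (h *Hub) Register() chan string {
	ch := make(chan string, 16)
	h.mu.Lock()
	h.tabs[ch] = struct{}{}
	h.mu.Unlock()
	return ch
}

// Deregister forgets a tab's channel.
func (h *Hub) Deregister(ch chan string) {
	h.mu.Lock()
	delete(h.tabs, ch)
	h.mu.Unlock()
}

// Dispatch fans an event out to every registered tab and returns how many
// tabs received it. A real dispatcher would save the event to the DB first.
// Tabs whose buffer is full are skipped rather than blocking everyone.
func (h *Hub) Dispatch(event string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	sent := 0
	for ch := range h.tabs {
		select {
		case ch <- event:
			sent++
		default:
		}
	}
	return sent
}

// ServeHTTP is the SSE endpoint: one goroutine (the handler) per tab.
func (h *Hub) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")

	ch := h.Register()
	defer h.Deregister(ch)
	for {
		select {
		case <-r.Context().Done(): // tab closed or connection dropped
			return
		case ev := <-ch:
			fmt.Fprintf(w, "data: %s\n\n", ev)
			flusher.Flush()
		}
	}
}

func main() {
	hub := NewHub()
	tab := hub.Register()
	hub.Dispatch("<div>hello</div>")
	fmt.Println(<-tab)
	hub.Deregister(tab)
	// To serve for real: http.Handle("/events", hub) and http.ListenAndServe.
}
```

On the client, the HTMX SSE extension connects to the endpoint and swaps each incoming message into the DOM.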
8b models are great at converting unstructured data to a structured format. Say, you want to transcribe all your customer calls and get a list of issues they discussed most often. Currently with the larger models it takes me hours.
A chatbot which tells you various fun facts is not the only use case for LLMs. They're language models first and foremost, so they're good at language processing tasks (where they don't "hallucinate" as much).
Their ability to memorize various facts (with some "hallucinations") is an interesting side effect which is now abused to make them into "AI agents" and what not but they're just general-purpose language processing machines at their core.
>Pizlix is LFS (Linux From Scratch) 12.2 with some added components, where userland is compiled with Fil-C. This means you get the most memory safe Linux-like OS currently available.
I'm aware of Pizlix - it's a good project/idea that needs to go mainstream; as you mention, memory safety is currently limited to userland (still a huge improvement over traditional unsafe userland.)
Note also that it uses fil-c rather than clang with -fbounds-safety. I believe fil-c requires fewer code changes than -fbounds-safety.
https://github.com/hsaliak/filc-bazel-template I created this recently to make it super easy to get started with fil-c projects. If you find it daunting to get started with the setup in the core distribution and want a 3-4 step approach to building a fil-c enabled binary, then try this.
>For example, Go uses a non-copying GC because Google wanted it to work with their existing C++ code more easily. Copying GCs are hard to get 100% correct when you're dealing with an outside runtime that doesn't expect things to be moved around in memory.
Do you have a source for this?
C# has a copying GC, and easy interop with C has always been one of its strengths. From the perspective of the user, all you need to do is to "pin" a pointer to a GC-allocated object before you access it from C so that the collector avoids moving it.
I always thought it had more to do with making the implementation simpler during the early stages of development, with the possibility of making it a copying GC some time in the future (mentioned somewhere in stdlib's sources, I think), but it never came to fruition because Go's non-copying GC was fast enough and a lot of code has since been written with the assumption that memory never moves. Adding a copying GC today would probably break a lot of existing code.
>What's dispiriting is the (lack of) process and care: take someone's carefully crafted work, run it through a machine to wash off the fingerprints, and ship it as your own.
"Don't attribute to malice what can be adequately explained by stupidity". I bet someone just typed into ChatGPT/Copilot, "generate a Git flow diagram," and it searched the web, found your image, and decided to recreate it by using as a reference (there's probably something in the reasoning traces like, "I found a relevant image, but the user specifically asked me to generate one, so I'll create my own version now.") The person creating the documentation didn't bother to check...
In this case, we can chalk it up to malicious stupidity. Someone posting a reference aimed at learners, especially with Microsoft's reach and name recognition, has a responsibility to check the quality and accuracy of the materials. Using an AI tool doesn't absolve that responsibility one bit.