Basically, a node is an object with one entry, whose key is the type and whose value is an array of the contents. It's a rather S-expressiony approach. If you really don't like using arrays for all the contents, you could always use more normal values at the leaves:
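For instance (an illustrative encoding, not the article's actual schema), an arithmetic expression could look like this, with plain numbers at the leaves:

```python
import json

# A hypothetical encoding (illustrative only): each node is an object with
# one entry; the key is the node type and the value is an array of children,
# with plain numbers at the leaves.
doc = '{"add": [{"mul": [2, 3]}, 4]}'

def evaluate(node):
    if not isinstance(node, dict):
        return node  # a leaf: just a normal JSON value
    (op, args), = node.items()  # exactly one entry: type, then contents
    vals = [evaluate(a) for a in args]
    return sum(vals) if op == "add" else vals[0] * vals[1]

print(evaluate(json.loads(doc)))  # 10
```

A walker only ever has to look at the single key to know what it's holding.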
It has the nice property that you're always guaranteed to see the type before any of the contents, even if object keys get reordered, so you can do streaming decoding without having to buffer arbitrary amounts of JSON. Probably not important when parsing a tax code, but can be useful for big datasets.
Agreed. Any language that wants to use the fact graph is going to have to “interpret” the chosen DSL anyways, and JSON is more ubiquitous and far simpler to parse than XML. Also way cheaper in the sense that the article uses it (how many langs can you parse and walk an XML document in off the top of your head? what about JSON?)
To see why JSON is simpler, imagine what the sum total of all code needed to parse and interpret the fact graph without any dependencies would look like.
With XML you’re carrying complex state in hash maps and comparing strings everywhere to match open/close tags. Even more complexity depending on how the DSL uses attributes, child nodes, text content.
With JSON you just need to match open/close [] {} and a few literals. Then you can skim the declarative part right off the top of the resulting AST.
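To make that concrete, here's a sketch of a dependency-free parser for a small JSON subset (no string escapes or exponent edge cases; purely illustrative, not production code):

```python
def parse(text):
    """Parse a small JSON subset: objects, arrays, escape-free strings,
    numbers, true/false/null. A sketch, not a spec-complete parser."""
    val, i = _value(text, _ws(text, 0))
    assert _ws(text, i) == len(text), "trailing garbage"
    return val

def _ws(s, i):
    # Skip insignificant whitespace.
    while i < len(s) and s[i] in " \t\r\n":
        i += 1
    return i

def _value(s, i):
    c = s[i]
    if c == "{":
        return _object(s, i)
    if c == "[":
        return _array(s, i)
    if c == '"':
        j = s.index('"', i + 1)  # no escape handling in this sketch
        return s[i + 1:j], j + 1
    for lit, val in (("true", True), ("false", False), ("null", None)):
        if s.startswith(lit, i):
            return val, i + len(lit)
    j = i  # number: consume sign/digit/float characters
    while j < len(s) and (s[j].isdigit() or s[j] in "+-.eE"):
        j += 1
    tok = s[i:j]
    return (int(tok) if tok.lstrip("-").isdigit() else float(tok)), j

def _array(s, i):
    out, i = [], _ws(s, i + 1)
    while s[i] != "]":
        v, i = _value(s, i)
        out.append(v)
        i = _ws(s, i)
        if s[i] == ",":
            i = _ws(s, i + 1)
    return out, i + 1

def _object(s, i):
    out, i = {}, _ws(s, i + 1)
    while s[i] != "}":
        k, i = _value(s, i)
        i = _ws(s, i)
        assert s[i] == ":", "expected colon after key"
        v, i = _value(s, _ws(s, i + 1))
        out[k] = v
        i = _ws(s, i)
        if s[i] == ",":
            i = _ws(s, i + 1)
    return out, i + 1

print(parse('{"a": [1, true, "x"]}'))  # {'a': [1, True, 'x']}
```

The whole thing is essentially bracket matching plus a handful of literals; an XML equivalent would additionally need open/close tag-name matching, attributes, and entity handling before you even get to the DSL's semantics.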
It’s easy to ignore all this complexity since XML libs hide it away, and sure it will get the job done. But like others pointed out, decisions like these pile up and result in latency getting worse despite computers getting exponentially faster.
What I don't like are all the freaking quotes. I look at json and just see noise. Like if you took a screenshot and did a 2d FFT, json would have tons of high frequency content relative to a lot of other formats. I'd sooner go with clojure's EDN.
Aesthetically, I consider such JSON structures degenerate. It's akin to building an ECMAScript app where every class and structure is only allowed to have one member.
If you want tagged data, why not just pick a representation that does that?
Because (imo) the goal should be to minimize overall complexity.
Pulling in XML and all of its additional complexity just to get a (debatably) cleaner way to express tagged unions doesn’t seem like a great tradeoff.
I also don’t buy the degenerate argument. XML is arguably worse here since you have to decide between attributes, child nodes, and text content for every piece of data.
Depends on the application, I suppose. For OP's application, pulling in XML is no trouble and gives you a much better solution for typed unions.
To get better than XML, I think you're looking at something closer to a Haskell- or LISP-embedded DSL, with obvious trade-offs when it comes to developer ecosystems and interoperability.
You don't even need to specify a DSL to make that code declarative. It can be real code that's manipulating expression objects instead of numbers (though not in JavaScript, where there's no operator overloading), with the graph of expression objects being the result.
I think strictly speaking this isn't actually microphonics, because microphonics means that mechanical noise causes electrical noise, which then results in audible noise, whereas what is happening here is just transmission of vibrations up the cable into the ear.
Anyway, it can be fixed with better cables. They don't have to be fancy (they don't have to be the 349 euro cables that site is selling!) - i have a pair of KZ ZS10 Pro X earphones, and using the stock cables, i don't get rustling through those.
(more generally, i have an embarrassing number of Chi-Fi earphones, and don't get rustling with any of them)
Well, those KZ ZS10 Pro X are still fairly cheap (40 euros), and i really like them. But there is a huge range of amazing value for money earphones out there. You just have to wade through dozens of pages of forum posts and Reddit threads to find them.
> This solution looks extremely similar to the previous one, which is a good thing. Our requirements have experienced a small change (reversing the traversal order) and our solution has responded with a small modification.
Now do breadth-first traversal. With the iterative approach, you just replace the stack with a queue. With the recursive approach, you have to make radical changes. You can make either approach look natural and elegant if you pick the right example.
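A minimal illustration, assuming trees are (value, children) tuples: the two iterative traversals share everything except which end of the deque they pop from.

```python
from collections import deque

# Trees as (value, children) tuples -- an illustrative representation.
def traverse(root, breadth_first=False):
    pending = deque([root])
    while pending:
        # Queue behaviour (popleft) gives BFS; stack behaviour (pop) gives
        # DFS. Note the DFS visits siblings right-to-left as written.
        value, children = pending.popleft() if breadth_first else pending.pop()
        yield value
        pending.extend(children)

tree = ("a", [("b", [("d", [])]), ("c", [])])
print(list(traverse(tree)))                      # ['a', 'c', 'b', 'd']
print(list(traverse(tree, breadth_first=True)))  # ['a', 'b', 'c', 'd']
```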
> Now do breadth-first traversal. With the iterative approach, you just replace the stack with a queue. With the recursive approach, you have to make radical changes.
The reason is that no programming language that is in widespread use has first-class support for co-recursion. In a (fictional) programming language that has this support, this is just a change from a recursive call to a co-recursive call.
  def visit_bf(g):
      n, children = g
      yield n
      if children:
          iterators = [visit_bf(c) for c in children]
          while iterators:
              try:
                  yield next(iterators[0])
              except StopIteration:
                  iterators.pop(0)
                  continue
              iterators = iterators[1:] + iterators[:1]
The difference between DFS and BFS is literally just the last line that rotates the list of child trees.
Python is a pretty mainstream language and even though the DFS case can be simplified by using `yield from` and BFS cannot, I consider that just to be syntactic sugar on top of this base implementation.
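For comparison, the `yield from` form of the DFS case (same assumed (node, children) tree shape):

```python
# DFS collapses to a one-line recursion with `yield from`; BFS has no
# equivalently terse form, which is the syntactic-sugar point above.
def visit_df(g):
    n, children = g
    yield n
    for c in children:
        yield from visit_df(c)

tree = (1, [(2, [(4, [])]), (3, [])])
print(list(visit_df(tree)))  # [1, 2, 4, 3]
```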
> It seems to be common knowledge that any recursive function can be transformed into an iterative function.
Huh. Where i work, the main problem is that everyone is hell-bent on transforming every iterative function into a recursive function. If i had a pound for every recursive function called "loop" in the codebase, i could retire.
My experience has gone the other way: lots of code with recursion, rewritten to be iterative. There really aren't that many use-cases in vanilla enterprise code that benefit from recursion when the entire cost is considered.
Crucially, SO's election system needs to be bootstrapped: users aren't eligible to vote until they have a history of participation. The level of participation is fairly trivial, but it provides enough signal to allow a reasonable detection (and elimination) of bot / sock puppet networks without resorting to crude measures like blacklists or "bot tests".
For new sites, this meant that the bulk of moderation was done by employees, followed by employee-appointed temporary moderators. This dramatically reduced abuse, but also reduced the explosion of new sub-communities that sites like Reddit thrived on.
It was pretty decent in the mid and late 00s. The community started turning toxic in the very early 10s and by about 2015 was quite poisonous. The saddest part is that the problem was known and spoken about frequently, but the response to that from staff and/or high-level mods was to just double down and dig in.
For sure, advanced difficult topics were never really their forte, although it was really common to get great book or blog recommendations via comments. For me, the golden combination was a good book on the language/framework/topic I was studying, supplemented with specific Q&A from Stack Overflow. I have extremely fond memories of learning C++ and Qt that way (although that Qt book was a little rough, but at least there was a Qt book. Nowadays every book just seems too outdated to be helpful).
VoltDB took this to an extreme - the way you interact with it is by sending it some code which does a mix of queries and logic, and it automatically retries the code as many times as necessary if there's a conflict. Because it all happens inside the DBMS, it's transparent and fast. I thought that was really clever.
I'm using the past tense here, but VoltDB is still going. I don't think it's as well-known as it deserves to be.
Interesting. How is that faster than just having the code running on the same machine as the DB? Guess it could be smarter about conflicts than random backoff.
Usually, you can. But occasionally you get mildly defective tools that require some directory to exist, even though it's empty. It's easier to add a gitkeep than fix them.
And i assume any large organisation running a monorepo has some vaguely equivalent tooling for making mass changes. Have any of them published about that?
You can write automated refactoring with clang tools if you need AST-level knowledge across your project (or monorepo).
I’m not sure if there’s other public examples leveraging this, but Chromium has this document [0] which has a few examples. And there’s also the clang-tidy docs [1].
This is a business that I suspect may not survive BABLR.
> Moderne's build plugins allow for LSTs to be serialized to disk. This makes the process of consuming and editing large quantities of them much more efficient. OpenRewrite's build plugins, on the other hand, store everything in memory and need to be reparsed every time there is a change.
So yeah I'm giving away open standards to everyone for free that do the thing they expect people to pay them for...
> The next-gen LR parser framework for creating elegant and efficient language tools
> BABLR is a new kind of thing that does not quite fit into any category of things that has existed before it. In purpose it is made to be an instrument of code literacy -- a unified toolchain for software developers that supports a new generation of richly visual interfaces for coding. In form BABLR is a collection of scripts and virtual machines written in plain Javascript that run in almost any modern web browser. BABLR is also a community and an ecosystem, including a small but rapidly growing collection of ready-to-use parsers for popular languages.
At first brush, everything about this sounds like overly ambitious vapourware. Is there a reason to think this is going to deliver? People involved, what's already shipped, etc?
I particularly loved this from their roadmap:
> Completed
> Shift operation
> Enables LR parsing of expressions like 2+2
Being able to parse 2 + 2 is definitely good!
And their thoughts on testing:
> How our project reaches production stability is a process that often surprises people. We don't write a lot of tests for example, and we often don't do much testing before we ship releases. Instead we test exhaustively after we ship releases, which is the only way we know of knowing for sure that the product we shipped does what we think it does. [...] We also don't (usually) practice TDD. If you look at the number of tests we have, it likely won't seem like it's anywhere near enough to keep a project of this size stable! The secret sauce here is that our key invariants aren't written in our test files, they're baked into the core of the implementation. Every time you use the code, you're essentially testing it. To gain confidence in our core, we simply try to use it to do a lot of real work.
Man, why did i not think of that, i could have got out of writing so many tests if i'd just baked the invariants into the core of the implementation!
In this case the tool is meant to parse programming languages, so once I write some parser grammars every valid code file in existence is a test case. Seen that way I have more test cases than I know what to do with.
We've come a ways from 2 + 2. This week my goal is to feed our own whole codebase through the JS parser, and I should be able to. I managed to parse a few hundred lines of real JS last week before running into Automatic Semicolon Insertion trouble that I needed to tinker with the core to fix.
While I get that our low profile smacks of vapor, we actually have working packages published: bablr and @bablr/cli. I'd consider them to be beta quality right now, having gone through many previous releases that I'd only consider alpha-quality, and even more releases before that.
It's not too hard to verify my central claim here, which is that we're giving away what they charge money for. Their serialization format is secret and proprietary. Ours, CSTML, is open: https://docs.bablr.org/guides/cstml. Their free product makes you re-parse the entire project with every code change you make. Ours is built with copy-on-write immutable data structures, so you can always build new things without losing old ones. Our way, you can compose fragments of trees together with new code into new trees like you're playing with Lego bricks.
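To illustrate the general idea (a toy sketch, not BABLR's actual data structures): with immutable nodes, "editing" a tree builds a new root that references the old tree's untouched subtrees instead of copying or re-parsing them.

```python
# Toy copy-on-write tree composition (illustrative only). Nodes are
# immutable tuples, so a modified tree can share the subtrees the edit
# didn't touch with the original.
def node(kind, *children):
    return (kind, children)

old_fn = node("function", node("params"), node("body", node("return")))
new_call = node("call", node("identifier"))

# A new function node whose body gains one statement; the params subtree
# is referenced, not duplicated.
params, body = old_fn[1]
new_fn = ("function", (params, ("body", body[1] + (new_call,))))

assert new_fn[1][0] is old_fn[1][0]  # params subtree shared, not copied
```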
> Nevertheless, detecting the holding of locks requires a careful and occasionally interprocedural analysis of the source code, and the other conditions, such as "in a completion handler", are not formally defined and require study of multiple files.
> Due to the complexity of the conditions governing the choice of new argument for usb_submit_urb, 71 of the 158 calls to this function were initially transformed incorrectly to use GFP_KERNEL instead of GFP_ATOMIC.
Okay, but how does Coccinelle help? Is it able to do this careful and not formally defined analysis? Or does it automate the undifferentiated heavy lifting and so make it easier for humans to do it?