See https://abseil.io/tips/ for some idea of the kinds of guidance these kinds of teams work to provide, at least at Google. I worked on the “C++ library team” at Google for a number of years.
These roles don’t really have standard titles in the industry, as far as I’m aware. At Google we were part of the larger language/library/toolchain infrastructure org.
Much of what we did was quasi-political … basically coaxing and convincing people to adopt best practices, after first deciding what those practices are. Half of the tips above were probably written by interested people from the engineering org at large and we provided the platform and helped them get it published.
Speaking to the original question, no, there were no teams just manually reading code and looking for mistakes. If buggy code could be detected in an automated way, then we’d do that and attempt to fix it everywhere. Otherwise we’d attempt to educate and get everyone to level up their code review skills.
> Half of the tips above were probably written by interested people from the engineering org at large and we provided the platform and helped them get it published.
Are you aware how those engineers established their recommendations? Did they maybe perform case studies? Or was it more just a distillation of lived experience type of deal?
Some popular streamers have dabbled in OCaml this year, sometimes calling it "the Go of functional programming", which probably set off a small wave of people tinkering with the language. OCaml has also gotten gradually better in recent years in terms of tooling, documentation, standard library, etc.
I think they were saying that Gleam was Go of functional programming? OCaml may be like Go compared to Haskell but IMHO Gleam really embraces simplicity and pragmatism.
I would say some other reasons OCaml is similar to Go are that the runtime is very simple, performance is on par, and the compilation times are very fast. It also markets itself as a GC'd systems language, similar to Go. I think a seasoned OCaml programmer would be able to guess the generated assembler code.
I suspect that Gleam is quite different in that regard.
In my experience learning a bit of OCaml after Rust, and then looking at Haskell, the three aren't all that different in terms of the basics of how ADTs are declared and used, especially for the simpler cases.
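For what it's worth, here's a toy example (mine, not from the article) of the kind of thing I mean: a simple shape ADT as a Rust enum, which maps almost one-to-one onto an OCaml or Haskell variant type.

    // A toy sum type; the equivalent OCaml/Haskell variant type has
    // nearly identical structure.
    enum Shape {
        Circle { radius: f64 },
        Rectangle { width: f64, height: f64 },
    }

    fn area(shape: &Shape) -> f64 {
        // Pattern matching over the constructors, much like `match ... with` in OCaml.
        match shape {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rectangle { width, height } => width * height,
        }
    }

    fn main() {
        let shapes = [
            Shape::Circle { radius: 1.0 },
            Shape::Rectangle { width: 2.0, height: 3.0 },
        ];
        for s in &shapes {
            println!("{}", area(s));
        }
    }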
Another way of phrasing my query: given that these are all basically ML-style constructs, why would the examples not be in ML? And I was assuming the answer to that is "the sorts of people reading these blogs in 2024 are more familiar with Rust".
I think a second reason might be that translating OCaml/Haskell concepts to Python has that academic connotation to it. Rust also (thanks to PyO3) has more affinity to Python than the ML languages. I guess it isn't a surprise that this post has Python, C++, and Rust, all "commonly" used for Python libraries.
> Bad example. Google docs doesn’t use CRDTs but uses OT instead. CRDTs may handle your scenario just fine depending on how they decide to handle this scenario.
The CRDT may pick one or the other replacement word, but who is to say that either choice is correct? Perhaps including both words is correct.
> Then there’s not even a merge conflict...
Agree, this is what CRDTs are all about.
> ...to really worry about.
I think it is important to make clear that CRDTs do not "solve" the merging problem, they merely make it possible to solve in a deterministic way across replicas.
Often, CRDTs do not capture higher level schema invariants, and so a "conflict free" CRDT merge can produce an invalid state for a particular application.
There is also the example above, where at the application level, one particular merge outcome may be preferred over another.
So, it isn't as simple as having nothing to worry about. When using CRDTs, often, there are some pretty subtle things that must be worried about. :-)
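To make the invariant point concrete, here's a toy sketch (my own, not taken from any particular CRDT library): two replicas of a grow-only set each add a billing address while offline. The merge is deterministic and conflict free, but the converged state breaks the application's "at most one billing address" rule.

    use std::collections::BTreeSet;

    // A toy grow-only set CRDT: merge is set union, so all replicas converge
    // deterministically no matter the order of merges.
    #[derive(Clone, Default)]
    struct GSet(BTreeSet<String>);

    impl GSet {
        fn add(&mut self, item: &str) {
            self.0.insert(item.to_string());
        }
        fn merge(&mut self, other: &GSet) {
            self.0.extend(other.0.iter().cloned());
        }
    }

    fn main() {
        let mut replica_a = GSet::default();
        let mut replica_b = GSet::default();

        // Application invariant (invisible to the CRDT): at most one billing address.
        replica_a.add("billing: 12 Main St");
        replica_b.add("billing: 98 Oak Ave");

        // The merge is conflict free and deterministic...
        replica_a.merge(&replica_b);

        // ...yet the converged state violates the application-level invariant.
        assert_eq!(replica_a.0.len(), 2);
        println!("merged billing addresses: {:?}", replica_a.0);
    }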
I don't agree that a missing "framework" is the whole of the problem. It just isn't that simple.
Sure, people need to use resiliency skills to cope with the stresses of life. Oftentimes this is an important part of what therapy for depressed people is trying to achieve.
But this isn't to say that there isn't a constellation of causes in recent decades and years that make the world particularly stressful, especially for young people. It also isn't to say that we should dismiss what is occurring in the world today as "the same old stuff" without acknowledging that it may actually have unique properties worth understanding. Off the top of my head: world population is at an all-time high, global warming is becoming increasingly understood, it is increasingly acknowledged that we can no longer simply extract unlimited resources from the earth to solve all problems, and the Internet has changed the way the world works in ways that seem to speed everything up: communication, changes within social groups, larger societal shifts, economic change, etc.
I must agree that it is not that simple. That would be highly unlikely.
But how does one measure the impact of recent changes, such as the rise of the internet? Did the invention of the crossbow, the invention of money, of language, of the wheel, not also impact our lives in dramatic ways?
World population has almost constantly been at an all-time high, because it is mostly increasing.
It sure may feel different this time, but if you read the Book of Revelation, or consider 14th century pandemics, our current situation looks like child's play to me.
> It's much more a matter of whether you want to do something small scale and fun, or whether you want to suck all the joy out of it by applying the same soul crushing constraints we already get paid to do in our day jobs. Bleh.
Amen. And further, what better prepares a programmer to assess the relative costs of implementing a thing vs using a library providing that thing than having attempted an implementation?
Learning by doing is a valid approach, and this can even be called fun.
Definitely interested in how you achieved another 2-10x over the btree approach. I wasn’t surprised that the btree was as effective as it was, but I’d be curious to know how you squeezed a bit more out of it.
The btree works great, and has barely changed. I made it faster with two tricks:
1. I made my own rope library (jumprope) using skip lists. Jumprope is about 2x faster than ropey on its own. And I have a wrapper around the skip list (called “JumpropeBuf” in code) which buffers a single incoming write before touching the skip list. This improves raw replay performance over ropey by 10-20x iirc.
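Very roughly, the buffering trick looks like this (a sketch of the idea only; hypothetical types, not jumprope's actual API or internals):

    // Sketch of write buffering in front of a rope. Typing produces long runs of
    // adjacent inserts, so coalesce them into one pending span and only touch the
    // underlying rope (a plain String here, standing in for the skip-list rope)
    // when an edit lands somewhere else.
    struct BufferedRope {
        rope: String,                     // stand-in for the real skip-list rope
        pending: Option<(usize, String)>, // one buffered insert: (position, text)
    }

    impl BufferedRope {
        fn new() -> Self {
            Self { rope: String::new(), pending: None }
        }

        fn insert(&mut self, pos: usize, text: &str) {
            if let Some((start, buf)) = &mut self.pending {
                // Extend the buffer if this edit lands right at its end.
                if pos == *start + buf.len() {
                    buf.push_str(text);
                    return;
                }
            }
            // Otherwise flush any old buffer into the rope and start a new one.
            if let Some((start, buf)) = self.pending.take() {
                self.rope.insert_str(start, &buf);
            }
            self.pending = Some((pos, text.to_string()));
        }

        fn flush(&mut self) -> &str {
            if let Some((start, buf)) = self.pending.take() {
                self.rope.insert_str(start, &buf);
            }
            &self.rope
        }
    }

    fn main() {
        let mut doc = BufferedRope::new();
        // Three keystrokes, but the underlying rope is only touched once.
        doc.insert(0, "h");
        doc.insert(1, "e");
        doc.insert(2, "y");
        assert_eq!(doc.flush(), "hey");
    }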
2. Text (“sequence”) CRDTs replicate a list / tree of fancy “crdt items” (items with origin left / origin right / etc). This special data structure needs to be available both to parse incoming edits and generate local edits.
Turns out that’s not the only way you can build systems like this. Diamond types now just stores the list of original edits. [(Edit X: insert “X” position 12, parent versions Y, Z), …]. Then we recompute just enough of the crdt structure on the fly when merging changes.
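To make the shape of that concrete, here's a rough sketch of one logged edit (field names are my guess at the shape, not diamond types' actual data model); note there is no origin-left / origin-right stored, since that CRDT structure gets rebuilt lazily when merging.

    // Rough sketch of an original-edit log entry (my naming, not diamond types'
    // real types). The CRDT metadata (origin left/right etc) isn't stored; just
    // enough of it is recomputed on the fly when merging concurrent changes.
    type AgentId = u32;
    type Version = (AgentId, u64); // (agent, sequence number)

    enum Op {
        Insert { pos: usize, content: String },
        Delete { pos: usize, len: usize },
    }

    struct OriginalEdit {
        version: Version,      // id of this edit
        parents: Vec<Version>, // versions this edit was made on top of
        op: Op,
    }

    fn main() {
        // "Edit X: insert "X" at position 12, parent versions Y, Z"
        let edit = OriginalEdit {
            version: (1, 42),
            parents: vec![(2, 40), (3, 17)],
            op: Op::Insert { pos: 12, content: "X".to_string() },
        };
        match &edit.op {
            Op::Insert { pos, content } => println!("insert {:?} at {}", content, pos),
            Op::Delete { pos, len } => println!("delete {} chars at {}", len, pos),
        }
        println!("version {:?}, parents {:?}", edit.version, edit.parents);
    }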
This has a bunch of benefits - it makes it possible to prune old changes, and it lowers memory usage (you can just stream writes to disk). The network and disk formats aren’t dependent on some weird crdt structure that might change next week. (Yjs? RGA? Fugue?). File size is also smaller.
And the best bit: linear traces don’t need the btree step at all. Linear traces go as fast as the rope. Which - as I said above, is really really fast. Even when there are some concurrent edits and the btree is created, any time the document state converges on all peers we can discard all the crdt items we generated so far and start again. Btrees are O(log n). This change essentially keeps resetting n, which gives a constant size performance improvement.
The downside is that the code to merge changes is more complex now. And it’s slower for super complex traces (think dozens of concurrent branches in git).
I’m writing a paper at the moment about the algorithm. Should be up in a month or two.
Those are good things to consider in review, but I maintain that the answer might be "no" to one or more of those questions and still be acceptable.
I'm old enough to have worked in the pre-code-review era. Things were fine. People still learned from each other, software could still be great or terrible, etc. It wasn't appreciably worse or better than things are today.
> An implicit question in several of the above is "will this set a good example for future contributions?"
Which in my experience can be an almost circular requirement. What do you consider a good example? As perfect as perfect can be? Rapid development? Extreme pragmatism?
The more experienced I get, the less I complain about in code review, especially when reviewing for a more junior dev, and especially for frequent committers. People can only get so much out of any single code review, and any single commit can only do so much damage.
Put another way, code review is also about a level of trust. Will the committer be around next week? Are they on the same team as me? If yes, give them some leeway to commit incremental work and make improvements later. Not all incremental work need occur pre-commit. Mention areas for improvement, sure, but don't go overboard as a gatekeeper.
Things are obviously going to be different when reviewing code from what amounts to a stranger on a mission critical piece of code, etc.
> Put another way, code review is also about a level of trust. Will the committer be around next week? Are they on the same team as me? If yes, give them some leeway to commit incremental work and make improvements later. Not all incremental work need occur pre-commit. Mention areas for improvement, sure, but don't go overboard as a gatekeeper.
I think this is very important, especially the part about incremental improvement. Too many see development as laying concrete, where it has to be perfect, rather than as an ongoing process.
And personally, the only thing I find PRs good for is ensuring jackasses aren't doing stupid shit. And by stupid shit here I mean things like using floats for currency (I caught that within the last year), things of that nature.
But my preference is to work with people I can trust and at that point I don't give a crap about a PR or a code review.
Given equivalent data stored in both JSON and BSON format, I would expect them both to compress down to blobs of roughly equivalent sizes. This is because both encode roughly the same amount of information, so the compression algorithm will tend to compress down to the same final result. I haven't run this as an experiment though... that would be fun.
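If anyone wants to run it, here's a rough sketch of the experiment (assumes the serde_json, bson and flate2 crates; the numbers will obviously depend on the shape of the data):

    // Encode the same document as JSON and BSON, gzip both, compare sizes.
    use flate2::{write::GzEncoder, Compression};
    use std::io::Write;

    fn gzipped_len(bytes: &[u8]) -> usize {
        let mut enc = GzEncoder::new(Vec::new(), Compression::default());
        enc.write_all(bytes).unwrap();
        enc.finish().unwrap().len()
    }

    fn main() {
        // Some arbitrary structured data to encode both ways.
        let doc = serde_json::json!({
            "users": (0..1000).map(|i| serde_json::json!({
                "id": i,
                "name": format!("user-{i}"),
                "active": i % 2 == 0
            })).collect::<Vec<_>>()
        });

        let json_bytes = serde_json::to_vec(&doc).unwrap();
        let bson_bytes = bson::to_vec(&doc).unwrap(); // top-level value must be a document

        println!("json: {} bytes raw, {} gzipped", json_bytes.len(), gzipped_len(&json_bytes));
        println!("bson: {} bytes raw, {} gzipped", bson_bytes.len(), gzipped_len(&bson_bytes));
    }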
> I am more excited by Emacs 30, mostly I like the ahead of time compilation of Lisp code.
Emacs has had an ahead of time compilation feature, "Native Compilation," since Emacs 28. Is this what you mean? Or is the Emacs 30 feature something different?