> For a large change, I don’t want to do a “diff review”, I want to do a proper code review of a codebase at a particular instant in time, paying specific attention to the recently changed areas,
obviously every team and ticket is different, but IMO unless the person doing the review is some sort of principal engineer mostly responsible for the code at large, this does not align with what I would personally consider a code review. In my book a general code review is a simple sanity check by a second pair of eyes, which can result in suggestions to use a different API or to use an API slightly differently. If the commits in the PR are properly ordered/squashed, it is relatively easy to review the incremental changes in isolation anyway.
I guess the complaint the author has is really about review types. I do not have the terminology ready at hand, but there is the run-of-the-mill sanity-check review and then there is the deep, architectural pre-merge feature review. The author complains about the latter, while most code reviews in web tools are of the former variety.
> In my book a general code review is a simple sanity check by a second pair of eyes, which can result in suggestions to use a different API or to use an API slightly differently.
This is an impoverished view of code review. Code review is a principal mechanism for reducing individual code ownership, for propagating conventions, and for skill transfer.
A good code review starts with a good PR: one that outlines what its goals were and how it achieved them.
First item on a good review then: does the code achieve what it set out to do per the outline?
Second: does it contain tests that validate the claimed functionality?
Third: does it respect the architectural conventions of the system so far?
Fourth: is the style in line with expectations?
Fifth, and finally: do you have any suggestions as to better API usage?
A code review that is nothing more than a sanity check is useless and could have been done by CI infrastructure. Code review is a human process and should take maximal advantage of the things only humans can do. Leave the rest to machines.
An implicit question in several of the above is "will this set a good example for future contributions?"
On the one hand, I really like constant deep feedback. I really like the consistency benefits of having another person say “that’s too much, I find that unreadable.”
On the other, I have now been at a lot of places where it was very hard to get my code reviewed, latencies of days and sometimes weeks if folks are in a particularly heinous crunchtime... And then when it does get reviewed, the stuff that I worked really hard on to get the right speed or to properly centralize cache invalidation... Suddenly someone is like “I would have done it from this other approach” and you have no idea whether it's tractable or not.
While I have never been at a place that did this, I have in my head the idea that the code should be an unfolding collective conversation, kind of like when folks are all collaborating on a shared Google Doc, I see that you are editing this section and I throw in a quick comment “don't forget to add XYZ” and then jump to a different part that I won't be stepping on their toes with. So the basic idea would be to get everybody to merge to `main` like 2 or 3 times a day if possible. In that case code review really is just “make sure this doesn't break the build or break prod, everything is behind a feature toggle, if I don't like the design I will comment on the code near the design or sketch a proof of concept showing that my approach is superior in clarity or speed or whatever”... Nobody ever takes me up on this culture shift it seems.
In this kind of context, I ask people what log level they'd like their review at. If you just want to get the code out the door, by all means, "error" or "warn" might be the right review depth, when you're confident in your code and don't want to be derailed with philosophy.
If you're exploring a new concept and want all the ideas and brainstorming you can get in your feedback, "debug" log level is appropriate.
Once that idea has moved down the pipe, you may be down to "info" or "warn" depending on how much conversation has happened around the PR.
This is called Continuous Integration, and it's a shame that the term got nicked to now mean "that thing that builds our PRs somewhere".
The idea was that everyone pushes to main all the time, which basically reduces integration time to zero, as everyone is doing it every couple of minutes on big teams. After a while you learn how not to step on people's toes (introduce new classes incrementally, use docblocks documenting a class's intent rather than just its current functionality).
I too find that it's basically impossible to suggest switching to this workflow, given the weight of all our existing tools. They are so easy to set up, and most come for free (the Azure DevOps/Pipelines thing), that going off the beaten track is just unthinkable.
Everyone committing to the main branch all the time sounds like a nightmare to me, to be honest. I would probably seriously consider quitting if this was forced on me.
(Or rather, I would probably just run on my private git copy, and only pull every once in a while, and ignore that the main branch always changes.)
When / how do you do code review in your suggested workflow?
To be fair, I have never really heard of pushing to master as a CI technique for git. You would push to your own publicly visible branch, and some script would automatically attempt to merge it into a CI branch unless there are merge conflicts, in which case you get notified. Then as you add things you get notified if the whole thing is breaking and are made aware of upcoming issues.
Yes, I would not advocate for that either. I like having PRs be a more complete thought, ideally, though sometimes you're left with 3 PRs for "one change" if you need e.g. a database change, a migration for the data, and changing the code to use the new column, which is frustrating but doable.
I'd like my PRs to tell a very sanitised story of how I could have come up with the change, with the power of foresight.
Basically, first you write your code however you see fit. Then you use git to rewrite history to make the reviewers life easy, and then you give it to the reviewer.
The reviewer doesn't need to know that my original version had a bug that I fixed later. I can make it look like I came up with a bug free version from the start.
I used to do that but these days I prefer the whole story be in there so I can reference it later if I run into the same bug in a different place.
In general I find my past efforts to maintain a clean git history were probably not that useful, as long as you're not running e.g. 4 or 5 branches in parallel with crossing history. Branch off, make change, merge back main, PR, is fine.
> would the tool also change master to the new merge commit
Not in my experience, no.
The goal is just to keep you aware of what problems you're likely to run into if you tried to push for your change to be merged in at the present moment. If these problems are due to work another team is doing the idea is then that you might begin conversations with that other team to discuss how best to sort these things out.
One of the easiest tools in this regard is just a “test” that blocks merging if the request is n commits behind master, just enforcing a rebase whenever you update a pull request. Combine that with forbidding fast-forward merges for pull requests (just let the code review tool generate a merge commit) and your `git log --graph` becomes much more bounded in terms of how bad it can get.
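A minimal sketch of such a gating check, assuming a CI job where `git` is available and the base branch is named `origin/main` (both assumptions; adjust to your setup):

```python
import subprocess

def commits_behind(base: str = "origin/main") -> int:
    """Count commits on the base branch that HEAD does not have yet."""
    out = subprocess.run(
        ["git", "rev-list", "--count", f"HEAD..{base}"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())

def blocks_merge(behind: int, limit: int = 0) -> bool:
    """Policy: fail the check once we trail the base by more than `limit` commits."""
    return behind > limit

# In CI this would be roughly:
#   sys.exit(1 if blocks_merge(commits_behind()) else 0)
```

With `limit=0` this forces a rebase on every update, which is the strictest variant of the rule described above; a small positive limit makes it less chatty on busy repos.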
My team uses this pattern and generally speaking it looks something like this:
* everybody is good at breaking down code into changes small enough to review quickly but not so small as to become tedious and trivial
* people work on CRs whenever they have time. They generally do not post them after people have started going offline, and wait until the next morning as a courtesy, since there is no difference between publishing for review at 5pm vs 9am the next day
One side effect of this workflow is that because the pieces are more manageable, less uninterrupted “flow” time is required to complete a bunch of small things than to make one really big change. And others digest the smaller changes more easily, so knowledge spreads more effectively. And with time it's easy to say “don't make people review outside working hours.”
They are not new challenges, they are the same old challenge that led to version control procedures and systems in the first place: your incomplete or broken code is interfering with my attempts to complete or fix mine!
Whether the broken merge happens in a branch or as part of a rebase is just moving things around. The trick is to enforce good code cleanliness when pushing, whenever that may be.
My team does stuff on mainline and requires a new review on rebase.
So since I was the one who brought it up there's like three parts to this in my mind.
Main caveat is that “prod” is a handful of central places.
So how does code review fit in?
- The first thing is that your main branch needs to be able to tolerate merging incomplete code without breakage. This would probably be resolved with feature toggles, ideally named to match issue/ticket numbers, so that you have to clean it up before closing the ticket. Automatic tests can catch syntax issues but code review would check that your code was indeed either removing a feature toggle (in which case the team should have agreed that the code is mature and turned it on in production) or properly isolated behind a toggle.
- The second thing is that the team needs to be working as a team. This means that the sprint goals are co-owned by multiple people, ideally the whole team takes responsibility to deliver. It never made much sense to me that we segmented out features to individual developers, as if “oh she is out sick this week, well she wasn't working on anything urgent, the work can wait for her,” I understand that this makes performance review slightly easier but I am not convinced by that argument... So what this means for development is that the “war room” that you see for a typical prod outage with a bunch of people chasing down different leads, I want to bring that mentality (without the concomitant stress!) to feature development. So this means that ideally the whole team knows what everyone is working on and code reviews are just part of the organic process of working on a feature together. «Hey you are calling this “accept_states” and I called it “success_states,” I really don't like the fact that you're using “accept” as an imperative verb here, but I could go with “states_to_accept” or “acceptable_states” or we could use “success_states,” which would you rather?» ... You read through each other's code because it legitimately impacts what you are doing elsewhere in the codebase.
- The final ingredient would be release staging. The way a lot of people do branching makes it really easy to forget commits, or to roll something back and never revert the rollback, et cetera. Because we merge everything into main, we get monotonic flow: a commit that is still relevant is merged into main first, because everything is, and then it gets cherry-picked into the appropriate release branches. (The only exceptions are bugfixes for code that is no longer part of the latest version.) The exact tooling to make this doable seems tractable, but perhaps not easy. Per point 1, before the feature toggle is permanently flipped on, all the code related to it is marked as such and can be merged into a patch release without actually triggering a semver violation (just leave the toggle turned off!)... So in theory the actual feature release is just a single commit. I’d have to see how the rest of this works before I would know exactly how this plays out.
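A minimal sketch of the ticket-named toggles from the first point (the registry shape and the `TICKET_*` names are illustrative assumptions, not a specific library):

```python
# Toggles keyed by ticket number, so a stale toggle is easy to trace
# back to the issue that introduced it and must be cleaned up before
# that ticket can be closed.
FEATURE_TOGGLES = {
    "TICKET_1234_new_checkout": False,  # merged incrementally, still off in prod
    "TICKET_1201_fast_search": True,    # team agreed it is mature; next step: remove it
}

def is_enabled(toggle: str) -> bool:
    # Unknown toggles default to off, so incomplete code can never run by accident.
    return FEATURE_TOGGLES.get(toggle, False)

def new_checkout(cart):
    return ("new", cart)       # hypothetical new path, safe to merge half-done

def legacy_checkout(cart):
    return ("legacy", cart)    # existing behavior stays the default

def checkout(cart):
    if is_enabled("TICKET_1234_new_checkout"):
        return new_checkout(cart)
    return legacy_checkout(cart)
```

The review question then becomes mechanical: every code path a PR adds is either behind a toggle like this, or the PR deletes a toggle the team has agreed to flip on.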
Note that just about every Git-based CI/CD platform does CI/CD incorrectly: things like ACLs and CI/CD config are central statements about who has access to what privileges and should not be branched. A Jenkins which pulls its config only from the `ci` or `main` branch will usually save you a lot of headache.
"I would have done it from this other approach". I've seen that, and it's not good when you get the feeling of "code review is when someone who hasn't thought about your problem tells you how you should have solved it". People sometimes feel they have to add value as a reviewer, and casually discarding other people's work is the way to do it. Fortunately it's not something I have to deal with at my current job.
Another way to frame it is "code review is not the right place to validate design decisions".
This type of comment is the hardest to make for me as I know it means redoing a large part of the work from scratch. It would generally be avoided by having quick design reviews before starting to code something difficult or involving sweeping changes. Just highlight how you are going to do it in a few sentences.
On the other hand, letting these kinds of things go through usually means you will have to deal with the outcome later, and it will be more painful.
Maybe one way to decide whether to make this call is: do you find the submitted version bad enough that you are ready to implement the alternative yourself, immediately?
And it happens. With junior devs, or because people are new to the codebase.
> Another way to frame it is "code review is not the right place to validate design decisions".
That's a good way to put it. I wonder if draft PRs/MRs could be used for that? Make a draft PR/MR with the design, do a design review, then implement the code and finally the code review.
I’ve made comments that start like that and it’s usually down to
1) obvious code smell, here’s an example using your existing code refactored and the reasons why it is a better fit here
2) you’ve done something I didn’t think of and it’s clearly better than the way I was thinking of it. Here’s why it’s better. Kudos!
Helps that I’m lead/principal in a small biz with like 4 people max writing code. So I kind of know what everyone is working on / what they’re touching with their changes.
Mileage will definitely vary in bigger teams / businesses. 100% helps that my performance isn’t tied to number of PRs reviewed etc.
> While I have never been at a place that did this, I have in my head the idea that the code should be an unfolding collective conversation, kind of like when folks are all collaborating on a shared Google Doc, I see that you are editing this section and I throw in a quick comment “don't forget to add XYZ” and then jump to a different part that I won't be stepping on their toes with.
You just discovered pair programming.
It works astonishingly well. The second pair of eyes not only catches errors and envisions expanded use cases, it also prevents you from shirking off to HN. The biggest problem is you need two people, preferably sitting right next to each other with just one person “driving”.
With the right pair it is unbelievable how much productivity can be unleashed. I concur, it's much easier to stay focused on the task at hand with someone actively working on the same thing.
The reason why this didn't catch on is that it's almost like a Mick and Keith sort of relationship. You can't just take any two musicians and throw them together and get the Rolling Stones and the same thing applies to pair programming.
Take a look at Bell Labs during the birth of UNIX for an entire SWARM of interdependent engineers. There's more than just a small element of good fortune! It seems those people were almost meant for one another.
> You can't just take any two musicians and throw them together and get the Rolling Stones and the same thing applies to pair programming.
I don't see that effective teamwork should necessarily depend on compatible personalities. I think the better comparison would be to surgery teams, or pilot crews, rather than art.
It works well if the tasks are short and well-described. Long, complicated tasks descend into odysseys where one person zones out or gets completely lost as the other person just ends up treating them as a clumsy proxy for the IDE. Furthermore, it rarely works online.
That is my experience as well. Another case where the "clumsy proxy" issue happens is when I help a frontend dev debug something with the backend, especially online. Do collaborative IDEs exist? How usable are they? A collaborative terminal might be useful too (for when I want to run a few commands quickly to check something on my colleague's computer). Thinking about what I just wrote, maybe I could ssh into their machine?
People doubtlessly do it. I sometimes do it when I tutor somebody on a server we both access via SSH. If you use such a server as a jump server, you can set up SSH tunnels to get all the way to another desktop.
I think it would be wonderful to install a remote IDE with workspace on a remote server. Or to VNC/RDP into one.
I'm a principal so I rarely write code now. I probably do 6 PRs a year, basically when it's directly related to my expertise and it makes more sense for me to do it. But when I do, I really try to guide the reviewer into my thought process. In your "centralize cache invalidation" example, I would have in the PR (and probably as comments in the code too) "I moved the cache invalidation to this function to centralize the logic; it avoids problems X, Y, and Z, which we've seen before".
I find that giving a good narrative gets the reviewer thinking like you or at least tells them why you chose one path and not another so that they don't waste time with a "why didn't you do it this way" question.
I keep trying to train my team to do that, but they're so focused on completing tickets that they don't want to take the extra time to explain their "whys" and thought processes.
> Suddenly someone is like “I would have done it from this other approach” and you have no idea whether it's tractable
I find myself on the other side of this, all the time.
Suddenly someone comes up with some work that they have done without involving anyone else. I know this codebase, and I know moving that cache invalidation will make the codebase harder to understand, and it also runs counter to the multi-month effort you had to move all cache handling to the shared library which every other codebase in the product uses.
I try to be very careful about it, "have you considered using this other approach instead?" usually means unless you do this there will be the need for an even bigger rewrite in the future, but nobody can really take comfort in that.
The answer is that post facto code review is really unsuited for collaborative development. If you start some work, you had better involve other people straight away. Bounce ideas off a partner or two who really know the codebase, so you both understand what should be done and why. Unless you agree on what your suggested change should achieve, there can be no useful code review!
When you are a very tiny team this mostly follows naturally, but it always gets lost when the team grows and is split into several, and when individual developers start to lose track of the codebase as a whole.
Those are good things to consider in review, but I maintain that the answer might be "no" to one or more of those questions and still be acceptable.
I'm old enough to have worked in the pre-code-review era. Things were fine. People still learned from each other, software could still be great or terrible, etc. It wasn't appreciably worse or better than things are today.
> An implicit question in several of the above is "will this set a good example for future contributions?"
Which in my experience can be an almost circular requirement. What do you consider a good example? As perfect as perfect can be? Rapid development? Extreme pragmatism?
The more experienced I get, the less I complain about in code review, especially when reviewing for a more junior dev, and especially for frequent committers. People can only get so much out of any single code review, and any single commit can only do so much damage.
Put another way, code review is also about a level of trust. Will the committer be around next week? Are they on the same team as me? If yes, give them some leeway to commit incremental work and make improvements later. Not all incremental work need occur pre-commit. Mention areas for improvement, sure, but don't go overboard as a gatekeeper.
Things are obviously going to be different when reviewing code from what amounts to a stranger on a mission critical piece of code, etc.
> Put another way, code review is also about a level of trust. Will the committer be around next week? Are they on the same team as me? If yes, give them some leeway to commit incremental work and make improvements later. Not all incremental work need occur pre-commit. Mention areas for improvement, sure, but don't go overboard as a gatekeeper.
I think this is very important, especially the part about incremental improvement. Too many see development as laying concrete, where it has to be perfect, rather than as an ongoing process.
And personally, the only thing I find PRs good for is ensuring jackasses aren't doing stupid shit. And by stupid shit here I mean things like using floats for currency (I caught that within the last year), things of that nature.
But my preference is to work with people I can trust and at that point I don't give a crap about a PR or a code review.
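The floats-for-currency bug mentioned above is easy to demonstrate; a minimal sketch using Python's standard-library `decimal` module (or you could store integer cents instead):

```python
from decimal import Decimal

# Binary floats cannot represent most decimal cent values exactly,
# so repeated arithmetic drifts: ten 10-cent charges miss a dollar.
total = sum(0.10 for _ in range(10))
print(total == 1.0)   # False on IEEE-754 doubles: total is 0.999...

# Decimal keeps exact decimal digits, which is why it (or integer
# cents) is the usual fix a reviewer would ask for.
exact = sum(Decimal("0.10") for _ in range(10))
print(exact == Decimal("1.00"))  # True
```

Note that `Decimal` must be constructed from a string (`Decimal("0.10")`); `Decimal(0.10)` would inherit the float's rounding error.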
As a counterpoint, individual code ownership can be a fantastic model too. It lets engineers specialize and takes advantage of social systems that form naturally anyways. I’ve not personally seen group ownership work well, and in practice, it’s still an individual who knows a given area the best.
I am now working in an organization that is set up to reduce code ownership, and they struggle to attract talent, although the pay is good and the work is fulfilling.
How do they reduce individual code ownership? Horizontal integration. Developers code, analysts design DB structures (at least nominally), project managers set up meetings. Different silos exist for CI/CD, cloud roles, and core teams.
There are vetting committees everywhere that have the last say on the libraries used and the nitty-gritty details of REST APIs and naming.
It's exhilarating to start projects, then see them inevitably degrade into corporate monstrosities.
You assume projects are fulfilling because they have "public utility" (whatever the fuck that means). That's your assumption, not objective fact.
I would not work on a boring-ass project with crippling management problems and think that's fulfilling. I assume you have recruiting problems because most people share that sentiment.
People like to work on interesting stuff without much office politics; that is "fulfilling" to them, even if the outcome is an app that would be "boring" to an outsider.
I feel the post you are replying to is really advocating for shared knowledge, which is entirely compatible with the positive aspects of ownership. Personally, I have seen nothing but trouble from people whose distorted concept of ownership opposes sharing knowledge of "their" stuff.
I am OK with code "ownership", but if someone other than me can just outright reject merge requests without explaining why, then I should not have to read this code or make changes. If you own it, you fix it. What needs fixing? Figure it out. Don't ask me. You are the owner.
Well put. I'd like to add that individual code ownership need not necessarily be lived as a loner sitting on that code. Being an owner means nurturing the code, keeping it clean, and also knowing when getting help is beneficial. Accepting your bias and asking for a look from a different perspective is not getting rid of or dispersing ownership, in my view; it is a central part of it.
Often this includes getting feedback on whether the change can cause problems outside the feature implemented, or in the future, like introducing dependencies that cost more in total than they help locally.
This is an ideal (if not utopian) vision of software review (though please let's handle style checking automatically already!) It implicitly requires, however, such a thorough examination of the code that diffs would seem to be an irrelevant distraction.
In practice, however, starting from the changes in a system that was previously working well enough is a very effective way of focusing limited human attention on where the problems are likely to be, optimizing for error detection with limited resources over the broader knowledge-dissemination goals espoused in this approach. If we are going to use diffs, then this brings us back around to the topic of the article.
Personally, I find having any form of diff embedded in what I am trying to understand just makes it harder to follow, so I move the diffs onto a secondary screen and use it as a guide and reference to what has been changed. The author of the article seems to want the same, but the mock-up is somewhat misleading, as it only has substitutions and additions. Where something is removed or refactored to another place, the two views will no longer line up as depicted here.
Code style checks should be automated. I do not accept discussions on style in PRs in my team, besides pointing out obvious deviations that should be automated. Discussion of style should happen way before the PR is made.
Architectural changes should be discussed by developers way before the PR, on Slack or in a call. Most features should not change the architecture, and the team should make an effort to align on architecture all the time, at least before someone makes a PR. Unless, of course, the PR is a PoC to showcase an approach.
In response to your first point, I guess things really depend on where you store PR metadata and how ephemeral/permanent it is. Some teams store that information in the ticket, some fill implementation notes with change requests, some add it to the PR, some discuss it in their standup (or similar) meeting.
Regarding 2-3, you are right; I just lumped them all under an umbrella term.
Maybe we could use terms like "PR review" and "code review". The former is a shallow LGTM check, while the latter may involve code checkout, poking around, architectural discussions, pair/team programming and so on. In my book they are entirely different beasts and web tools are geared (not without justification) towards the former, where both types of diff should serve the purpose.
To me this happens naturally: looking at the diff first, if I know the codebase well, might be enough to get a sense of the changes, or at least understand their isolation. However, as soon as I start to feel that I don't fully understand the implications, I quickly check out the branch and poke around in an editor.
The OP is right though: if the "check out in editor" workflow were much smoother (than the quick web view), I would prefer to always do that.
This is why you should insist on changes being reviewed by at least two pairs of eyes. Someone responsible for that area of the code might raise those kinds of objections. Someone who is more of an outsider can focus on other problems.
E.g., I'm kind of a human compiler; I can say things like "you have undefined behavior here" in a completely unfamiliar codebase, where the maintainer might miss the issue, too busy pointing out that it doesn't respect architectural conventions and is not tested.
In my book, superficial code review produces only superficial results. This is how so many bugs get shipped despite the ubiquity of code review.
Call me crazy, but I've always approached code review from the standpoint of a tester: I build and run the code, see whether it does what it's supposed to do, look for possible weak points, and try to break it.
The diffs are the beginning rather than the end. They show you where to start poking around in the code. The proof is in the pudding, though.
You might respond, "That's what testers are for, not engineers." But who can test better than someone who understands the code and knows exactly where to look and what to look for?
I know this attitude puts me in the minority. The majority of engineers seem to have an inborn horror at the prospect of, gasp, manually running code. (They're also strangely averse to using debuggers.) They'll do anything in the world and write vast infrastructure to avoid it. Automate all the things! But in my view, automation is just more possibly buggy code. Who tests the tests? Quis custodiet ipsos custodes?
Automated tests aren't so much to make sure that the test you just wrote works. You are right that manual testing can do that better sometimes.
Automated tests are so that the code you just implemented still works months from now, when lots of other people have made unrelated changes and weren't always aware of the interactions with other parts of the code.
The automation is important, so that the tests get run, even when people forget or are under time pressure.
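As a sketch of that idea, here is a regression test that outlives the memory of the bug it guards against (the `parse_price` function and its edge case are invented for illustration):

```python
# A tiny unit under test: parsing a price string into integer cents.
def parse_price(text: str) -> int:
    digits = text.replace("$", "").replace(",", "")
    dollars, _, cents = digits.partition(".")
    return int(dollars) * 100 + int(cents or 0)

# The second assertion encodes a bug someone already fixed once
# ("$10" has no decimal point and used to crash); the test keeps
# running on every change, long after everyone has forgotten the
# original incident.
def test_parse_price():
    assert parse_price("$1,234.56") == 123456
    assert parse_price("$10") == 1000

test_parse_price()
```

The value is not that the test is clever; it is that CI runs it unconditionally, even when the next person editing `parse_price` is in a hurry.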
To be clear, I'm not against automation. What we do for a living is write code, so when I say "automation is just more possibly buggy code", I'm not opposed to writing more code (though I am opposed to overengineering). My point is that I don't place all of my faith in automated tests, and I'm not opposed to rolling up my sleeves, getting my hands dirty, and doing manual labor. Manual labor can be tedious, though, and that's probably why a lot of engineers hate it.
The problem starts when people do neither: only superficial test suites, never running a few test cases locally, and only reaching for the debugger when they need help from other people, who have neither the time nor the patience to just stare at the code and argue about things without actually trying to get answers (for example, by provoking the problem and turning on the freaking debugger).
I fear half of my team can't run the application locally. To be honest, there are lots of moving parts in the codebase, but it's not only that.
Apropos, "automation is just more possibly buggy code" is also a good argument for static typing. Good static typing can express (some of) your business logic, so that (some) invalid states are forbidden by the types.
The result is that quite a few invariants that need to be checked via tests in eg Python can be expressed in the type system in eg Haskell. The type checker already exists and is implemented as fairly battle hardened code, so this way you reduce the amount of new, possibly buggy code.
An example: in Go the convention is that when a function can go wrong, it returns a pair of values, and exactly one of them is supposed to be nil. But the compiler doesn't help you enforce that convention at all, so you need testing (and perhaps careful reasoning).
By contrast, Rust's Result type enlists the compiler to make sure that you return exactly either an error or a business-as-usual value.
I agree about what to expect from a code review - but for me, certain ways of working need to be in place for this to function properly.
A code review is the "last line of defense" (well, disregarding CI), but in the teams I've worked in, that meant that the general idea of what the PR is introducing has been discussed by multiple people at that point. (This could be either a pairing session, an in-depth explanation, or just a coffee chat, depending on the complexity) This way the PR isn't about reviewing the general strategy (splitting off a new module, introducing a huge dependency, reworking the API surface), but just about reviewing the tactics used to implement that strategy.
Without the communication beforehand, doing only sanity checks on parts of the code in isolation does run the risk of fracturing code ownership. ("What's that module doing?" "Beats me, ask bob")
That all notwithstanding, I like the author's idea of what a diff view should look like, regardless of the "mode" of reviewing.
I think you have some good truths in your post, but I also think they're rendered somewhat irrelevant by the fact that a "diff review" is, in my repeated and frequent experience, simply incomprehensible. I can't do any review with an incomprehensible mass of red and green lines interspersed with each other. It's not a useful view of the code, or at least not a useful primary view.
I push through and do it anyhow, because this is one of those "they're not paying me to have an endless party" sort of things. But I completely 100% agree with the author that it is not a very useful view.
Most of modern technology company culture has strayed too far from formal processes for managing development.
Without a basic process for defects, features, etc., every engineer has to invent the process for how they will ship their feature through collaboration.
In most cases this ends up amounting to little or nothing, with code review being the only collaboration on the topic that happens in any meaningful detail. So we're left trying to stuff all possible interventions into a single moment.
> unless the person doing the review is some sort of principal engineer mostly responsible for the code at large [..] In my book a general code review is simple sanity check by a second pair of eyes
I don't necessarily disagree with you, but I don't agree either.
A typical workflow should be:
1. Engineer writes code
2. Engineer does manual and automated testing to verify things work correctly
3. Engineer commits, pushes, and creates a PR/MR
4. CI runs test suite
5. Another Engineer reviews PR/MR
6. Approval causes merge which causes deployment to staging
7. QA
8. UAT
9. Repeat steps 1-8 until everybody is happy
In this, we need to place checks on #1 and #2.
In my eyes, QA and UAT are superficial reviews/sanity checks on #2.
#4 is a check on #1. However, it relies on A) previous engineers having implemented good tests and B) the current engineer doing the same. This means trusting that people are doing the right things, but a review implies you don't completely trust that.
#5 is a check on #3 and #1. They're essential. Having a full understanding of what the code looks like is more important in my mind than what was actually changed. The changes don't show how other code is interacting with those changes (which should be, but isn't always, captured by unit tests).
That's why I think the article author's preferred view is ideal. In fact, that's roughly what I work with when reviewing without realizing it (diff in GitLab/GitHub/Sublime Merge, code in JetBrains where most of the review happens).
For most changes I want more context than -c provides: I can see that your change is syntactically correct, but semantically, was your change from < to <= appropriate? I need to see the definitions of the things you're comparing.
Often -c is enough, but not always. I don't believe you can make a hard and fast rule. I think all you can really do is have a diff that allows you to 'jump to definition' or 'list callers', just like you can when looking at the code itself.
A third (fourth?) option worth mentioning here is difftastic[0], which uses "structural" diffing (as opposed to line diffing) for more granular diff highlighting.
A fourth (fifth?) option worth mentioning is patdiff: https://opensource.janestreet.com/patdiff/ From what I remember, it sometimes (35%) made diffs easier to read, usually (60%) made no difference, and rarely (5%) made them harder to read. I used it a few years ago though, so I don't remember specifically what the problem was. The only reason I stopped using it was because I started using magit for git diffs.
I see that patdiff also provides word-level diffing. You could instead get that feature with `git diff --word-diff` or Delta (https://github.com/dandavison/delta).
This is a great approach indeed; a pity it's not integrated widely enough (I think it only recently got a proper structured JSON output that other tools could use).
Git doesn't store diffs at a logical level. Git operates on snapshots of trees. A commit is not "a collection of changes"; it's a snapshot of a tree with a pointer to its predecessor.
Another layer (which can be git itself, but can also be any other tool; adding a custom diff tool to git is very easy) then uses those snapshots to generate diffs.
There is nothing stopping anyone from adding contextual diffs on top of Git. Just ask it for the content of both commits and feed that to the algorithm.
Yes, underneath git may store data as deltas (inside packfiles), but those are only vaguely related to the logical structure of commits.
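A small demonstration of this, using a throwaway repository (file names and contents are invented): each commit stores a full snapshot, and the diff is derived on demand from any two of them.

```shell
set -e
cd "$(mktemp -d)"
git init -q
echo "one" > file.txt
git add file.txt
git -c user.email=a@b -c user.name=demo commit -qm first
echo "two" > file.txt
git add file.txt
git -c user.email=a@b -c user.name=demo commit -qm second

# Each commit holds the complete file content, not a delta:
git show HEAD~1:file.txt   # prints "one"
git show HEAD:file.txt     # prints "two"

# The diff is computed afterwards, from the two snapshots:
git diff HEAD~1 HEAD -- file.txt
```

Any external tool can do the same: extract both snapshots with `git show rev:path` and run whatever diff algorithm it likes over them.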
Git’s merge algorithm looks at three versions of the code: the two branch tips being merged, and their common ancestor. It might be better at resolving conflicts if it looked at some of the intermediate commits as well, but I don’t know of anything that does so.
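The file-level part of that three-way algorithm is exposed directly as `git merge-file`, which makes the three inputs explicit (the file contents below are invented for illustration):

```shell
# git merge-file runs the same three-way text merge git uses internally:
# each tip is compared against the common ancestor, and whichever side
# changed a region wins; overlapping changes become conflicts.
cd "$(mktemp -d)"
printf 'a\nb\nc\nd\ne\n' > base.txt     # common ancestor
printf 'A\nb\nc\nd\ne\n' > mine.txt     # our tip: changed line 1
printf 'a\nb\nc\nd\nE\n' > theirs.txt   # their tip: changed line 5
git merge-file --stdout mine.txt base.txt theirs.txt
```

With non-overlapping edits like these, both changes survive the merge without a conflict marker; intermediate commits between the ancestor and the tips never enter the picture.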
The merge part isn't pluggable like that, IIRC. It would be interesting if it were given a stable interface; then a "smart merge" tool could iterate over a few ways to merge the code while running tests to check which one produces the fewest (or no) errors.
* A little scripting around opening the PR, which basically performs a "vimdiff <(git show baseref:file) file"-style dance on the changes(see :h diff). Using vim's tabs is great for this as they're really only views, so you can hold individual buffers open in distinct states at the same time.
* Scroll locking still works as expected in the main view, but you can avoid it in a separate tab when needed.
* [c and ]c move between hunks from the set of changes as they exist in the PR not in the working directory.
* dp and dg allow you to mark hunks as "done" by pushing/pulling the hunk in to the read-only diff buffer so that they're now hidden from the highlighted changes in the live buffer.
* Changes you make in the live buffer are available to commit directly, or push as a comment.
* All your regular editor things work as expected in the current state of the tree; go to definition, build integration, popup docs, etc.
Working like this means you're viewing changes against the PR's base, but have a clean working directory. That, to me, feels like a significant improvement over matklad's solution of having the working directory be in an unclean state to view the changes.
The environment I work in makes this behaviour super nice as changes will often be added with a --fixup commit, and then the tooling mangles them back together with a git-interpret-trailers call to attribute the fixup commit's author to the original commit at merge time. It also pulls text comments out of the PR and attaches a Reviewed-by trailer where appropriate, or the +1 equivalent to tack an Acked-by trailer on.
Oops, poorly worded and using internal terminology. I'll try again ;)
We use some internal tooling to make things work, but the concepts are generic git and $forge.
Push a comment: I'll make a hand-wavy suggested edit, then a vim mapping basically performs a ":`<,`>w !curl". If you only used GitHub, then piping to something like "gh pr comment"¹ could perform a similar-ish role (albeit a little weaker).
Pull comments: We automate merges so that PR text (including replies) is available in the repo. For the general discussion they're attached as notes², and for code comments (including --fixup commits) they're processed to assign the correct attribution via trailers³ to the original commit at merge time. Most of the attribution functionality could be re-implemented with "git rebase --autosquash" and by providing a "git interpret-trailers" wrapper as $EDITOR.
A somewhat minor nitpick, the word huge begins with a consonant sound, not a vowel sound. It would be correct to write "a huge issue" not "an huge issue".
I'm probably missing something here: author says that the split diff doesn't work for him, but doesn't say why.
His ideal diff is pretty much the same as a split-diff but with redundant context removed (context is only on the left, not on both left and right). What utility does he get out of removing redundant context on the RHS of the pane?
> I need to run tests, use goto definition and other editor navigation features, apply local changes to check if some things could have been written differently, look at the wider context to notice things that should have been changed, and in general notice anything that might be not quite right with the codebase, irrespective of the historical path to the current state of the code.
The editors/IDEs I've used usually only offer rudimentary support in diffs (goto defn, but only in the same file; maybe auto-completion, but usually not for newly added items; no refactoring functionality)
> The editors/IDEs I've used usually only offer rudimentary support in diffs (goto defn, but only in the same file; maybe auto-completion, but usually not for newly added items; no refactoring functionality)
Sure, but (to me it seems that) that doesn't necessitate the display change he is advocating for.
A diff program that has nice code navigation (and/or other IDE features) would be great, but what does that have to do with whether it is displaying the common split-diff, or his take on the split-diff.
smerge-refine is one keybinding I always use to highlight changes in 3-way diffs between base/mine, base/theirs, mine/theirs. The only problem is knowing what mine and theirs mean sometimes.
I'm early in my emacs journey, but used VSCode for years. The Github extension is absolutely amazing for pull requests, since you have all your LSP and regex code navigation goodness. I only used it for a small percentage of reviews, but it made it vastly easier to grok the impact of every change.
LHS shows current code, without any modifications. By showing full context in one view (with code browsing capabilities!) and changes in the other, the author is able to browse the code and see what changes, if any, are applied in a particular context.
In my understanding the author wants to flip the diff around: instead of looking at changes themselves, the author wants to look at code and see if there are any associated changes.
The article is light on detail, but my guess would be that author wants to look at code and browse to implementation/callsite to check if appropriate changes are there.
> LHS shows current code, without any modifications. By showing full context in one view (with code browsing capabilities!) and changes in the other, the author is able to browse the code and see what changes, if any, are applied in a particular context.
That's the bit I don't understand - how does the "changes in the other" help if displayed only as the unified snippet[1]? To my mind, the magic sauce bit is the full code navigation, and there's no reason that the RHS pane has to be limited to a unified diff when it can simply show the whole file with highlighted lines, the way split diff views do on the RHS.
[1] Perhaps (and I'm only guessing here), that the RHS must show the entire unified diff for the entire changeset, and not just the unified diff for the current file. To me, that makes the proposal an upgrade from "either show unified diff, or split-diff, file-by-file".
I'm not the author, but I can empathise with them.
The diff highlighting can be surprisingly distracting and it can definitely help readability to turn it off. I certainly do turn it off from time to time in my diff viewer so I can see the code with fresh eyes, but I don't have the option to show the unified diff in a split pane.
If you hit `.` in GitHub, you'll get dropped into a full IDE inside the browser. I've found this to be invaluable for reviews, because it lets you see the changes within the context of the entire file, rather than just seeing snippets. I'm much more likely to catch subtle design issues that way.
ok this is really cool, thanks for the tip! I've been in this IDE before but only when making edits to files directly in Github. I did not know I could shortcut to it directly from a commit diff.
By the way, this shortcut also works in Gitlab (just tried it)
I miss some of the capabilities from p4merge such as the split diff having a margin inbetween showing which parts of the file correspond to the other file, whilst also showing the impact of the diff on both sides: https://www.perforce.com/manuals/p4merge/Content/P4Merge/dif...
these features never made it into the web-based diff tools that are widespread; I think the 3-way diff is a good way to show the result of a difficult merge
Which is a travesty because a side by side diff view with an out-of-line list of PR comments could be incredibly useful if it didn't involve spinning up a whole VSCode instance in the browser. In fact it's so useful that GH used to have a split view available without forcing people into their buzzword AI ML crypto blockchain cloud enabled dev environment nonsense.
Good idea, abysmal execution (which pretty much sums up all of GH these days I suppose).
GH still has a split view available. Above the diff there is a settings icon which shows a dropdown where you can change between unified and split diff.
Well, that's certainly more intuitive than the old split / unified button. Unfortunately that basically revives the old split diff view and doesn't provide the useful out-of-band comments. From a quick glance, the buzzword view has a lot of good UI ideas that should have made it into the normal views, as they have nothing to do with actually running in a buzzword environment.
FWIW I'm benchmarking this against a PR where github is showing ~43,000 lines changed so things that are manageable in smaller changesets don't always scale.
Edit: I should also add that while the split view does exist in the non-buzzword environment, and there's even a (ugh) combo box to allow you to navigate to the inline comments, the split view actually hides the comments so they're completely inaccessible. Using the combo box does… absolutely nothing.
Decent ideas, atrocious execution as is tradition.
I admit I don't understand some of the problems mentioned in the article.
E.g. that script that squashes the PR commits together - why would anyone need this, what is wrong with diffing the PR branch and the target branch, using any tool you want?
What forces you to look at the commit history?
I'm perfectly fine with the git CLI and IntelliJ for local review work, and the Gitlab web UI for the communicative part of a code review.
Only thing I agree with is, I (sometimes) hate the three-way unified diff in IntelliJ for merge/rebase conflicts.
Other times (harmless conflicts), I love it over using the CLI.
Personally, I found the example for the desired diff format confusing.
> For a large change, I don’t want to do a “diff review”, I want to do a proper code review of a codebase at a particular instant in time, paying specific attention to the recently changed areas, but mostly just doing general review, as if I am writing the code. I need to run tests, use goto definition and other editor navigation features, apply local changes to check if some things could have been written differently, look at the wider context to notice things that should have been changed, and in general notice anything that might be not quite right with the codebase, irrespective of the historical path to the current state of the code.
This is spot on. In fact, I think modern code review practices overemphasize the historical path to the code at the expense of the quality of the present code and its architecture.
"Minimizing diffs" is a feature of modern code and PR practices, whether it's explicit or tacitly something the developers do. Optimizing for minimizing diffs discourages the continuous refactoring that code bases require to stay solid, sound, and visibly correct.
I asked GPT and Llama 70B to write comments about the Rust language source code for me. They are both superb at this task. I haven't used any weaker LLMs with fewer billions of parameters in the same fashion, but I can imagine many of them, in the range of 7B to 13B, would perform well enough. I will test more LLMs soon though.
Maybe code review on the raw source code doesn't need to be confusing anymore. Code review on a description of the source code is better.
I tried to prompt the LLM to just summarize the code, and i didn't like the result. The description is indeed verbose and somewhat inefficient.
I tried "write some comments about the source code, in the style of codinghorror" and the results were fantastic.
I am very interested, if someone has found some other styles that work just as well.
Edit: All this to say that in the space of LLMs and source code, there is a start-up that will be a GitHub disruptor, and a new era of code will begin.
When working on a PR myself, I frequently avoid doing small, incremental commits because I find the subtle "these lines changed" annotations in IDEs extremely useful. It helps me find the locations in code that are relevant to my work.
I wish there was a way to configure e.g., IntelliJ to always show these markers relative to `main` instead of the last commit.
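On the command line, the closest equivalent I know of is a three-dot diff against the base branch (the throwaway repo and branch names below are a made-up example):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m base
git branch -m main
git checkout -qb feature
echo "new code" > feature.txt
git add feature.txt
git -c user.email=a@b -c user.name=demo commit -qm feature

# Everything this branch changed since it forked off main,
# i.e. relative to the merge-base rather than the last commit:
git diff main...feature --stat
```

That's what the IDE markers would ideally track: the cumulative change against the branch point, not against HEAD~1.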
Even though I'm a KDE fan, I usually use Meld for diffing, especially since I've set it up with a lot of custom filters for certain cases. However, when I want to compare 3 different files, kdiff3 is my go-to tool.
looks like the code review is happening too late. here the post’s author is also trying—hard—to reconstruct the mental states and models of the code author, especially in the large/significant update case. i argue that this workflow is a relic of the past, and should be replaced by a two-phase flow where in the beginning the essential ideas of the new changes are proposed, discussed, approved. then lines of code may be added or removed. the update is semantically grouped (perhaps via a commit) to correspond to key decisions made during phase one.
that way the problem of encountering new ideas through the darker medium of code and struggling to comprehend is solved, since now the ideas are presented and evaluated in a human language. the subsequent code review confirms that the implementation adheres to the gaveled proposal, and this can be easily done, even by a junior developer.
This might be a usage thing, but the author already seems to point out that these two options aren't sufficient for a large code review - and I've never seen these tools offer themselves as a solution to a large review.
They're a diff tool. The author's approach is creative but seems like it's the wrong tool for the job.
I do the same! I checkout the branch of the PR then do a mixed reset to origin/main. This way, the git status shows all the changes. It’s really neat indeed.
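For anyone who hasn't tried it, the trick looks like this in a throwaway repo (branch and file names invented; with a real PR you'd reset to origin/main rather than a local main):

```shell
set -e
cd "$(mktemp -d)"
git init -q
echo "old" > app.txt
git add app.txt
git -c user.email=a@b -c user.name=demo commit -qm base
git branch -m main
git checkout -qb pr-branch
echo "new" > app.txt
git -c user.email=a@b -c user.name=demo commit -qam pr-change

# Mixed reset: the worktree keeps the PR's content, but HEAD and the
# index are rewound to main, so the whole PR shows up as unstaged edits.
git reset -q --mixed main
git status --short           # app.txt shows as modified
git diff                     # the entire PR as one working-tree diff
```

From here every editor feature (goto definition, inline change markers, running tests) works against the PR's state of the tree.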
It's interesting to me that the industry still relies so heavily on diffs of code listings as a primary method for code reviews.
It is absolutely a good use of a human reviewer's time to build a mental model of the code's runtime behavior. But doing that manually, by reading each line and trying to predict what will happen when it's run, is massively inefficient, incomplete, and error-prone. Plus it's susceptible to the "LGTM, fine, just merge it" phenomenon when the PR is large.
Reviews of static code listings won't reveal how ORMs structure their DB queries at runtime, or reflect how dynamically injected/configured components will behave, or any other number of things that are only visible by watching the code execute.
We have commoditized linting, checking for CVEs in dependencies, and static analysis for certain classes of bugs. We should now use fast runtime analysis in the same flow to relieve the burden from human reviewers of having to do line-by-line "telepathy reviews" where they try to magically divine how something will run in production at scale. (Full disclosure, I work at a company doing exactly that - https://appmap.io - and one of our most popular features is our sequence diagram diff that shows runtime differences between a PR and the main branch).
In my experience, unified diff is good for small changes. Split diff like Meld is good for many changes in a long file. If you have many diffs in a long, long file, you should not have such a file in the first place.
For large PRs with many files the problem is not so big because they are the sum of many small changes, file by file. Maybe a team should aim at small PRs but sometimes having to change X into Y everywhere, with a large X or Y, is an inescapable fact of life.
> The only thing I don’t get is automatic synchronization between magit status buffer, and the file that’s currently open in the editor. That is, to view the current file and the diff on the side, I have to manually open the diff and scroll it to the point I am currently looking at.
Probably not a perfect solution but `scroll-all-mode` should be pretty close, at least within a single file at a time.
> I need to run tests, use goto definition and other editor navigation features, apply local changes to check if some things could have been written differently, look at the wider context to notice things that should have been changed, and in general notice anything that might be not quite right with the codebase
It sounds like the author really wants pair programming
There are certainly situations where, as a reviewer, 5 minutes of goto-definition on a PR branch, or a well-placed debug breakpoint while running a test, is all you need to write a more prescriptive comment than “I don’t understand how this data flows” - and thus avoid both a follow up meeting and the need to pair program in the first place.
Pairing absolutely has its place, but it’s not always the most optimal use of time.
In a code review, the code on the right is where the focus is: is that correct?
If that change is merged, the right side version is what the code will be; the left side becomes a historic artifact indicating what the code was.
I glance on the left to understand what is changing: are some aspects changing that are not intended, and such.
Situations arise where a diff is total garbage, because code has moved around while being changed, and whatnot. Sometimes unrelated code is diffed together. In the split diff you can still see the new code as it should be, but it's hard to track the changes.
In git, you can influence the diff algorithm to get a different diff, e.g. "git diff --diff-algorithm=minimal", documented as "spend extra time to make sure the smallest possible diff is produced.". This might be similar to GNU diff's --minimal option.
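For comparison, here's a throwaway repo where you can run the same change through git's four diff algorithms (contents invented; on a trivial change like this they usually agree, and the differences only show up on gnarlier edits):

```shell
set -e
cd "$(mktemp -d)"
git init -q
printf 'one\ntwo\nthree\n' > f.txt
git add f.txt
git -c user.email=a@b -c user.name=demo commit -qm v1
printf 'zero\none\ntwo\nthree\nfour\n' > f.txt
git -c user.email=a@b -c user.name=demo commit -qam v2

# myers is the default; minimal, patience and histogram can each
# produce cleaner hunks on diffs where the default looks garbled.
for algo in myers minimal patience histogram; do
  echo "== $algo =="
  git diff --diff-algorithm="$algo" HEAD~1 HEAD -- f.txt
done
```

The same option can be made permanent with `git config diff.algorithm histogram`.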
I’m surprised we’re all still using line-based diffs, and even seemingly stupid ones at that. But the tales I hear from across the pond about paid git alternatives that diff based on understanding the programming language suggest they're pretty bad as well, though for other reasons.
The way I look at it, I doubt there will be widespread use of structural diffing while code editing itself is based on plain text files consisting of lines and characters.
Structural diffing requires a structural understanding of the code. But the only tool capable of properly understanding the code is the compiler you use to compile it, and even then only if the code contains no mistakes.
Version control, and thus diffing, needs to be robust. Knowing how many times I have seen IntelliSense (or similar IDE features) fail for various reasons, I wouldn't want anything similar when doing a diff. You don't want it to fail just because you used a new language feature that isn't supported by the diff tool yet. I have seen failures even in syntax highlighters, which need only a very simplified understanding of the code.
I could see moving away from line-based diffs if we switched to structural code editing and source code were saved as some kind of AST instead of plain text, with editors enforcing that only valid source code (at least in a structural sense) can be saved.
There can be a middle ground, like what difftastic does with its fallback to text-based diffing, but I consider that a nice-to-have, optional functionality. Seeing how many developers have strong attachments to specific tools and their existing workflows, I am not surprised by the lack of adoption. Developers might not even be against those specific improvements, but adoption can easily be blocked by a lack of integration, a lack of some unrelated functionality in the software that does have the integration, or the setup feeling like more effort than the benefits of a slightly better diff. I have also heard plenty of times the argument that "if you need fancy tools for code to be readable/workable, then the problem is in your code, not the tools". While there is a grain of truth in it, that doesn't mean you can't use better tools while still writing code that could be understood without them.
Line-based diff is in good-enough territory. There are cases where structural diff can do a lot more, but for a good fraction of actual code edits, the output of a structural diff will be the same as a line-based diff plus a smarter diff viewer that highlights the parts of a line that were modified (instead of the whole line) and offers an option to hide whitespace changes.
It might be because of formatting, but that is exactly what I would have expected them to handle well.
Anyway, I have used a FOSS tool that does that, difftastic [1], and it does a pretty good job at language-aware diffing without the formatting annoyance I hypothesised earlier.
Why would you want to diff code and ignore formatting? If it's just whitespace changes, any decent diff program should have an option to ignore that, but if it's more substantial, I would want to see that in a PR.
If, for some odd reason, I wanted to compare two codebases that had diverged, and one had gotten some serious formatting changes (maybe the new person preferred a different coding style), I'd run them both through an automatic formatter program with the same options. For C code, "indent" is commonly available, for instance.
I haven't thought about this in detail much before but honestly that "view" the author proposed does look pretty nice.
Though I think that something like difftastic is still a bit better, though I haven't properly integrated it into my workflow.
I'm amazed that default diff tools that come with git CLI or even git UI clients are so freakin' the opposite of intelligent. Even github or Atlassian web interfaces. No matter unified or split.
Change some spacing and the whole line is marked. Move an if statement 10 lines up and you see complete blocks of changes. Not even speaking about changed order of function implementations.
This seems like a not-so-hard problem, solved a million times already, and I always fall back to using a local UI configured with a commercial differ/merger I once bought a license for. Can't do that for code change reviews in the browser though.
"everyone seems to be happy reviewing diffs, rather than the actual code?"
The split view does show the actual new code, just with extra empty lines... so sure, it is a bit distracting if a lot of lines have been modified, but it rarely comes to that.
My beef is more with some team's habit of doing too much work in a single pull request. Large diff are just a symptoms of that. I find that it is much more likely to introduce bugs and hard-to-find changes in behaviour in large changes. If you later find that something broke, finding what caused it in a large PR can be a real time waster.
In contemporary software "engineering" culture, this points at a much deeper issue than the scope of diffs that get presented in review tools.
Split diffs align well with modern "papier mache" development, where the goal of a task is to paste the smallest possible change onto whatever existing structure. Participants don't need to understand the whole structure and are expected not to alter it. Module refactoring is strongly discouraged and postponed until there's no other way to proceed on a critical feature. Designing (or refactoring) with an eye for future tasks is considered pointless because nobody understands which JIRA tickets are likely to survive and which will get purged or indefinitely backlogged.
When all you're supposed to be doing is overseeing Copilot as it drafts a new call to your upstream service and adds a perfunctory test that never fails, a split diff does a perfectly fine job of reviewing that work.
Whether this workflow represents durable, quality engineering or is just a way to LARP Katamari Damacy and get paid for it is another matter. As the enshittening continues year after year, it sure tends to look like the latter.
This essay is good food for thought, but it's just a peek into the dark forest that "move fast and break things" has been leading us into.
I do all my large code reviews in IntelliJ. Seems like it's an exact match of what this guy wants? Diff view + full view + fully featured IDE for all his navigation, verification and search needs
What the author wants (and I do too) is a normal, full fledged editor that highlights the code under review. Bonus points for easy commenting.
You can sort of do it with IntelliJ with git blame annotations but it’s clunky. I don’t really care about the author’s commits; I care about the entire set of changes or the changes since my last review.
You can also sort of do it with IntelliJ’s pull request mode but it’s also clunky since it’s not a “real” editor and you lose highlighting if you jump to the source code.
The author complains that he doesn't see all of the code, but in the first image it looks like the right-hand pane shows exactly that (though with diff highlighting). Am I missing something?
agree, I use side-by-side diff tools and the right side is where "the code" actually is. don't really get what issue the author has with this unless it's that the highlighting distracts them.
I feel like I used to use a tool called Beyond Compare that presented _many_ different views of a diff (and even supported N-way diffs) back around 2012.
It’s still around and has hardly changed one pixel since 2012!
Yet it still feels fresh and there is no competing product that can challenge them. Most other diff tools lack good merge capabilities and are subpar in multitude of other ways.
P4Merge (https://www.perforce.com/products/helix-core-apps/merge-diff...) is a similarly powerful (and similarly ugly) tool for plain text diffs and three-way merges. It’s free, and though the 500 MB download includes Perforce, you can install P4Merge by itself. Unlike Beyond Compare, though, P4Merge cannot highlight syntax in the compared files.
What the author wants is git blame with highlights.
IntelliJ does this very well. In the margin to your left you can see the age of each line. Recent changes - the ones you are reviewing - will be bright white, while older, proven code is more faded. All inside the already powerful editor you need to navigate bigger pieces of code.
Except that will not show you removed lines. Forget it, just use the diff feature.
I quite like your diff visualisation idea, thanks for sharing. Since I am currently working on a git client (gitbutler), I'm gonna experiment a little bit to see how it feels in practice
Your examples don't explain how both sides can possibly be aligned when the differences are large. If they don't align, then the two views are independent and it's hard to see the correspondence. If they do align, then... I'm not sure what the difference is between your suggestion and a split diff, except that the left side has fewer highlights.
So, as I understand it, you are advocating split diff with SUBTLER grey highlights in the current version of the file?
Neat. I also far prefer using in-editor staging & committing workflows, which are excellent by necessity; history tools by comparison are weak, unless I'm missing something. So this seems really useful.
I might reuse the pattern to step through tutorial-type developments directly in-editor for presentations and videos.
The only real solution to this is "avoid big diffs at all cost". This is why the industry as a whole has mostly dropped git flow and moved back to trunk based development.