Hacker News

Number 6 (DRY) and number 9 (use libraries) both lead to problem 16 (leaking abstractions). Personally, I follow a rule I read in a blog some time ago and don't try to de-duplicate code until I've copy and pasted it three times. That way I know what pattern is actually being repeated, not just what I think the pattern is.

As for number 9, I'll agree with this, so long as it's the standard library (which is slow to change and unlikely to break functionality when it does). But when you use 3rd party libraries, you are still responsible for that code in the long run. You'll have to keep it up to date, test that it's doing what you intend, and when it's eventually compromised or removed, you'll have to deal with that too.

See: Left Pad.
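For context, the unpublished left-pad npm package was only about a dozen lines of JavaScript. A rough Python equivalent (a sketch, not the original code) shows how little functionality the dependency actually provided:

```python
def left_pad(s, length, ch=" "):
    """Pad s on the left with ch until it is at least `length` characters."""
    s = str(s)
    while len(s) < length:
        s = ch + s
    return s

# The standard library already covers this case:
# str.rjust(length, ch) produces the same result.
```

Which is the point on both sides: trivial to own yourself, and painful when thousands of builds depend on someone else owning it.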



> Personally, I follow a rule I read in a blog some time ago and don't try to de-duplicate code until I've copy and pasted it three times

Known as "Three Strikes and You Refactor"[0]:

- The first time you do something, you just do it.

- The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway.

- The third time you do something similar, you refactor.

[0] http://wiki.c2.com/?ThreeStrikesAndYouRefactor
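A toy sketch of the rule (hypothetical names): the first two occurrences stay duplicated, and only the third makes the shared shape clear enough to extract safely.

```python
# First and second occurrences: just duplicate, wincing.
def report_orders(orders):
    lines = [f"{o['id']}: {o['total']}" for o in orders]
    return "\n".join(lines)

def report_users(users):
    lines = [f"{u['id']}: {u['name']}" for u in users]
    return "\n".join(lines)

# The third occurrence reveals the actual pattern -- "format each
# record's id plus one field" -- so the refactor is now well-informed.
def report(records, field):
    lines = [f"{r['id']}: {r[field]}" for r in records]
    return "\n".join(lines)
```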


The problem with this arises when you don't remember that you (or someone else) already used this same snippet twice. In large codebases with a lot of people working on them, this can expand to dozens of copies.


If you have dozens of instances of nearly identical code and not a single person knows of more than 2 instances of them, you've got much bigger problems.


This would be true if everyone on the team had been equally involved from the very start on a greenfield project and weren't all already fighting other forms of tech debt.

Sadly the real world isn't so accommodating.


So, instead of dozens of copies of a snippet (which can typically be identified with tooling), the repo ends up with half as many partial abstractions? I guess it could also end up with a Frankenstein's Monster of an abstraction with more arguments and conditionals than the original code.

Like walking through a doorway, descending into a function can make you lose the context surrounding that function, making it difficult to see what is actually common between the 2, 3, or 4 different invocations of that seemingly common code.
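A hypothetical sketch of that Frankenstein's Monster: three similar-but-different snippets merged into one function, where every difference between call sites becomes another flag the caller has to understand.

```python
# One "shared" function absorbing every call site's quirks as a parameter.
def render(items, uppercase=False, skip_empty=False, separator=", ",
           prefix="", numbered=False):
    out = []
    for i, item in enumerate(items):
        if skip_empty and not item:   # only caller B wanted this
            continue
        text = str(item).upper() if uppercase else str(item)  # only caller A
        if numbered:                   # only caller C
            text = f"{i + 1}. {text}"
        out.append(prefix + text)
    return separator.join(out)
```

Each branch exists for exactly one caller, so reading the function tells you less than reading the three original snippets would have.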


Typically, in the teams I've been on that did the best job of keeping the code clean, we'd handle this by just being good at code review: The person submitting the change might not remember any duplicates, but there's a good chance that one of the reviewers will, if that's one of the things they're watching for.

Alternatively, if you really want to be exacting about this, there are code analysis tools that will do it for you.


Couldn’t agree more about the negative impact of this rule on a big team / large old codebase. The “copy once” rule pushes back against abstractions rather than the real problem of wrong abstractions, too big abstractions, or too complex interfaces to abstractions.

We instinctively abstract by fitting n use cases into 1 abstraction, when in fact we should be inverting the dependency graph and writing or reusing n small abstractions, one per use case.
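One way to sketch that inversion (hypothetical names): rather than one function parameterized for every use case, each use case composes only the small helpers it actually depends on.

```python
# Small, single-purpose helpers with no knowledge of their callers.
def nonempty(items):
    return [i for i in items if i]

def numbered(items):
    return [f"{n}. {i}" for n, i in enumerate(items, 1)]

# Use case A only needs filtering; use case B only needs numbering.
# Neither pays for the other's requirements.
def render_tags(tags):
    return ", ".join(nonempty(tags))

def render_steps(steps):
    return "\n".join(numbered(steps))
```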


I think that's okay, because those large, multi-person codebases are precisely the ones which pay a huge cost for premature abstraction.

Too much copy-pasta is a great reason to refactor.


If your code base is this big and you're adding a new feature, you really should be doing an impact analysis prior to making any changes.


Or worse, you need to change some behavior and it only gets changed in one of those places...


That's why you have your code reviewed by other people in the team before it hits master.


I wouldn't automatically refactor on the 3rd time.

It's not enough for the three passages to happen to be identical at this moment in time. You've also got to be sure that, going forward, they will need to evolve identically and in lockstep.


I agree, nothing is a hard and fast rule. And sometimes, an attempt to generalize copied-but-customized code can result in much less readable code than just leaving the originals in place.


+1 on "premature generalization"

A similar issue I've seen a few times: people implement an API and its first consumer in two separate changes, and as a result create an over-generalized, harder-to-test API that tries to anticipate lots of use cases, instead of a much simpler API that only exposes (and tests) what's actually needed.
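A hypothetical before/after of that over-generalization: the speculative version carries knobs no caller asked for, while the version scoped to the one real consumer is trivial to test.

```python
# Speculatively general: every parameter anticipates a use case
# that does not exist yet, and each one needs tests and docs.
class Cache:
    def __init__(self, max_size=None, ttl=None, eviction="lru",
                 serializer=None, on_evict=None):
        ...

# What the single current caller actually needed:
class SimpleCache:
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value
```

The knobs can still be added later, once a second caller demonstrates which ones are real.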


Sandi Metz did a great talk on this that can be summed up in one really good quote:

> Code duplication is far cheaper than the wrong abstraction.

https://youtu.be/8bZh5LMaSmE


"But when you use 3rd party libraries, you are still responsible for that code in the long run."

Sure, but you very rarely have to do it alone, and you'll only have to do it for a very small fraction of libraries.



