Software Checklist (2014) (solipsys.co.uk)
92 points by ColinWright on Nov 24, 2018 | 29 comments


A lot of problems in software stem from earlier in the process than the testing and validation phase: they happen at the writing and design stage. So your checklist not only has to contain "check for buffer overflow by checking for a standardized pattern," it also has to have points like "make sure the code is maintainable" so the programmer won't introduce a stealthy memory allocation that never gets freed in the first place.

Of course, it's not so simple. How do you actually quantify maintainable software? Just for that one point, you could write an entire book. When you introduce other points, the process of writing quality software becomes exponentially more complex, with different aspects of well-structured designs overlapping, balancing with each other, and sometimes contradicting each other.

To bring it back to the original illustration, even if you have a flight checklist, it's not going to matter if your plane was poorly made in the garage of some beginner aviation hobbyist out of scrap metal. If your program is made of spaghetti code and poor abstractions, you're going to be working on that testing checklist for a long time.

I do agree that checklists are helpful and that more standardization would help, but we also need more quality training and certification, arguably more so. We need great aircraft designers, not just pilots.


I find the article a bit confused about the purpose of checklists. The checklists in the aviation industry are there for the technicians (including pilots), not the engineers. The article seems to advocate automated checklists for programmers (who really are engineers) and not end users or administrators (who are technicians).

I work on a 40-year-old mainframe product, and as a vendor we provide a lot of checklists for our users (system administrators). Everything from installation and configuration to maintenance and customization is covered. This approach is an exception on the mainframe, and it is virtually unknown in modern software applications. (We are not really mission-critical software, but we have a culture like that.)

Could some of our checklists be more automated? Yes. But I believe there is a strong idea in the software world today that you never, ever need a technician (system administrator), because everything is so user-friendly and automated and reliable. So you never need the checklist either. I think this is false and that there would be a benefit from a stronger culture of checklists (provided by the vendor for the end user) in software.


In a previous job I worked as QA supervisor for a medical robotics company. The job entailed maintaining a 1000-page spiral-bound testing manual which could be given to anyone from an intern to a senior programmer. A slightly cut-down version was shipped with every system, to be used on-site by the installation technician and left with the customer to be used before any bug report was allowed to be filed.

It was most definitely a checklist with the purpose of finding and verifying bugs.

But it was rarely used by the developers or engineers (that I know of) because of the implicit understanding that if something was in the checklist, it was a) an already-found bug that had been fixed, or b) if a regression occurred, it would be found by the army of QA testers.

So yea I agree, checklists are used in the wild but their usefulness to coders is limited. If indeed that was what the author was implying.


I have a checklist for running one of our regression tests. It's pretty robust now that four people (including myself) have gone through it a few times.

I've thought about automating the test, but ultimately decided against it. For one, part of the checklist is just making sure the tester has the proper logins to the proper servers [1]. Second, if anything breaks in the automated portion, you need to know how it works in order to fix it.

[1] Yes, multiple servers. And it takes five hours to run. We've gone to great lengths to simulate the production environment as closely as possible.


The meta-problem behind Heartbleed was a dysfunctional and under-funded open source project: OpenSSL. The Linux Foundation's Core Infrastructure Initiative was founded in response to Heartbleed and eventually created the Best Practices BadgeApp as a checklist for better and more secure open source projects: https://bestpractices.coreinfrastructure.org/en

(Disclosure: I co-founded CII and the BadgeApp.)


I agree! I encourage anyone who's involved in an open source software (OSS) project to get a CII Best Practices Badge at: https://bestpractices.coreinfrastructure.org/ ; if you are using OSS, then you should prefer projects with a badge.

If you want to see a video that explains the badge, see "An Introduction to the Core Infrastructure Initiative (CII) Best Practices Badge" at https://www.youtube.com/watch?v=JMptmhV06j8

New projects are participating every day (see https://bestpractices.coreinfrastructure.org/en/project_stat... ). The badging application is itself open source software; its project site is: https://github.com/coreinfrastructure/best-practices-badge/

(Disclosure: I'm technical lead of the BadgeApp project. Hi Dan!)


I'm honestly not that sure that checklists are useful for writing software. It's a creative process where you are making the rules, not enforcing them. It's more like designing the plane, rather than flying it.

Now, understanding how to write good software, especially good UI for human operators to work with, is a hugely underrated skill. Many accidents (air crashes, nuclear accidents, etc.) are caused by poor or confusing UI.

For operations, or devops, I think checklists are essential, and should be automated as scripts and dashboards. But always have the ability to do everything by hand if required, to cut out some step or modify something in a critical scenario (trust me, it always comes up, and when it does, you don't want to be shooting from the hip in a crisis).

I've specifically built Jenkins operations boards with a bunch of buttons to run scripts and do operations, and it works great. You also get a built-in log of who's doing what when, and what happened with it.

Also, just because I think it's interesting and related to checklists, look at the Spanair Flight 5022 disaster - https://www.youtube.com/watch?v=EruTu5O9LX8 . Basically, they missed a step in their checklist to make sure their flaps were at the proper setting for takeoff; being distracted and in an unusual situation, they didn't run through the checklist correctly. Anything with manual steps can fail, including checklists.

Preparing for these scenarios is, I think, actually the key here. Having the knowledge written down after you've designed a system to handle adverse conditions is just a no-brainer (although it seems like no one likes writing things down; I do!).


Fuzz testing is probably one of the most underused tactics, at least in my observations. It's easy to write a decently thorough fuzzer that will absolutely crush a wide class of bugs that you have.


Fuzzers are extremely underused. Even in large, mature projects, a fuzzer being part of the source tree is much more an exception than the rule. Once you've done it a few times, it takes only 5 minutes to write a basic fuzzer for a function that takes a binary blob or a string as an input, after which you can run it indefinitely. But chances are it will find a bug worthy of a fix within minutes.

Various "safe" languages are just as vulnerable to denial-of-service vulnerabilities as C/C++. Since I started fuzzing Go projects I've come across numerous parsers of untrusted data that flat-out crash if a slice is accessed out-of-bounds. Deep recursions (A() calls B() calls A() etc) can crash both Go and Rust. And in probably every general programming language you can implement logic that leads to excessive resource allocation or computation, and unintended infinite loops. These kinds of bugs can obviously be problematic for network-facing components. A royal use of runtime assertions in debug builds for arbitrary invariants/conditions combined with fuzzing usually uncovers many more bugs.


The big difference is that C and C++ usually don't crash, but rather silently corrupt memory, with all possible outcomes.

If it is a server process, it can even corrupt data for days until someone actually notices it, if ever.

Or they just crash in a totally unrelated memory segment a couple of hours later.

Crashing when the out-of-bounds takes place is a much better outcome.


Do you have an example of this? A tutorial or specific tool to get a taste of what’s involved?


Well, not anything I can easily share, but here's a simple-enough example. I was writing code to handle base-pair sequences (ACTG) in a very strange encoding, which naturally led to tons of edge cases. I carefully wrote my code for appending two sequences together; I was not totally sure I got it right, but decided after writing a few unit tests that it was probably correct. Sure enough, a couple of months later, I received bug reports that, after debugging for a while, I pinned down to sequences not being appended properly.

I stared at the code for some time and couldn't see any mishandled edge cases. It was then I realized that I could automatically generate an arbitrary number of unit tests, because it's simple enough to call strcat and encode that to get the correct result. So that's what I did: generate two strings of ACTG, encode them, concatenate, and compare the result against the encoding of the two plain strings concatenated. Sure enough, a mishandled edge case popped up after trying a couple hundred strings. There was probably no way I could have found that bug without this approach.

It's a really simple concept, but you need a couple of things: first, a very wide input space. Next, you need an alternate way of verifying the result. If you just want your program to not crash, you get this for free. In my case above, I could apply the two functions (concatenate/encode) in the other order to easily calculate the expected result. If you have these two things, a fuzzer is the logical thing to write to make sure your code is rock solid.
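
To make that concrete, here is a hedged sketch of the same idea in Go (the original was presumably C, given strcat); Encode and AppendEncoded are hypothetical stand-ins for the strange base-pair encoding and the hand-written append, both assumed to return strings, and plain string concatenation acts as the independent oracle:

    package seq

    import (
        "math/rand"
        "testing"
    )

    // randomSeq builds a random base-pair string of length n.
    func randomSeq(r *rand.Rand, n int) string {
        const bases = "ACTG"
        b := make([]byte, n)
        for i := range b {
            b[i] = bases[r.Intn(len(bases))]
        }
        return string(b)
    }

    func TestAppendEncodedAgainstOracle(t *testing.T) {
        r := rand.New(rand.NewSource(1))
        for i := 0; i < 10000; i++ {
            a := randomSeq(r, r.Intn(50))
            b := randomSeq(r, r.Intn(50))

            // Oracle: concatenate the plain strings, then encode the result.
            want := Encode(a + b)
            // Code under test: encode each string, then append in encoded form.
            got := AppendEncoded(Encode(a), Encode(b))

            if got != want {
                t.Fatalf("append mismatch for %q + %q", a, b)
            }
        }
    }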


Generic checklists for Pull Requests are great. It's easy to write a Tampermonkey script that lets you paste a prewritten checklist into a new blank PR description, reminding you to do linting, run tests locally, and so on.


https://help.github.com/articles/creating-a-pull-request-tem...

You don't need a Tampermonkey script; GitHub supports templates :)
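
For reference, the template is just a markdown file checked into the repository (for example .github/PULL_REQUEST_TEMPLATE.md); GitHub pre-fills every new PR description with its contents. A minimal, purely illustrative example:

    ## Checklist
    - [ ] Lint passes locally
    - [ ] Tests pass locally
    - [ ] Documentation updated where needed
    - [ ] Relevant issue linked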


I'll take this opportunity to recommend danger (https://github.com/danger/danger) and danger-js (a rewrite of danger where most development is now focused - https://github.com/danger/danger-js). They allow you to automate checking your PR checklist on multiple platforms and automatically send reminders to people in the form of comments, instead of you having to reply yourself every time.

It's different from other static analysis tools that integrate with pull requests for a few reasons:

- It opens up a programming API, not just a configuration and rule system, which grants you more flexibility

- It's tailored to send specific messages for errors so that you don't have to rely on the PR submitter to interpret errors and warnings

- It also allows you to look at things like PR details and issue details, so you can check things like if a PR has an issue attached and much more

- It's not restricted by language, so you can do things like check if someone updated a certain code file but did not update a certain markdown file and warn them that documentation may need to be updated

Overall it just really cuts down on the endless cycle of submitter makes change > wait for reviewer to be available > reviewer gives feedback > wait for submitter to come back > submitter makes change. I really want more people to use it so it grows.


Decent unit tests are a big part of the answer. So are other forms of testing, such as checking results via DOM in headless Chrome. Not all of it, of course. But these things essentially form a kind of automated checklist.
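
As a hedged sketch of the headless-Chrome style of check, here is what it might look like using the chromedp Go library (the URL and selector are placeholders, not anything from the comment above):

    package ui_test

    import (
        "context"
        "testing"

        "github.com/chromedp/chromedp"
    )

    func TestLandingPageHeading(t *testing.T) {
        ctx, cancel := chromedp.NewContext(context.Background())
        defer cancel()

        var heading string
        // Load the page under test and read the text of its <h1> element.
        err := chromedp.Run(ctx,
            chromedp.Navigate("https://staging.example.com/"),
            chromedp.Text("h1", &heading, chromedp.ByQuery),
        )
        if err != nil {
            t.Fatal(err)
        }
        if heading == "" {
            t.Error("expected a non-empty <h1> on the landing page")
        }
    }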


For routine tasks that are repeated with a narrow set of underlying actions, checklists are easily devised. For something with as many degrees of freedom as programming, it's just too difficult to come up with something generic. Something more doable is having a checklist for your particular project, so everyone follows similar conventions and can add to the list as flaws in the methodology are discovered - a lot of projects operate this way already, so I think we're doing the best we can!



Human error is what makes programming difficult. You can watch a computer do billions of calculations flawlessly, while a human cannot type a few hundred words on average before they make an error of some kind. Developing software is not amenable to simple checklists, and is a very creative process. What would be more helpful than checklists are improved languages that make it much harder to make a simple mistake.


I could see developing a checklist of things to look for during code review. It might be fairly tedious since you’d probably need to scrutinize each line.

Code review guidelines or style guides come close to being checklists. There are also various kinds of static analysis like CheckStyle, FindBugs, etc. that check for common mistakes.

Of course, statically typed languages have type checking built in that prevents mistakes. Plenty of languages already have type systems and/or object constructs that prevent the class of problems that led to Heartbleed (bounds-checked arrays, IIRC).


Testing alone will not preserve knowledge, just safety. Knowledge requires an explanation of what the tests are doing and how the tests detect it.


> He went on to explain that every time there was an accident, no matter what the cause, the final action was to review the checklists to see if that cause could be prevented.

Isn't this principle similar to TDD? For any bug, first, add a test that fails, then make it pass.


It's more like post-deployment testing. Find a bug in deployment, write integration tests that replicate it, write unit tests that isolate the failing component, add them to your test system, and make them a routine part of your testing schedule to prevent regressions.
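
For example, the unit-test half of that loop might look like this in Go (a hedged sketch; ParsePrice and the bug it names are hypothetical):

    package pricing

    import "testing"

    // Regression test added after a deployment failure: ParsePrice used to
    // panic on empty input. Keeping it in the routine test suite prevents the
    // bug from silently coming back.
    func TestParsePriceEmptyInputRegression(t *testing.T) {
        if _, err := ParsePrice(""); err == nil {
            t.Error(`ParsePrice("") should return an error, not succeed`)
        }
    }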


Checklists have been essential and efficient tools for both learning and sharing how to write good software. I find that the inconvenience is outweighed by the benefits, especially in the long run.


Do you have any pointers to where one might find some of these checklists? I have been building my own but they are still very sparse.



There's also MISRA C++, of which the AUTOSAR guidelines are an extension.


> [...] imagine having them in an automated check list. Buffer overruns, off-by-one errors, uninitialised variables. Already compilers warn you if there's a chance you used "=" when you meant "==". (If only JavaScript had a way of knowing if you meant "===" instead of "==").

Wait, a compiler that can tell the difference between an assignment and a comparison? Witchcraft! What will those crazy computer scientists invent next? Type checking? Range checks? Testing pre- and post-conditions?

Fools! Computers will never be able to do that! Academic bullshit!

/s (just in case)


To the downvoter who didn't get the joke: Didn't you notice that the author's examples are exactly those mistakes that a compiler could find?

Checklists and unit tests and TDD are useful. But how about first using the right tools? I don't think they tried to repair airplanes with jackhammers in WWII, so we shouldn't try to program complex systems with JavaScript and other low-level languages.



