New GitHub Tool Lets Coders Build Software Like Bridges

dhatch387 · on Feb 5, 2016

Surprised Wired is writing about this. Scientist is a great release from GH. I think the technique is powerful, but not particularly novel.

tartuffe78 · on Feb 5, 2016

Seems like native advertising

captn3m0 · on Feb 5, 2016

This is what I thought as well. I'd seen it yesterday, but couldn't correlate that this article would be talking about the same tool after seeing the headline. "Lets Coders Build Software Like Bridges" doesn't make it sound like a refactoring toolkit at all.

Are bridges refactored 5 years after being built?

gravypod · on Feb 5, 2016

Now that you mention it, it does seem rather odd.

brudgers · on Feb 5, 2016

Repository: https://github.com/github/scientist

jdbernard · on Feb 5, 2016

Scientist is cool. It's a little annoying that the article makes it seem like no one has thought of this before Github. But whatever, maybe more people will learn about the pattern and use it.

The other implicit conclusion that the article is making, that this will somehow make software more like a traditional engineering discipline, also makes me a little uncomfortable. There still is no silver bullet. http://c2.com/cgi/wiki?NoSilverBullet

anentropic · on Feb 5, 2016

is there a generic name for this type of thing? not sure how to google for existing implementations

jdbernard · on Feb 5, 2016

The article and the Github repo use the name Branch By Abstraction [1]. It is related to the idea of a Strangler Application [2].

[1] http://martinfowler.com/bliki/BranchByAbstraction.html

[2] http://www.martinfowler.com/bliki/StranglerApplication.html

DonHopkins · on Feb 5, 2016

Oh that's a great name for a pattern. I can't wait for Craigslist to release their software migration tool, the Craigslist Strangler.

"There's another important idea here - when designing a new application you should design it in such a way as to make it easier for it to be strangled in the future. Let's face it, all we are doing is writing tomorrow's legacy software today. By making it easy to be strangled in the future, you are enabling the graceful fading away of today's work." -Martin Fowler

Lx1oG-AWb6h_ZG0 · on Feb 5, 2016

Isn't this just a tool to enable double blind experiments on old vs new software?

robgibbons · on Feb 5, 2016

Forgive my ignorance, but how does this approach differ from traditional unit testing or end to end testing?

My point being that unit testing essentially gives you the same confidence in your refactoring efforts that Scientist proposes to offer. Tests already demand that your interface remains the same, and that old code does no harm when replacing legacy code.

rhinoceraptor · on Feb 5, 2016

This [1] is a good writeup on how they used it to replace the merge that github uses. You can't really capture the totality of corner cases in unit tests, especially for complicated things.

[1]: http://githubengineering.com/move-fast/
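To make the corner-case point concrete, here is a tiny Python sketch (the `mean` function is made up for illustration and has nothing to do with GitHub's merge code). A single test executes every line of the function, so line coverage is 100%, yet an input the test never tried still fails:

```python
def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def test_mean():
    # This one assertion executes every line of mean(): 100% line coverage.
    assert mean([2, 4]) == 3


test_mean()

# An input the test suite never exercised still breaks the function.
try:
    mean([])
except ZeroDivisionError:
    print("mean([]) fails even though every line is covered by tests")
```

Real traffic tends to supply exactly this kind of input that nobody thought to write a test for.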

jowiar · on Feb 5, 2016

"100% test coverage" doesn't imply "won't cause real-world things to break". That your tests run all the code doesn't mean that your tests cover all the actual cases. This identifies bugs and performance regressions on a wide assortment of real-world cases.

endergen · on Feb 5, 2016

Agreed that writing test coverage and running twin systems against it would get you this. But when you have little or no test coverage, a drop-in way of just testing that behavior matches is probably a more economical approach in many projects. Also, I doubt any test suite exists with the variance of inputs that live user testing generates, including error cases.

anentropic · on Feb 5, 2016

I was wondering the same, but I guess the advantage is being able to safely-ish run tests in production against production data and other live services

the sort of stuff you usually mock out in unit tests. and may not be feasible to have a whole copy of your prod 'big data' and services just for testing on

with Scientist you get a report not just of mismatched results and exceptions (stuff good tests should usually be able to catch) but also if the new code is slower
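As a rough illustration of that kind of report, here is a minimal Python sketch of the general pattern. It is not Scientist's actual Ruby API, and the names `Observation`, `observe`, `run_experiment` and `publish` are invented for this example. It runs the old and new code paths, always hands the old result back to the caller, and reports mismatches, candidate exceptions and timings:

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Observation:
    """Result of running one code path: value or exception, plus duration."""
    value: Any = None
    error: Optional[Exception] = None
    seconds: float = 0.0


def observe(fn: Callable[[], Any]) -> Observation:
    """Run fn, capturing its return value or exception and how long it took."""
    start = time.perf_counter()
    try:
        value = fn()
        return Observation(value=value, seconds=time.perf_counter() - start)
    except Exception as exc:
        return Observation(error=exc, seconds=time.perf_counter() - start)


def run_experiment(control, candidate, publish):
    """Run both paths; callers only ever see the control's behaviour."""
    old = observe(control)
    new = observe(candidate)
    publish({
        # Simplification: a match means both succeeded with equal values.
        "matched": old.error is None and new.error is None
                   and old.value == new.value,
        "candidate_error": repr(new.error) if new.error is not None else None,
        "control_seconds": old.seconds,
        "candidate_seconds": new.seconds,
    })
    if old.error is not None:
        raise old.error  # the old code's failure is still the real outcome
    return old.value
```

Calling something like `run_experiment(lambda: 2 + 2, lambda: 5, reports.append)` returns the control's `4` and appends a report with `matched` set to `False`. A candidate that raises is recorded in `candidate_error` and never reaches the caller.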

lunula · on Feb 5, 2016

The name "Scientist" is about as good as "Engineer" or "Programmer." In other words, extremely confusing when used as a product name in a technical domain!

hnbrox · on Feb 5, 2016

the name is a bit annoying. seems very dude-cool-name-syndrome.

wanda · on Feb 5, 2016

They really hammer it home with the emoji commit messages.

There's also exhibit C:

From Github repository "Scientist" : README.md

    "How do I science?"

brightball · on Feb 5, 2016

I got really excited about this thinking that it was something that worked outside of the code base as an interface proxy that could work cross-language.

ivansavz · on Feb 6, 2016

Yes, I was imagining a "traffic demultiplexer" of some sort that could feed the old and new backend in parallel and observe they have the same behaviour.

I think something like that would be useful for a lot of people in industry who have to maintain legacy monoliths. The ideal tool would allow "switchover" from old to new gradually, one URL endpoint at a time.

This can be used to replay traffic https://github.com/buger/gor/ so can do load testing at least...
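The "one URL endpoint at a time" switchover could look roughly like the Python sketch below. It covers only the routing decision, under the assumption that endpoints are identified by path prefix. The names `choose_backend`, `LEGACY` and `NEW` are hypothetical, and a real proxy would still have to forward the request to whichever backend is chosen:

```python
from typing import Iterable

LEGACY = "legacy"
NEW = "new"


def choose_backend(path: str, migrated_prefixes: Iterable[str]) -> str:
    """Send a request to the new backend only if its endpoint has been migrated."""
    for prefix in migrated_prefixes:
        base = prefix.rstrip("/")
        # Match the endpoint itself or anything below it, but not a path
        # that merely shares leading characters (e.g. /userscripts vs /users).
        if path == base or path.startswith(base + "/"):
            return NEW
    return LEGACY


migrated = {"/users"}
print(choose_backend("/users/42", migrated))  # new
print(choose_backend("/orders/7", migrated))  # legacy
```

Migrating another endpoint then means adding one prefix to the set, and rolling back means removing it.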

markbnj · on Feb 5, 2016

What is the difference between this and writing your services to an API that runs behind something like haproxy that can be quickly configured to direct requests to whatever stack you like?

jbrooksuk · on Feb 5, 2016

Scientist seems to run both versions of the code.

markbnj · on Feb 5, 2016

Both versions of an API can run behind haproxy as well, and requests can be switched between them based on many different criteria from OSI layer 4 on up. I guess I just don't see what was novel or interesting about the basic idea of switching between two implementations that satisfy a declared interface. It was just a few years back that DI was all the rage.

GreaterFool · on Feb 5, 2016

tl;dr small (and perhaps cool?) Ruby library?

Headline got me excited but then I ended up at the GitHub page.

raimue · on Feb 5, 2016

The analogy does not seem to fit. Of course they built the new bridge before tearing down the old one. There is not really an option to divert traffic, it has to cross the bay.

The Scientist software copies the input and feeds it into two systems in parallel. There is no analogy for that in real life. There is no way to copy cars.

And I really hope they do not actually need to run load tests for bridges after they have been built.

libria · on Feb 5, 2016

Lot of comments here attacking the analogy. It's just a throwaway analogy. Can we just overlook non-essential parts of the article and discuss the code and techniques instead?

advisory5739f2 · on Feb 5, 2016

I feel like I am being misled by a sensationalist article. The tool has already been discussed on HN.

Software engineering is extremely unlike bridges in that bridges get constructed once. Software gets constructed (from source to machine code) every time you build and run it.

pc86 · on Feb 5, 2016

Of course not. Why discuss something on its merits when we can attack an analogy by a reporter?

stepanhruda · on Feb 5, 2016

> On June 14, 1874, John Robinson led a "test elephant" on a stroll across the new Eads Bridge to prove it was safe.

https://en.wikipedia.org/wiki/Eads_Bridge

brandonmenc · on Feb 5, 2016

> She envisions aging banks using it to upgrade decades-old Fortran code to Ruby or any other modern language.

I know it's just an article, but...

If anyone thinks they should replace working Fortran code that does numerical analysis with Ruby, they should be fired.

gravypod · on Feb 5, 2016

You have two conflicting issues on this point.

The first being that the code works fine, the deployment works fine, everything is working just fine. Do. Not. Touch. It.

These are some of the most financially important systems in America to date. It's not a question about if they have to work, it's a question about just how bad things would get if they didn't work for even a day.

The next problem that comes up is what happens when someone finds a bug 60, 40, or even 10 years in the future? What happens when some of this old hardware breaks and it becomes not only impossible to find replacement parts but also impossible to fabricate new ones in house?

Eventually, like all languages, Fortran will die out as time goes on. So few people know it now that it is inevitable.

Eventually something has to give and we can do two things until then.

1. Wait for time to pass and pretend nothing will ever happen to these machines.

2. Slowly start rewriting all of the stack, and transferring everything to more modern systems designed to be just as long term as their predecessors so that in the event of a replacement being needed it exists.

I don't know about anyone else, but I like to have backups/fallbacks when planning for the future.

I'm not saying that Ruby or any new language should be entrusted with this kind of responsibility. But there needs to be a decision and it needs to be made over the next couple of decades.

plq · on Feb 5, 2016

> What happens when some of this old hardware breaks and it becomes not only impossible to find replacement parts but also impossible to fabricate new ones in house.

You get a modern Fortran compiler, you compile the old code for the new stuff and off you go. What am I missing?

> Eventually, like all languages, Fortran will die out as time goes on. So few people know it now that it is inevitable.

Like ALL languages? Ha, tell that to the COBOL programmers at the heart of some financial institutions mate. Heck, RPG is still supported on some IBM hardware.

gravypod · on Feb 5, 2016

I'm talking about languages on a larger scale. Do you mean to tell me that COBOL or Fortran will last _forever_?

All languages die out eventually. It does not matter that it is a programming language, the concept will remain.

Nothing is perpetual, nothing lasts forever. I hate to break it to you, but eventually something will give.

ska · on Feb 5, 2016

   "Slowly start rewriting all of the stack, and transferring everything to more modern systems designed to be just as long term as their predecessors so that in the event of a replacement being needed it exists."

This process is obvious, and has been going on for decades already.

There are challenges of course, but thinking the problem is coding infrastructure and the lack of "modern" languages like ruby is a category error.

gravypod · on Feb 5, 2016

I agree, that is not the way I feel.

You must remember that relatively speaking compared to Fortran, C is a "modern" language.

brandonmenc · on Feb 5, 2016

> You must remember that relatively speaking compared to Fortran, C is a "modern" language.

In what way?

Fortran code is certainly not lower level than C and in many cases higher level. Furthermore, they are closer in age, for example, than Smalltalk is to Ruby.

gravypod · on Feb 6, 2016

Fortran: Created 1957 C: Created 1972

And those were not simply 15 years, they were a very important 15 years of development.

Relatively, C is more modern than Fortran. It is also certainly more widespread.

brandonmenc · on Feb 5, 2016

How about a third option: keep training new programmers on Fortran and improve tooling - just like we do with every other language (and, afaik, this is happening with Fortran.)

Most Fortran programmers I know aren't programmers by trade - they're scientists or analysts who need to crunch numbers. Fortran doesn't need to be as feature complete or cutting edge as more general purpose languages.

tholford · on Feb 5, 2016

I used to work at a TBTF financial institution. During orientation, they told us that their mainframes process roughly 1/3 of US credit card transactions. The low-level code is 40-year old Fortran that still performs swimmingly.

cyansmoker · on Feb 5, 2016

People think of Fortran and Cobol as similar, due to their age. Put simply: they are not.

Dirlewanger · on Feb 5, 2016

Yeah, this scared me.

I'm in webdev, and while it does have its worrying trends with regards to the larger trade of programming, they are largely insular and contained. Only way to explain this kind of behavior is to have worked your entire career in webdev (with horse blinders on) and have next-to-no understanding of computer science. That, or she doesn't give a shit because Github is paying her a fuck-ton not to.

I'm going to guess in this case, it's the latter.

_5vzs · on Feb 5, 2016

It's a shame GitHub puts effort into stuff like this as opposed to actually listening to their users (e.g. "Dear GitHub"). We can infer a shift from GitHub to alternate services soon from this, unless they change.

skewart · on Feb 5, 2016

To be fair, developing something like this is probably orthogonal to developing the kinds of features they have been neglecting to develop recently. Scientist was probably entirely within engineering, and not a massive effort. Making the things users are asking for would be a much bigger effort across the organization.

But, I do completely agree that they've been dropping the ball lately.

jowiar · on Feb 5, 2016

This is Github open-sourcing an internal tool, which is something they have a long history of doing (Jekyll, Resque, Hubot).

Most dev shops have a whole bunch of things like this -- assorted libraries and internal tools that are 80-90% of the way to being somewhat useful to the public. It's fantastic that Github provides the time/space/expectation for employees to finish them.

riyadparvez · on Feb 5, 2016

I don't see how this is relevant. Do you want to see Github implement your favorite feature, but at the same time have the bugs introduced in the new version break existing functionality? That's what this library is for. It helps ensure new bugs do not take down existing functionality.

Multi-version execution is an active area of research [1].

[1]: http://srg.doc.ic.ac.uk/files/papers/mx-icse-13.pdf

gedrap · on Feb 5, 2016

I remember seeing this project more than a year ago (I believe it was called dat_science back then).

>> We can infer a shift from GitHub to alternate services soon from this, unless they change.

Nope. People are lazy. Will a lot of people write comments and all, how they are going to totally move away from github? Oh yeah, no doubt. But will they actually do it? Nah, just a few, unless github becomes as bad as source forge but I don't think it's going to happen, even then it would take some time.

Yes, yes, I am aware that migrating is just a matter of adding a new origin. That's the easy bit. But getting used to a new interface and everything, that takes much more effort and time. Writing a comment is easier (especially when tons of people are doing that around you) than changing your habits.

Even ignoring everything above, I don't really agree with the "puts effort into stuff like this as opposed to actually listening to their users" attitude. GitHub is not a one-man project. The whole company shouldn't drop everything just to do something right now because it was trending on HN a week or two ago. A company like this (making substantial product changes rather urgently based on some hype) maybe sounds cool but, in reality, you would probably not be excited to use their products. Urgent changes in features and quality often do not go together.

mannykannot · on Feb 5, 2016

This is a really unfortunate choice of an example - the replacement bridge has suffered enormous cost overruns.

http://www.citylab.com/politics/2015/10/from-250-million-to-...

thesz · on Feb 5, 2016

Nobody mentioned Erlang so far.

My biggest disappointment with "Scientist" is that it basically provides service which Erlang code has for free - hot code reloading. For decades now. Along with other cool features like great support for parallel execution, hierarchical monitoring and so on.

Article mentions some early user raves about this tool in context of "it allows me do refactoring!"

Of course it would help, but there are languages that either have had all this cool stuff for decades (Erlang), or can allow you to do heavy refactoring without asking for help from "interesting" tools like this "Scientist" (Haskell).

Qwertious · on Feb 5, 2016

"The greatest form of flattery is imitation."

Huh, does that apply less if they've basically never heard of you when they imitate you? Not to imply that Erlang is unheard of, I'm just saying it's not as popular and orthodox as Python/Ruby/C/java/etc.

thesz · on Feb 7, 2016

There is a variant of hot code reloading (dynamic binary program patching) for C.

http://www.cs.umd.edu/~hollings/papers/apijournal.pdf

And there is an even older one (which I cannot find because search in my blog (LJ) is broken).