Hacker News – inventitech's comments

I am not sure which implementation Excel uses under the hood. Excel's Flash Fill feature has been around for quite some time; PROSE (which strans uses) has not, and Excel is not listed in its impact section. This hints that Excel uses a different engine. Anyhow, the idea behind both definitely comes from the same team, and from Sumit Gulwani in particular.


Oh wow. I only recently learned about Excel's Flash Fill and wasn't even aware that the same functionality is available as a library.

PowerShell's Convert-String is very rough around the edges, though. Seems more like a proof of concept that ended up slipping into a release.


Yup, perhaps that's why it's missing in the latest PowerShell release ... https://docs.microsoft.com/en-us/powershell/module/Microsoft...


To clarify: Excel's Flash Fill uses an earlier version of this engine, with a similar but less advanced algorithm. The current engine that powers strans (PROSE) can, for instance, solve the infamous "January" task without errors.
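As a toy illustration of the programming-by-example idea (and only that; this is not PROSE's actual algorithm, which searches a vastly richer program space), a learner can enumerate a small space of string transformations and pick one consistent with every example:

```python
# Toy programming-by-example learner: searches a tiny space of string
# transformations (prefix of length k, uppercase, lowercase) for one that
# is consistent with all given input/output examples. Purely illustrative.

def learn(examples):
    candidates = (
        [("prefix", k, lambda s, k=k: s[:k]) for k in range(1, 10)]
        + [("upper", None, str.upper), ("lower", None, str.lower)]
    )
    for name, param, fn in candidates:
        # Keep the first candidate that reproduces every example.
        if all(fn(inp) == out for inp, out in examples):
            return name, param, fn
    return None

name, k, fn = learn([("January", "Jan"), ("February", "Feb")])
print(name, k)            # prefix 3
print(fn("September"))    # Sep
```

The learned program then generalizes to unseen inputs, which is the whole point of Flash Fill-style synthesis.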


Thanks for the explanation. And hi, Alex (cool to meet again here on HN)!


I like that idea!


Oh, I now see the ambiguity you refer to. What about this:

Strans – an alternative to sed that automatically learns from examples instead of requiring you to program it yourself

Unfortunately, I can't change the title anymore.


I think you can mail hn@ycombinator.com for a title change. You might also ask for their input.

Strans – sed alternative that automatically learns a string-transformation from an example

Strans - auto text manipulation from sample input-output

sed without the obscure part


We've changed it above. Happy to change again if you prefer (it just needs to be 80 chars or less).


That’s much better.


The first step is simply to observe how much time is spent on testing. We have basically no clue about this, and our first study with students showed that they had no clue about it either.

In a second step, we can think about which implications testing time might have for quality. As you correctly said, more time does not necessarily correlate with better quality or higher productivity. Maybe there is a certain range of testing effort that can be associated with good-quality tests? E.g., if you spend less than x on testing, your tests are likely to be bad. If you spend more than y, it might be worth investigating whether you have unusually high testing targets, or whether your tests are extremely hard to maintain.
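The threshold idea above can be sketched in a few lines. The bounds LOW and HIGH stand in for the hypothetical x and y mentioned in the comment; they are illustrative placeholders, not values backed by any study:

```python
# Illustrative sketch: flag testing effort that falls outside a plausible
# band. LOW and HIGH are hypothetical placeholders, not empirical findings.

LOW, HIGH = 0.10, 0.60  # fraction of total development time spent on testing

def assess_test_effort(testing_hours, total_hours):
    ratio = testing_hours / total_hours
    if ratio < LOW:
        return "below x: tests are likely inadequate"
    if ratio > HIGH:
        return "above y: check targets or test maintainability"
    return "within plausible range"

print(assess_test_effort(2, 40))   # below x: tests are likely inadequate
print(assess_test_effort(15, 40))  # within plausible range
```

Finding whether such thresholds exist at all, and where they lie, is exactly the kind of question the research described here would have to answer empirically.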

I think the answer is Janus-faced and there is no single, simple answer (see also "Testivus on Test Coverage", http://www.artima.com/weblogs/viewpost.jsp?thread=204677).


I generally agree with you.

"sometimes easy testing is not actually useful and useful testing is cost-prohibitive."

While I agree here, too, I think the situation where testing is not useful happens very rarely in practice. Everything can break, and even when you think something cannot possibly be wrong now, a future regression might occur.

I think badly written tests are a different problem: for example, tests that assert on the serialized string and hence require a lot of maintenance effort to adapt to changing production code. But then there is nothing wrong with the test per se; it's just badly implemented.
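A minimal sketch of the brittleness described above (hypothetical example): the first test pins the exact serialized string, so any change in key order or whitespace breaks it, while the second asserts only on the structure we actually care about:

```python
import json

def make_user():
    # Hypothetical production code under test.
    return {"name": "Ada", "role": "admin"}

# Brittle: breaks if serialization details (key order, spacing) change,
# even though the data itself is still correct.
def test_user_brittle():
    assert json.dumps(make_user()) == '{"name": "Ada", "role": "admin"}'

# Robust: asserts only the properties the behavior depends on.
def test_user_robust():
    user = make_user()
    assert user["name"] == "Ada"
    assert user["role"] == "admin"

test_user_brittle()
test_user_robust()
```

Both tests pass today, but only the second survives a harmless refactoring of the serialization layer.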


I can think of lots of cases where testing is not useful in practice. Simulating random filesystem corruption when writing a Node.js app is pretty much useless: you don't know which files will be corrupted, and testing would take a huge (billions of lifetimes) amount of time to simulate all possible combinations even for a small program. Yet, this can (and in my experience did) take down a production system.

The other examples I listed earlier are, I think, instances of this too: you can test all you want against a spec provided by your external API's maintainer, but if they don't follow the spec, you have a problem. And we don't have a good framework (nor do I think we can create one) for testing layout problems in HTML that are caused by bad JS or CSS.

Basically, you do hit diminishing returns quickly, and you do get a false sense of security by having lots of small unit tests that don't actually prevent realistic failures.


Yes, this is certainly true, though only to a (I think) minimal extent. Why? Because increasing the time you spend on testing is inherently hard to do -- and the metric is only meta-information, anyway.

An example: a quality assessment of your code tells you that your methods are too long. This gives you a concrete task: simply split all overly long methods. With testing time, this is not so: increasing the effort spent on testing is not something you can "just do" without a sensible plan. If you want to increase this metric, you have to think about what is wrong with your current test strategy (maybe nothing is wrong at all) and come up with an action plan for how to do so.

If you're that far in the game already, I think this metric has done what it can do. Plus, you'll likely see an increase in the metric, which is justified.

Determining what (critical) testing efforts look like in the wild is now our task.


> But I'm not quite sure if there is a direct correlation between time spent writing tests and overall software quality?

Good point. We do not know, either. That is part of the reason why we do this research. :)

As has been said before, the easier your tests are to understand and maintain, the less time you actually need to spend on them.

However, you could argue that there must be some minimal amount of time you absolutely need to spend on QA work like testing to ensure a certain level of quality.

We are investigating this.


