As someone who has used and contributed to Julia, I find Yuri Vishnevsky's arguments about correctness totally fatal [0]. And Yuri's examples are not obscure; they are totally realistic scenarios where Julia silently returns wrong results in ways that may be hard to detect. I do mostly scientific computing, and the idea that I might have to retract a paper because of a bug in the programming language is intolerable. It doesn't matter how beautiful and fun to write and develop the language is if it can't be trusted to return correct results.
Software has bugs. That's the way it is. You may think that Julia (though this is mostly about the ecosystem of packages around Julia) has too many bugs. Then you can use something else, like Python. If you move from Julia to Python, you may want to use NumPy? Pretty cool project. It currently has 1.9k issues on GitHub, and if you filter by the bug label, 599 of them. How many of those are issues like the ones in the post? I don't know. The same applies to SciPy. For example, the Gaussian hypergeometric function returns wrong results for some input values: https://github.com/scipy/scipy/issues/3479. That issue was filed in 2014. You can find similarly old issues in Julia packages. That's how these things go. Luckily, many of the issues listed in the blog post have been fixed.
If you expect any language-and-library combination with a reasonably large set of already-implemented features to meet the "this has to be completely correct or I won't use it for my research" requirement, you will have a hard time.
The last part of the post seems to be about OffsetArrays.jl. Many people who have implemented libraries and who care about composability and generic input also agree that the Base AbstractArray interface is not perfect or complete, and sometimes the issue is that the interface that does exist is not followed closely enough for composability to work. A more complete, agreed-upon, and generally adhered-to interface for an "AbstractArray" would be nice, and has been and is being worked on and discussed by many people in the community.
SciPy (I believe NumPy's parent project) just straight up has wrong formulas in some of its methods because they are legacy and changing them would change people's code.
Hamming distance, if I recall correctly, was one that wasn't correct, along with a few others in its parent module.
I still use the package, of course, because it's great. But given the disconnect I saw, I'm still careful when documenting to note which method is actually used (with a comment to clarify). Most of the time it isn't a huge deal.
> is if it can't be trusted to return correct results
If you need correct results, the trick is to not trust anything. Any results that come out need to be validated. This can be done by running the equations backwards to see if the initial conditions and/or boundary conditions hold. For example, when I wrote matrix inversion routines long ago, I'd check that multiplying the original by its inverse produced the identity matrix.
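A minimal sketch of that kind of sanity check in Julia (the matrix size and tolerance here are arbitrary choices for illustration):

```julia
using LinearAlgebra

A = rand(4, 4)
Ainv = inv(A)

# A * inv(A) should be numerically close to the identity matrix;
# the tolerance allows for floating-point rounding.
residual = norm(A * Ainv - I)
@assert residual < 1e-10 "inverse check failed (residual = $residual)"
```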
This undated blog post (published around May this year) is not the last word on Julia. Fixes have been made for the issues it brought up.
Julia's authors get credit for trying to rewrite many core compute/LA libraries from scratch in a composable manner rather than using ancient Fortran code nobody dares touch. That said, the article was a good wake-up call to the community that tests and correctness are vital for trust.
Edit: First paragraph removed as the person I referred to disagrees with me and asked me to make an edit.
> Julia's authors get credit for trying to rewrite many core compute/LA libraries from scratch in a composable manner rather than using ancient fortran code nobody dares touch.
You can go even further than that. OpenBLAS, the BLAS linear algebra library that things like Julia and NumPy use by default now, was an MIT Julia Lab project. The developer was a research software engineer in CSAIL under Alan Edelman! And then the openlibm open source math library? Also made by the Julia developers. Julia developers like Steven Johnson and Doug Bates are also the ones behind some of the ancient classic codes like FFTW and nlopt. So in many cases these aren't just rewrites, but rewrites by authors of some (but not all, of course) of the ancient codes!
Yes, specifically to shift focus from academic work to working "full time" on OpenBLAS for a while and clean it up as it started getting a lot more traction. Shortly after completing that, he passed on the maintainership of OpenBLAS as its growth continued, and that's how things sat until a new maintainer came along late in 2017 to pick it up from there.
This ends up being why a lot of people in the Julia Lab learned the internals and started creating pure-Julia BLAS implementations, which continues to this day with things like LoopVectorization.jl and Octavian.jl (though Chris Elrod was not around during that time, so he comes with his own influences).
It was cobbled together from msun and other sources to make a complete, portable, open source, liberally licensed libm, which—shockingly enough—did not exist in 2010 when we put it all together.
> This undated blog post (published in about May this year) is not the last word on Julia. There have been fixes made to issues brought up.
And yet this is an anticipated reaction:
> Whenever a post critiquing Julia makes the rounds, people from the community are often quick to respond that, while there have historically been some legitimate issues, things have improved substantially and most of the issues are now fixed.
> For example:
> [...]
> These responses often look reasonable in their narrow contexts, but the net effect is that people’s legitimate experiences feel diminished or downplayed, and the deeper issues go unacknowledged and unaddressed.
> My experience with the language and community over the past ten years strongly suggests that, at least in terms of basic correctness, Julia is not currently reliable or on the path to becoming reliable. For the majority of use cases the Julia team wants to service, the risks are simply not worth the rewards.
If I recall, the same argument was used to justify Octave's Fortran-coded libraries for stats. Primarily, it was an argument about historical consistency in the assumptions people had agreed upon at NASA.
Julia's library compatibility is still an issue, but that's hardly related to the language itself. Unlike many languages, Julia has regression testing for its graphical plotting output: it can, and usually does, check that a given function's output matches previously known output across each release. So unlike ecosystems that fix or ignore various external library versions, Julia actually checks that the same data gives the exact same results every time.
The caveat is that people must report issues to the project, or add regression tests to ensure correctness persists throughout the ecosystem. =)
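As a rough sketch of what such a regression test looks like (the function and reference value below are made up for illustration; the real plotting tests compare against stored reference outputs):

```julia
using Test

# Hypothetical function under test; the expected value was computed once,
# reviewed by hand, and then frozen so future releases must reproduce it.
trimmed_mean(x) = sum(sort(x)[2:end-1]) / (length(x) - 2)

@testset "regression: trimmed mean" begin
    data = [1.0, 2.0, 3.0, 4.0, 100.0]
    @test trimmed_mean(data) ≈ 3.0
end
```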
I just read the blog post. The first issue was acknowledged the same day and fixed in a commit two days later. All of the issues were noticed quickly and corrected in the following months.
Now, there are of course much more mature options if that makes you more comfortable, which is totally respectable, but saying there are bugs in the programming language borders on spreading FUD.
> I do mostly scientific computing, and the idea that I might have to retract a paper because of a bug in the programming language is intolerable
How do you feel about bugs in the number system? Because let me tell you, IEEE 754 doesn't give correct results.
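The classic example, straight from any IEEE 754 double-precision system, Julia included:

```julia
julia> 0.1 + 0.2 == 0.3
false

julia> 0.1 + 0.2
0.30000000000000004
```

None of 0.1, 0.2, or 0.3 is exactly representable in binary floating point, so even this trivial sum is "wrong" in the last bit. That's not a bug, it's the number system working as specified.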
Hate to say this, but checking for correctness is your job. No programming system will be 100% correct. Yes, it sucks when a language breaks your mental model of how it works (and I'm not speaking for Julia; I don't use it anymore, and it's possible the correctness bugs are egregious), but you should be writing extensive tests if you're truly worried about having to retract a paper. If you use, say, Python, you'd better triple-check that no dict is getting mutated under the hood when you pass it into a function, etc.
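The same pitfall exists in Julia, since containers are passed by reference. A minimal sketch of a test that guards against accidental mutation of an input (the function and data are made up for illustration):

```julia
using Test

# Hypothetical function; by Julia convention, a name without a trailing `!`
# promises not to mutate its arguments.
function normalize_scores(scores::Dict{String,Float64})
    total = sum(values(scores))
    return Dict(k => v / total for (k, v) in scores)
end

@testset "input is not mutated" begin
    scores = Dict("a" => 1.0, "b" => 3.0)
    snapshot = copy(scores)
    normalize_scores(scores)
    @test scores == snapshot  # fails if the function mutated its input
end
```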
> The Julia correctness issues are things like basic math functions, called in simple normal ways, returning wrong numbers.
If you have an example of this, I'd be interested in tracking it down. I didn't see this in Yuri's blog post, nor am I aware of any egregious examples of this right now.
> Hate to say this but checking for correctness is your job.
Yes. But also No.
The "No" case being if you run your data through a well established program or library already known in academia to have been scrutinized to give accurate results to the 20th decimal place of precision.
The distinction is pointless when you care about writing code that gives the right answer. What are you going to do - write Julia code without using the standard library?
Not particularly. Unless the bug is in some complicated compiler inference code, the Julia stdlib is just like any other Julia package. If you can fix it in a library, you can fix it in Base Julia.
Let's at least be clear about what happened in the examples from the post. Julia has correctness checking, and it's turned on by default. The only way to turn this kind of thing off in normal usage is locally, i.e. putting `@inbounds` and such on a block of code. That blog post is about cases where, in some libraries (not even standard libraries really, mostly statistics libraries), people put `@inbounds` in code to explicitly turn off the safeguards but did not do the due diligence to ensure that the code being marked unsafe was actually correct.
Note a few things here. First, this is some user deciding to turn off the safeguards; it's not so much Julia being unsafe as Julia being flexible and some users abusing that without checking their work. Second, you can run Julia in a way where this feature is disabled: just run `julia --check-bounds=yes` and all `@inbounds` markings are ignored. So in theory this entire blog post could have just been "please run Julia via `julia --check-bounds=yes` and you get better error messages". And this isn't obscure: literally every time you run `]test` or use package CI, things are put into this mode, so it's used daily by almost all developers. Third, you say you do scientific computing, but this post was about statistics libraries.
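To make the failure mode concrete, here's a minimal sketch (the function and data are made up): under normal execution an out-of-range index throws a BoundsError, but inside an `@inbounds` block the check is skipped and the result can silently be garbage, unless you run with `julia --check-bounds=yes`, which ignores the annotation and restores the error.

```julia
function sum_first_n(x::Vector{Float64}, n::Int)
    s = 0.0
    # `@inbounds` asks the compiler to skip bounds checks in this block.
    # If a caller passes n > length(x), the loop reads past the end of the
    # array and may silently return garbage instead of throwing.
    @inbounds for i in 1:n
        s += x[i]
    end
    return s
end

sum_first_n(rand(10), 20)  # out of bounds, but no error under @inbounds
# With `julia --check-bounds=yes`, the same call throws a BoundsError.
```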
Look, it's not perfect. What I'm hoping for is that Julia's compiler soon improves its tracking of effects like array size and uses that to emit code without bounds checking when it can prove the code stays in bounds. If that happens, `@inbounds` wouldn't be necessary, and this would go away completely. But for now, we have a system where, when you need to, you can opt out, and users can explicitly disable all of those opt-outs with a single command line argument. I think that's at least a better solution than, say, MATLAB, where assigning out of bounds silently resizes the array, or Python and R, where indexing out of bounds can wrap around or hand you a value without erroring. At least Julia chooses safety by default, and we need to bonk a few people on the head to stop turning off the defaults before checking correctness (in the name of "more performance", which could soon be handled by the compiler as an optimization, further reducing the need for people to do it by hand).
And another change that could be made to Julia here is that all of the features that can be unsafe, like `@inbounds` and `@fastmath`, could be moved to an Unsafe.jl standard library, so that packages have to explicitly import Unsafe.jl and you can check a Project.toml to see whether a package uses it. That, paired with command line overrides to disable any of these features, would give very clear warnings and workarounds for users. What you'll find is that a very small set of packages uses them. And unlike some other languages, Julia makes these unsafe triggers local, so hopefully we can further localize and cage any usage of this to a very small surface area. To me, that seems like something that is at least solvable, in comparison to other language designs where such unsafe behavior is global to the whole language (I'm looking at you, Python: https://moyix.blogspot.com/2022/09/someones-been-messing-wit... bites so hard every time I try something...).
This reminded me of how excited I was by Godot game engine, then after only three months or so I lost all faith in the developers and moved on.
If you are building a tool you expect other people to use, "correctness" needs to be priority number one. After that you can think about speed and aesthetics.
Practically speaking, what's the difference between your program giving incorrect results because you wrote a bug in your code, and it doing so because of a bug in a third-party library, or in the language itself?
I mean, the end result is the same. In all cases it's debuggable - the language itself is also just a large piece of software. And if these things are practically the same, and 90% of your bugs come from your own code, and 90% of the remaining ones from third-party libraries, how much do the last 1% of bugs really matter?
I used to work at an HPC research institute. While I agree with your sentiment, most statistical software has bugs that affect results.
The pandas bug tracker stressed me out on a daily basis. I've also found bugs that affected results due to missing instructions between processor architectures, erasure coding in wire protocols for storage systems, and even a kernel bug.
But by far the most common was by the author of the model themselves.
Indeed. Though for Julia, the issue is that there is a way to locally disable correctness checking (`@inbounds`) that some developers have gone too far with. But there is also a global way of turning that off (`julia --check-bounds=yes`). This is at least a better position than, say, the Python ecosystem, where there are global correctness overrides that can be flipped in a way that affects every single library (https://moyix.blogspot.com/2022/09/someones-been-messing-wit...). Julia needs to improve this further, and I detailed above two features that would go a long way, but singling out Julia really misses the context of where these features come from and how they compare to other languages.
[0] https://yuri.is/not-julia/