While you can make the compiler run longer to squeeze the binary size down, the compiler has a baseline set of passes that it runs over the IR of the program being compiled. These passes generally take time proportional to the length of the input IR, so a larger program takes longer to compile. Most passes don't throw away huge numbers of instructions (dead code elimination is a notable exception, but even there the analysis that figures out which code is dead still operates on the full input IR). So it's not a perfect proxy, but in general, if the output of your compiler is 2MB of code, it probably took longer to process all the input and spit out that 2MB than if the output was 200KB.
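As a toy illustration (hypothetical IR format, nothing like a real compiler's), here is why pass time tracks input size rather than output size: even a dead-code-elimination pass that ends up deleting almost nothing still has to walk every instruction in the IR.

```python
# Toy IR: a list of (dest, op, args) tuples. Names are made up for
# illustration only -- real compiler IRs are far richer than this.
def dead_code_elimination(ir, live_outputs):
    """Remove instructions whose results are never used.

    The pass still visits every instruction (O(n) in IR length),
    even when it ends up deleting very little.
    """
    live = set(live_outputs)
    kept = []
    # Walk backwards so uses are seen before their definitions.
    for dest, op, args in reversed(ir):
        if dest in live:
            kept.append((dest, op, args))
            live.update(a for a in args if isinstance(a, str))
    return list(reversed(kept))

ir = [
    ("a", "const", (1,)),
    ("b", "const", (2,)),
    ("c", "add", ("a", "b")),
    ("unused", "mul", ("a", "a")),  # dead: result never used
    ("d", "add", ("c", "b")),
]
print(dead_code_elimination(ir, live_outputs={"d"}))
```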
Other posters have pointed out that this is incorrect, but I wanted to give a bit of intuition as to how signals can be received when they are below the noise floor.
First, as a definition: "below the noise floor" means that the power of my signal at any given time is smaller than the power of the ambient noise in my channel, and usually this implies that you're only interested in a particular segment of the frequency spectrum (e.g. within a 10MHz band centered at 1.8GHz). If we were using simple frequency-shift keying or amplitude modulation, once the noise power exceeds the signal power there is basically no hope of recovering anything useful, as both of those demodulation schemes rely on obtaining instantaneous estimates of the frequency or amplitude of the signal of interest.
However, spread-spectrum methods make a time/frequency tradeoff, where the signal of interest is "spread" across multiple points in time and frequency. A very simple example: if I want to transmit a 1, instead of transmitting one cycle of a sinusoid at 1.8GHz, I transmit 10 cycles. Then, at the decoder stage, you average across 10 cycles of your carrier in order to detect whether a signal was sent or not. By doing this averaging across time, you get a ~10x gain versus the noise, which is expected to cancel itself out as often as not.
True spread-spectrum techniques are more advanced than this: they use wave shapes more complicated than a plain sinusoid to make it easier to detect when a transmission starts and stops (with a sinusoid there's a fair amount of ambiguity if you shift one period to the left or right), but the fundamental idea of averaging across time is the same.
Through this mechanism we are able to rescue out signals from far below the noise floor, although it reduces your maximum transmission rate. When dealing with digital radio systems we can even rescue out signals from below our quantization floor, although not too much lower, as eventually you lose the ability to average out a signal that is fluctuating by significantly less than a single bit.
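A minimal sketch of the averaging idea (hypothetical numbers, stdlib only, not any real modulation scheme): each bit is repeated across many samples, the noise power is 100x the signal power (-20 dB SNR), and simple block averaging at the receiver still recovers the bits, because the averaged noise shrinks like the square root of the number of samples.

```python
import random

random.seed(42)

REPEATS = 1000    # each bit is "spread" across 1000 samples
NOISE_STD = 10.0  # noise power is 100x the signal power (-20 dB SNR)

def transmit(bits):
    """Spread each bit (as +1/-1) across REPEATS samples, plus channel noise."""
    samples = []
    for b in bits:
        level = 1.0 if b else -1.0
        samples.extend(level + random.gauss(0, NOISE_STD) for _ in range(REPEATS))
    return samples

def receive(samples):
    """Average each block of REPEATS samples and threshold at zero.

    The averaged noise has std NOISE_STD / sqrt(REPEATS) ~= 0.32,
    well below the +/-1 signal level, so detection is reliable.
    """
    bits = []
    for i in range(0, len(samples), REPEATS):
        block = samples[i:i + REPEATS]
        bits.append(sum(block) / len(block) > 0)
    return bits

sent = [random.random() < 0.5 for _ in range(100)]
recovered = receive(transmit(sent))
accuracy = sum(s == r for s, r in zip(sent, recovered)) / len(sent)
print(f"accuracy: {accuracy:.2f}")
```

Note the rate cost: each bit now occupies 1000 sample periods, which is exactly the tradeoff described above.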
Whenever I talk about making tradeoffs in transmission speed to aid in reception, I am reminded of the ELF systems in submarines [0]. While they did not use spread-spectrum techniques (they just jumped between two frequencies, 76Hz and 80Hz), they still correlated across time to boost their effective SNR.
[0] https://en.wikipedia.org/wiki/Communication_with_submarines#...
I ran into an issue like this in my first ever job! I accidentally filled up a cluster with junk files and the sysadmin started sending me emails saying I needed to fix it ASAP but rm wouldn’t work. He taught me that file truncation usually works when deletion doesn’t, so you can usually do “cat /dev/null > foo” when “rm foo” doesn’t work.
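A minimal sketch of that trick in Python (hypothetical file path; `os.truncate` has the same effect as redirecting /dev/null into the file, freeing the blocks in place without touching the directory entry):

```python
import os
import tempfile

# Create a large file to stand in for the junk that filled the disk.
path = os.path.join(tempfile.mkdtemp(), "huge.log")
with open(path, "wb") as f:
    f.write(b"x" * 1_000_000)
print(os.path.getsize(path))  # 1000000

# Truncate in place -- equivalent to `cat /dev/null > huge.log`.
os.truncate(path, 0)
print(os.path.getsize(path))  # 0
```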
However, sometimes filesystems can't do that. For those cases, hopefully the filesystem supports resizing (both grow and shrink) and either has additional temporary storage available or sits on top of an underlying system that can add/remove backing storage. You may also need custom commands to restore the filesystem's structure to one intended for a single block device (btrfs comes to mind here).
I was once in a situation, years ago, where a critical piece of infrastructure could irreparably brick itself with a deadlock unless it was always able to write to the file system, so I had a backup process periodically write garbage directly to /dev/null, and as far as I know that dirty hack is still running years later.
Although note that several comments here report situations where truncation doesn't work either. 21st century filesystem formats are a lot more complex than UFS, and with things like snapshotting and journalling there are new ways for a filesystem to deadlock itself.
I accidentally filled a ZFS root SSD with a massive samba log file (samba log level set way high to debug a problem, and then forgot to reset it), and had to use truncate to get it back.
I knew that ZFS was better about this, but even so I still got that "oh... hell" sinking feeling when you really bork something.
Yes, that’s what it means for me. I’ve never heard someone use the word arbitrary to mean anything other than “a random choice”, or even “a poorly thought-out choice”.
My professors in grad school explicitly discouraged use of that word anywhere in technical writing, as they felt it would immediately give the reader the impression that the actions taken in the research were not thought through. Example: “This new technique enables arbitrary manipulations of data” should instead be replaced by something like “this technique enables a wide range of manipulations of data”.
I’m not convinced by the argument in your second paragraph. It actually makes it seem like arbitrary means exactly “based on judgment or choice” if it can be replaced so easily with “a wide range of”. How was the “wide range” chosen, if not arbitrarily?
It means something like "any randomly chosen manipulation is possible", or "so many possibilities that I don't feel like listing them, anything you can think of will probably work".
> I’ve never heard someone use the word arbitrary to mean anything other than “a random choice”, or even “a poorly thought-out choice”.
That's shocking, I use it to mean "the result of a judgement or decision" about a dozen times a day, such as "it's not random, it's arbitrary". I had no clue people had an alternative definition for it. I'm even more surprised that otherwise ostensibly-educated people have no clue about the traditional definition.
> Can you find any recent dictionary with your definition, or modern printed example of the word used in this way?
Sure, https://www.wordnik.com/words/arbitrary: "Based on or subject to individual judgment or preference." is the second definition on the page. You clearly didn't even bother googling.
Regardless, I'd argue that arbitrary as distinct from "random" only has one definition—the one where judgement, choice, or preference is exercised. This is the useful way to use the word, hence my shock. Especially in a technical context where the connotation of petty abuse of judgement makes no sense.
Finally, there are cases where random is indistinguishable from arbitrary. For instance, if you're defining a pure computation on some integer, the distinction between an arbitrary input and a random input is meaningless.
Throwing the word away (as I interpret your sentiment to argue for) seems like the worst possible interpretation and just as likely to lead to confusion.
I don't think it means quite the same thing as random.
"The judge is giving out random sentences" means the judge is rolling a die or something, literally randomizing each sentence.
"The judge is giving out arbitrary sentences" means the judge is sentencing based on how they feel in the morning, or the opinion of the last person they talked to. The decisions are not random, but they aren't based on any coherent set of rules or logical framework. The judge could have made a different decision and it would have made just as much (or just as little) sense.
Another common usage is calling something an "arbitrary distinction". For example, skyscrapers are often defined as buildings that are at least 100 meters tall. That is an arbitrary distinction, in that there is no significant difference between a 99 meter and a 101 meter building. It's obviously not random, it was picked because 100 is a nice round number, but when we say it's arbitrary we mean we could have drawn the line at any other number and it would have worked fine. In fact, some people define skyscrapers as being at least 150 meters tall, and there is no logical reason that either of these numbers is better. They are both arbitrary, and saying that "your 23-story building is a high-rise but my 24-story building is a skyscraper" is making an arbitrary distinction.
So back to Nate Silver's quote:
> I don’t think it’s quite right to say these decisions are arbitrary. Ideally they’ll reflect a statistician’s judgment, experience and familiarity with the subject matter.
If these decisions were arbitrary, that would mean the statistician isn't making educated choices. They're thinking "I just saw a cool article on this regression technique, let me try that", or "my favorite programming language is good at X technique", or "I've been wanting to practice Y technique". When asked why they made a particular decision, the statistician might not have a logical explanation.
If the decisions are not arbitrary, that would mean that other statisticians are likely to agree with the decision, or at least understand the logic behind it.
If I heard someone say that I would think they're making the point only that it was based on personal whim and not pure statistical chance. I wouldn't interpret it to mean that the person's whims are in any way reasonable, just that they're theirs.
I really like the integration of lower-level memory control in a superset of Python. Trying to maintain compatibility with such a large and varied ecosystem is a daunting task, and has the opportunity to be very valuable for many people who don't want (or are unable) to move away from the Python ecosystem. Kudos to you all for tackling such a difficult problem! Good luck, I look forward to seeing how you guys help to increase efficiency across the board!
It is a little disappointing that they're setting the bar against vanilla Python in their comparisons. While I'm sure they have put massive engineering effort into their ML compiler, the matmul demos they showed are not that impressive in an absolute sense. Compare with the analogous Julia code, which uses [LoopVectorization.jl](https://github.com/JuliaSIMD/LoopVectorization.jl) to automatically choose good defaults for vectorization, etc.:
    julia> using LoopVectorization, BenchmarkTools, Test

    function AmulB!(C, A, B)
        @turbo for n = indices((C, B), 2), m = indices((C, A), 1)
            Cmn = zero(eltype(C))
            for k = indices((A, B), (2, 1))
                Cmn += A[m, k] * B[k, n]
            end
            C[m, n] = Cmn
        end
    end

    M = K = N = 144; A = rand(Float32, M, K); B = rand(Float32, K, N); C0 = A * B; C1 = similar(C0);
    AmulB!(C1, A, B)
    @test C1 ≈ C0
    2e-9 * M * K * N / @belapsed(AmulB!($C1, $A, $B))
    96.12825754527164
I'm able to achieve 96 GFLOPS on a single core (Apple M1) or 103 GFLOPS on a single core (AMD EPYC 7502). And that's not even as good as what you can achieve using e.g. TVM to do the kind of scheduling exploration that Mojo purports to do.
Perhaps they have more extensive examples coming that showcase the capabilities further. I understand it's difficult to show all strengths of the entire system in a short demonstration video. :)
EDIT: As expected, there are significantly better benchmarks shown at https://www.modular.com/blog/the-worlds-fastest-unified-matr... so perhaps this whole discussion truly is just a matter of the demo not showcasing the true power of the system. Hopefully achieving those high performance numbers for sgemm is doable without too much ugly code.
Yeah I think no one will likely have any edge for a simple thing like a matrix multiplication since all the right abstractions are supported in both languages and they end up in the LLVM code gen. Having Python 3 backwards compatibility and easily deploying your code to, say, phones via a C++ API is quite big though.
It seems you are using N=144 where in the Modular example they are doing N=1024, so the calculation in this Julia example is significantly less computationally expensive.
> The Julia correctness issues are things like basic math functions, called in simple normal ways, returning wrong numbers.
If you have an example of this, I'd be interested in tracking it down. I didn't see this in Yuri's blog post, nor am I aware of any egregious examples of this right now.