In the HN comment thread that the article discusses [0], is the conclusion that commenter a1369209993 is correct (there are as many floats between 0 and 1 as between 1 and +INF) and that llm_trw is not? I got a bit confused.
Also, the article links to a blog by Daniel Lemire [1] in which he says (with regard to producing an unbiased random float) "picking an integer in [0,2^32) at random and dividing it by 2^32, was equivalent to picking a number at random in [0,1)" is incorrect and there is a ratio of up to 257:1 in the distribution so obtained. Not wanting to disagree with Daniel Lemire but I can't see why, and a quick experiment in Python didn't give this ratio.
The blog post explained it perfectly. There are 2^32 integers when you pick from [0,2^32), but there are only 0x3f800000 single-precision floating point numbers in [0,1). The former is not divisible by the latter, so dividing by 2^32 cannot be unbiased.
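One way to see where that 0x3f800000 figure comes from (this snippet is just my illustration, not from the blog post):

```
import struct

# Bit pattern of 1.0 as a 32-bit float: every non-negative pattern below this
# encodes a float in [0, 1), so this *is* the count of such floats.
count = struct.unpack("<I", struct.pack("<f", 1.0))[0]
print(hex(count), count)   # 0x3f800000 1065353216
print((2**32) % count)     # non-zero, so 2^32 inputs can't map to them evenly
```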
It's helpful to look at smaller examples first. If we generate random integers in [0,50) and then divide by 5, that's a valid way to get uniform integers in [0,10): exactly 5 inputs map to each output (the numbers [0,5) map to 0, [5,10) map to 1, and so on). But what if you use the same division trick to get numbers in [0,3)? Since 50 isn't divisible by 3, the buckets can't all be the same size, and the probability of the number 2 appearing is less than that of 0 or 1.
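A quick sketch of that in Python, using floor division to map [0,50) down to [0,3):

```
from collections import Counter

# 50 inputs can't be split evenly into 3 buckets, so one bucket ends up smaller.
counts = Counter(x * 3 // 50 for x in range(50))
print(counts)   # Counter({0: 17, 1: 17, 2: 16}) -- 2 comes up slightly less often
```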
Yes. The usual way to do this is to bind them as argument defaults, for example:
def loop():
    for number in range(10):
        def func_w_closure(_num=number):
            return _num
        yield func_w_closure
This works because default arguments in Python are evaluated exactly once, at function definition time. So it's a way of effectively copying ``number`` out of the closure[1] and into the function definition.
[1] side note: closures in Python are always late-binding, which is what causes the behavior in the OP
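A quick check with the generator above:

```
funcs = list(loop())
print([f() for f in funcs])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] -- without the default-argument trick,
# every function would have returned 9.
```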
I’ve never thought that leaking this type of implementation detail into the return value (and return type!) was a nice solution. I like the double closure better, and one can shorten it a bit with a lambda.
For those who prefer a functional style, functools.partial can also solve this problem.
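For example, reusing the loop from above (the helper name `identity` here is just for illustration):

```
from functools import partial

def identity(n):
    return n

def loop():
    for number in range(10):
        # partial binds the current value of `number` immediately,
        # rather than closing over the loop variable.
        yield partial(identity, number)

print([f() for f in loop()])   # 0 through 9, not ten 9s
```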
(I use Python, and I like a lot of things about Python, but I don’t like its scoping rules at all, nor do I like the way that Python’s closures work. I would use a double lambda.)
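A double-lambda version might look like this (the outer lambda is called immediately with the current value; the inner one closes over a fresh binding each iteration):

```
def loop():
    for number in range(10):
        yield (lambda n: (lambda: n))(number)

print([f() for f in loop()])   # 0 through 9, as intended
```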
It doesn't work because of the way Python's variables are scoped. Your fixed_number variable is still shared across all instances of inner. Python doesn't have any sort of block scoping like you seem to think it has.
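A minimal sketch of the kind of code being discussed (the names `fixed_number` and `inner` come from the comment; the rest is assumed):

```
def loop():
    funcs = []
    for number in range(10):
        fixed_number = number    # re-assigns the *same* function-local variable
        def inner():
            return fixed_number  # every inner() closes over that one variable
        funcs.append(inner)
    return funcs

print([f() for f in loop()])     # [9, 9, 9, ...] -- not 0 through 9
```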
It may be different on newer macOS, but on mine you will need ping6
> ping6 ff02::1%en1
for example.
I wasn't able to get it to work on Windows (I can get the command to work in WSL but it seems to sit alone in an internal network). Any advice welcome.
Wasn't it that you have to specify the interface with something like -i in Windows? Or was that Nmap maybe... I don't have Windows to test with here, but have you checked the --help or /? to see if it allows specifying an interface with a separate option?
It can also be that there are simply no other devices responding to this on your network, especially if those are also Windows (more info/options in my other comment https://news.ycombinator.com/item?id=40503155)
I'm on Windows and I have the same issue. I specified the interface with the % notation (it clearly does something, because if I put a wrong interface number, it gives me "general failure" instead of a timeout). The flag -i is for TTL, not interface.
However, if I do the same from my rpi, a few devices respond (my tv box, the rpi itself, etc)... including my windows machine!
Get your interface id first; you're looking for the IDX number (e.g. from `netsh interface ipv6 show interfaces`). There might be several.
ping ff02::1%LAN_INTERFACE_ID
So, example:
```
ping ff02::1%22
```
Windows ping doesn't play well with the firewall: the response packets won't be let through, so you need to disable your firewall to see systems responding.
Sadly, ping won't display the source address; it will just state that "ff02::1%22" responded. But if you look in Wireshark you can tell that the other systems on your network received and responded to the packet.
This doesn't help. Windows "ping" deals with IPv6 addresses just fine; it's the multicast part that seems to be the issue. If you try `ping ff02--1s20.ipv6-literal.net` (assuming your network interface is 20), the reply says that it's pinging ff02::1%20.
When I started my first job in 1995, the old hands were still using DEC ALL-IN-1 [0] from terminals on their desks but it was in the process of being phased out (for Microsoft Mail or something on Windows for Workgroups), so I never got to use it myself.
In my youth I had a ZX Spectrum, which had BASIC with line numbers and no renumber command. Sometimes when adding code I'd simply run out of line numbers, so I had to GO TO an unused block of line numbers, put the new code there, and GO TO just after the original code. I've never quite recovered from that.
My first job out of university was maintaining a FORTRAN IV program on a PDP-11. The only control structures in that language were IF..GOTO and the arithmetic GOTO. You can still write readable, half-decent code with that, with discipline. A DO loop is still superior for readability.
The horrors of BASIC, with its spaghetti of GO TO or its mess of PEEKs and POKEs, are a justification for permabanning that style of programming -- but a decade of typing in listings from magazines inspired the generation that brought us the web and pocket phones. Maybe it wasn't such a bad thing after all.
FORTRAN (no numbers!) had DO loops in the first release of the language. No subroutines, only statement functions, but there absolutely were DO loops! The ability of the compiler to perform what are now basic loop optimizations on them is what sold the users on automatic compilers.
Reminds me of ROM hacking. Overwriting an instruction inside a function with a branch to unused memory and jumping back later is an easy way to get extra space for the patch you want to write.
It's a consequence of being block-based as mentioned elsewhere, but interesting to note that cat'ing together bzip2 files gives a valid bzip2 file. That's the basis of pbzip2 [0] - it breaks the input file into chunks of 900K by default, compresses each chunk and then concatenates the compressed chunks. The individual chunks can be compressed in parallel if hardware allows.
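A quick illustration of that with Python's bz2 module (just a sketch of the idea, minus the parallelism):

```
import bz2

part1 = b"first chunk of data\n" * 1000
part2 = b"second chunk of data\n" * 1000

# Compress the chunks independently, then simply concatenate the outputs.
combined = bz2.compress(part1) + bz2.compress(part2)

# Decompressing the concatenation gives back the original data in one piece.
assert bz2.decompress(combined) == part1 + part2
```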
gzip isn't block based by default, but it does effectively support a dictionary reset command in the compressed stream. This “command” is essentially the start of the gzip header, so if you cat two bits of gzipped data together, decompressing the result gives the same data as the source streams concatenated. This means you can turn gzip into a block-based process and therefore parallelise it in the same manner as bzip2, and this is how pigz⁰¹ works.
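The same cat trick in Python's gzip module, for illustration:

```
import gzip

part1 = b"data from stream one\n" * 1000
part2 = b"data from stream two\n" * 1000

# Two complete gzip members, simply concatenated...
combined = gzip.compress(part1) + gzip.compress(part2)

# ...decompress as if the inputs had been concatenated before compression.
assert gzip.decompress(combined) == part1 + part2
```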
This dictionary reset trigger is how the “rsyncable”² option³ is implemented too. Resetting the compression dictionary this way every 1000 input bytes increases the size of the compressed output by surprisingly little⁴.
[1] I actually started making my own version of this, way back when, inspired by looking into how gzip's rsyncable option² worked, before discovering it already existed! I “finished” my version as far as a working PoC, though, as it was an interesting enough exercise.
[3] also supported by pigz⁰ where it is used within each block it compresses, though because it splits the input at regular intervals anyway (instead of a more dynamic approach) its output is naturally already more rsync compatible than plain gzip (though with the default 128KiB block size, notably less so than with the reset every 1000 input bytes)
[4] usually between 1% and 3% IIRC, depending on input content of course, for some inputs the difference could be lower than that range, or much higher
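If you want to get a feel for that size cost yourself, here is a rough sketch using zlib's Z_FULL_FLUSH, which resets the deflate dictionary. This isn't how gzip's rsyncable option is literally implemented, and the sample data below is just a stand-in:

```
import random
import zlib

random.seed(0)
# Stand-in input: lines of pseudo-random numbers; any real file is a better test.
data = "\n".join(str(random.random()) for _ in range(100000)).encode()

def deflate_size(payload, reset_every=None):
    """Size of the deflate stream, optionally resetting the dictionary
    (full flush) every `reset_every` input bytes."""
    c = zlib.compressobj(9)
    out = b""
    if reset_every is None:
        out += c.compress(payload)
    else:
        for i in range(0, len(payload), reset_every):
            out += c.compress(payload[i:i + reset_every])
            out += c.flush(zlib.Z_FULL_FLUSH)  # dictionary reset point
    return len(out + c.flush())

plain = deflate_size(data)
reset = deflate_size(data, reset_every=1000)
print(f"no resets: {plain} bytes, reset every 1000 bytes: {reset} bytes "
      f"(+{(reset - plain) / plain:.1%})")
```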
[0]: https://news.ycombinator.com/item?id=41112688
[1]: https://lemire.me/blog/2017/02/28/how-many-floating-point-nu...