
Find and replace "Equity" with "1337"?


It would be interesting to see folks align on new words to mean the same thing.


I enjoyed this, but I am a little stuck on how stuck the author is on yield. If one ends up creating a large list of prime numbers with this generator, why not just return a large list of prime numbers? The author clearly notes how optimizing Python code is "odd" when you can more easily switch languages, but languages like C/C++/Rust would not have such pretty code because they do not have yield. It is obvious that returning an array of prime numbers is what one would do in C/C++/Rust.

I understand that not using yield would mean it is not strictly a "generator", but the spirit of this problem is generating the sequence of primes (unless I am missing something).
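For what it's worth, the two styles being contrasted look roughly like this in Python (both are naive sketches for illustration, not the article's sieve):

    def primes_upto(limit):
        """Return all primes below `limit` as a list (needs the limit upfront)."""
        sieve = [True] * limit
        out = []
        for n in range(2, limit):
            if sieve[n]:
                out.append(n)
                for m in range(n * n, limit, n):
                    sieve[m] = False
        return out

    def gen_primes():
        """Yield primes one at a time, with no limit known in advance."""
        n = 2
        while True:
            if all(n % p for p in range(2, int(n ** 0.5) + 1)):
                yield n
            n += 1

The list version is the natural shape when you know the bound; the generator is the natural shape when you don't.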


> but languages like C/C++/Rust would not have such pretty code because they do not have yield

Maybe I'm misunderstanding you, but Rust has native generators with yield: https://doc.rust-lang.org/beta/unstable-book/language-featur...

C++ also does using coroutines + co_yield (https://en.cppreference.com/w/cpp/language/coroutines#co_yie...) or iterators (old syntax).


Try solving some Project Euler problems. You'll see the win pretty quickly when you have to search a large range for primes.


Why would that be true?


The list you need to process is large enough that it would be prohibitive to put it all in memory at once. Furthermore you do not start knowing things like how many primes to look at. Therefore generating and dealing with them iteratively makes for simple code and a lot fewer headaches.
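A sketch of that pattern in Python, with a naive trial-division generator standing in for the article's sieve:

    from itertools import takewhile

    def gen_primes():
        """Naive unbounded prime generator (trial division), as a stand-in."""
        n = 2
        while True:
            if all(n % p for p in range(2, int(n ** 0.5) + 1)):
                yield n
            n += 1

    # Project-Euler-style consumption: sum every prime below a bound without
    # ever holding the whole list in memory, and without knowing in advance
    # how many primes that turns out to be.
    total = sum(takewhile(lambda p: p < 10_000, gen_primes()))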


Why wouldn't someone just make a C++ class that keeps some state and call a method that gives them as many primes as they want, but runs 100x faster than python? Why would python yield somehow be better than this?


Project Euler is always about finding the right algorithm, and not raw performance.

With the right algorithm you can solve it in Python in a max of 30 seconds. With the wrong one, you sometimes can't solve it in C++ in the lifetime of the universe.

Therefore you should program in the language that you find it easiest to express yourself in. And not in a language chosen for raw speed.


This is what you said before:

> You'll see the win pretty quickly when you have to search a large range for primes.

This isn't just shifting the goalposts; you are now talking about something completely different.

> With the right algorithm you can solve it in Python in a max of 30 seconds

Then you can do the same thing in C++, but the C++ program will finish running 100x faster. A lot of modern C++ is as clean as Python, just more explicit about types.

> With the wrong one, you sometimes can't solve it in C++ in the lifetime of the universe.

Nothing about this makes any sense.

> Therefore you should program in the language that you find it easiest to express yourself in. And not in a language chosen for raw speed.

If you continue on with python and need more speed you will run around in circles trying to get it while the C++ version is already done. Anything where speed can be a bottleneck is not fit to be written in pure python.

This pattern plays out over and over again. If something is resource intensive, a scripting language is not a long term solution.


Are you actually familiar with Project Euler?

https://projecteuler.net/

Absolutely none of your arguments make sense in the context of a series of programming puzzles of fixed size. Back in the real world, there is a real need for exploratory programming on mathematical problems. I'd prefer to use Python for that. If your problems are statistical, R is a better choice. C++ is a poor fit for exploratory programming.

C++ is a good choice for serious mathematical computing. But not always the best. https://www.hpcwire.com/2020/01/14/julia-programmings-dramat... shows the growing popularity of Julia, even for the most demanding computations.

Looking to the future, I would choose Rust over C++ for most new projects where people currently use C++. It seems bizarre to me that C++ evangelists can talk about type safety and ignore memory safety, when memory safety is a giant problem for security and correctness in any large program.

So, yeah. C++ is great for what it's great for. But if I'm going some exploratory back of the envelope calculations, I'm still reaching for Python. And yield is one of the reasons why.


> C++ is a poor fit for exploratory programming.

It works fine for me. Are you sure this isn't just because you don't have a lot of practice with C++?

> It seems bizarre to me that C++ evangelists can talk about type safety and ignore memory safety, when memory safety is a giant problem for security and correctness in any large program.

This has nothing to do with the current thread, but this isn't really a problem in modern C++. Are you sure you are up to date with modern C++? Most programs don't look too different from Python, thanks to range-based for loops and auto type deduction.


sigh

I have no idea why you are trying to evangelize C++ here. I use a wide variety of languages for different purposes. For Project Euler type tasks I switched to Python about 15 years ago because of yield, and the convenience of using objects as hash keys. If you make different choices and they work for you, then have fun.


double sigh

You seem to be saying there is some advantage to python because of the yield feature and I just don't think that's true. You can keep state in a class and call a method. There are a lot of disadvantages though.
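For comparison, here is the class-with-state idea spelled out in Python: the same stream as an explicit iterator class rather than a generator frame (a naive trial-division sketch, not the article's algorithm):

    class Primes:
        """Iterator that keeps its sieve state in explicit attributes,
        the way a yield-less language would have to."""
        def __init__(self):
            self.found = []
            self.n = 1

        def __iter__(self):
            return self

        def __next__(self):
            self.n += 1
            while any(self.n % p == 0 for p in self.found):
                self.n += 1
            self.found.append(self.n)
            return self.n

It works, but the bookkeeping that yield does for free (where execution paused, what the locals were) is now the programmer's problem.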

triple sigh


The generator is already maintaining a dictionary containing multiple lists that hold all the primes found so far. Even if it instead kept yielding a list of all the primes so far (rather than the most recent prime) it would take less than twice the memory usage.

I'd say the flexibility to not need to know how big of a prime you'll need is the main draw.


I obviously have a version without the memory issue.

The programming flexibility is indeed the win.

    def primes():
        yield 2
        yield 3
        f = {}  # maps next odd multiple to cross off -> 2 * its prime factor
        n = 5
        # Recursively iterate a second primes() generator; it only ever needs
        # to produce primes up to sqrt(n), so the recursion stays shallow.
        for p in primes():
            if p == 2:
                continue
            p2 = p * p
            while n < p2:
                if n in f:
                    # n is composite: slide this prime's marker forward to
                    # its next odd multiple that isn't already claimed.
                    double_p = f.pop(n)
                    m = n + double_p
                    while m in f:
                        m += double_p
                    f[m] = double_p
                else:
                    yield n
                n += 2
            # Start crossing off multiples of p once n reaches p*p.
            f[n] = 2 * p


This allocates a dictionary and the primes() generator for every single prime generated though, so it still has the memory issue, unless there's something I'm missing about how it works?


Put a debugging statement in and you'll verify that the dictionary only contains primes up to the square root of the one just generated.

To do that it recursively contains a second prime generator that only produces up to the square root. Which contains one that produces up to the fourth root. And so on until you get down to 3. The work and memory for those recursive generators is a rounding error compared to the initial prime generator. The work and memory saved by not keeping unnecessary primes is a significant win.
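A sketch of that debugging statement, reproducing the generator above with a `record` parameter (added here purely for instrumentation) that logs the dictionary size at each yield:

    from itertools import islice

    def primes(record=None):
        """The recursive sieve from the comment above, instrumented."""
        yield 2
        yield 3
        f = {}
        n = 5
        for p in primes():  # inner generator only runs up to sqrt(n)
            if p == 2:
                continue
            p2 = p * p
            while n < p2:
                if n in f:
                    double_p = f.pop(n)
                    m = n + double_p
                    while m in f:
                        m += double_p
                    f[m] = double_p
                else:
                    if record is not None:
                        record.append(len(f))
                    yield n
                n += 2
            f[n] = 2 * p

    sizes = []
    first_1000 = list(islice(primes(sizes), 1000))
    # The dictionary only ever holds one entry per odd prime below
    # sqrt(n), so it stays tiny even as the primes themselves get large.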


Oh, thanks for the explanation, I get how it works now. I hadn't seen this trick for an unbounded prime sieve before, and it's a nice idea.


> If one ends up creating a large list of prime numbers with this generator

But one does not always end up creating a large list of prime numbers, does one?

> why not just return a large list of prime numbers?

Because that requires upfront knowledge of the filtering factor or absence thereof.

A generator means you don't care, you generate an infinite sequence and the consumer is free to do whatever they want.
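To illustrate the consumer-decides point in Python (with a naive trial-division generator standing in for the article's):

    from itertools import islice, takewhile

    def gen_primes():
        """Naive unbounded prime generator (trial division), for illustration."""
        n = 2
        while True:
            if all(n % p for p in range(2, int(n ** 0.5) + 1)):
                yield n
            n += 1

    # Three different consumers of the same infinite producer; none of them
    # required the producer to know any limit in advance.
    first_ten = list(islice(gen_primes(), 10))
    below_100 = list(takewhile(lambda p: p < 100, gen_primes()))
    first_above_1000 = next(p for p in gen_primes() if p > 1000)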

> It is obvious that returning an array of prime numbers is what one would do in C/C++/Rust.

It's certainly not obvious for Rust.


"I can't try to rewrite it in C++ or Rust for now, due to the lack of generator support; the yield statement is what makes this code so nice and elegant, and alternative idioms are much less convenient."


That is the context of the first sentence in your comment, pretty much the only part I left alone.

Everything I replied to is the opinion you personally expressed.

And note that the author had already preempted your comment in the first place:

> When we want a list of all the primes below some known limit, gen_primes_upto is great, and performs fairly well. There are two issues with it, though:

> We have to know what the limit is ahead of time; this isn't always possible or convenient.


If the problem is outputting the primes, then storing them in a list is not a necessary part of that problem.

When writing ‘fizz buzz’, would you expect to return an array of 100 strings, some of which are numbers, some ‘fizz’es and some ‘buzz’es?
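The streaming version of that analogy is itself a small generator sketch:

    def fizzbuzz():
        """Yield fizz buzz terms one at a time instead of building a list."""
        n = 1
        while True:
            if n % 15 == 0:
                yield "fizzbuzz"
            elif n % 3 == 0:
                yield "fizz"
            elif n % 5 == 0:
                yield "buzz"
            else:
                yield str(n)
            n += 1

Printing the first 100 terms is then just `for term in itertools.islice(fizzbuzz(), 100): print(term)`; no 100-element array ever exists.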


It's a bit contrived, but let's say we've got some iterator of unknown size and we want a prime for each iteration. We can do the following in Python, which saves us from having to pre-compute the primes _or_ know how long `some_iterator` is.

    primes = gen_primes()
    for item, prime in zip(some_iterator, primes):
        ...


This implementation runs until it crashes. So you need a way to output the values as you go, or they will be lost when you run out of memory. It also lets you control how many values are generated if you just want to see a few without actually using all the RAM.


there is no silver bullet

