Hacker News | menaerus's comments

Right? If this is really true, that some random person without compiler engineering experience implemented a completely new feature in the OCaml compiler by prompting an LLM to produce the code for them, then I think it really is remarkable.

First, you have to prove that it produced copyrighted code. The question is what counts as copyrighted code in the first place. A literal copy-paste from the source is easy to identify, but I think 99% of the time that isn't the case.

Information 30 years ago was more difficult to obtain. It required manual labor, but by today's standards there wasn't much information to consume. Today we have the opposite: a vast amount of information that is easy to obtain but hard to process. Decline is unavoidable. Human intelligence isn't increasing at the pace at which these advancements are made.

People do it with autocomplete as well, so I guess there's not that much of a difference with LLMs. It likely depends on the language, but people who are inexperienced in C++ tend to over-rely on autocomplete to the point that it looks hilarious, if you ever get the chance to sit next to them and help debug something, for example.

You know, I had a potential hire last week. I was interviewing a guy whose resume was really strong, exceptional in many ways, and his open-source code looked really tight. At the beginning of the interview I always show candidates the same silly code example with signed integer overflow undefined behavior baked in. I did the same here and asked him if he saw anything unusual about it, and he failed to spot it. We closed the round immediately and I gave a no-hire decision.
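For illustration, something roughly along these lines (not the exact snippet, just a minimal sketch of that kind of gotcha):

    #include <climits>
    #include <cstdio>

    // Looks like a harmless "next id" helper. If counter == INT_MAX, the
    // increment overflows a signed int, which is undefined behavior in C++.
    // Because the compiler may assume signed overflow never happens, it is
    // also allowed to optimize the "overflow check" below away entirely.
    int next_id(int counter) {
        int next = counter + 1;   // UB when counter == INT_MAX
        if (next < counter) {     // check the optimizer may legally delete
            return 0;
        }
        return next;
    }

    int main() {
        std::printf("%d\n", next_id(INT_MAX));
    }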

Does the ability to verbally detect gotchas in a short conversation, dealing only with text on a screen or a whiteboard, really map to stronger candidates?

In actual situations you have documentation, an editor, tooling, and tests, and you are a tad less distracted than in a job interview with all its attendant stress. Isn't the fact that he actually produces quality code in real life a stronger signal of quality?


It's bias, and from my experience many people do not know how to assess an interviewee in a way that brings out their best. My example was luckily just a contrived one that sarcastically portrays how people nowadays assess LLM capabilities too. No difference.

Flamegraph is literally just a Perl script that visualizes the stack traces collected by perf/dtrace (the kernel). It's a good tool, but it doesn't need to be oversold; the hard work is done by the kernel. And honestly, many times it is not that useful at all and can be quite misleading, not because of a bug in the tool but because of how CPUs are inherently designed to work.

Everything is just a script with some visualization once you come up with the concept.

What concept in particular? There is nothing novel about that tool; it visualizes the stats collected by perf, and as I said it's not even that useful for root-cause analysis of performance regressions, which is the main thing it is marketed for.

I am confused. Out of curiosity, what do you mean by 5 seconds being far longer than your entire build for most projects? That sounds crazy low.

It's not crazy, it's just what happens if you write mostly C with some conveniences where they actually make sense instead of "modern C++". I generally write very performance-sensitive code, so it's naturally fairly low on abstraction, but most of my projects take between one and two seconds to build (that's a complete rebuild with a unity build; I don't do incremental builds). Those that involve CUDA take a bit longer because nvcc is very slow, but I generally build kernels separately from (and in parallel with) the rest of the code and just link them together at the end.
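For context, a unity build here just means a single translation unit that #includes all the source files, so a full rebuild is one compiler invocation. A hypothetical sketch, with made-up file names (the same idea works for plain C):

    // unity.cpp -- one translation unit pulls in every source file, so a
    // complete rebuild is a single compiler run:
    //   c++ -O2 unity.cpp -o app
    #include "allocator.cpp"
    #include "parser.cpp"
    #include "renderer.cpp"
    #include "main.cpp"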

Sure, C++ is heavy to compile, there's simply more for the compiler to do, but a code repository that builds in under 5 seconds is at the very low end of the tail, so making a point about someone putting up with a build that takes 5 seconds longer is sort of moot.

I have written a lot of plain C and a lot of C++ (cumulatively probably close to a MLoC) and I can't remember any C codebase that compiled in such a short time unless it was library code or some trivial example.


I can't comment on this particular implementation, but a few years back I played around with a similar idea, so not quite a 1-to-1 mapping, but the conclusion I derived from my experiments was the same: it is allocation-heavy. The code was built around similar principles, with type erasure on top of futures/promises (there were no coroutines in C++ back then) and work-stealing queues. The code was quite lean, although not as feature-packed as folly, and there was nothing obvious to optimize apart from lock contention and dynamic allocations. I did a couple of optimizations on both of those ends back then, and they seemed to confirm the hypothesis of heavy allocations.
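To give a rough idea of where the allocations tend to come from in that style of design, a minimal sketch with made-up names (not the implementation being discussed): every promise/future pair owns a heap-allocated shared state, and attaching a type-erased continuation typically costs another allocation.

    #include <cstdio>
    #include <functional>
    #include <memory>
    #include <utility>

    // Two typical allocation sources in this style of code:
    //   1) every promise/future pair heap-allocates a shared state;
    //   2) std::function heap-allocates when the captured continuation is
    //      larger than its small-buffer optimization.
    template <typename T>
    struct SharedState {
        std::function<void(T)> continuation;  // type erasure -> potential heap allocation

        void set_value(T value) {
            if (continuation) continuation(std::move(value));  // simplified: ignores ordering/races
        }
    };

    template <typename T>
    struct MiniPromise {
        std::shared_ptr<SharedState<T>> state =
            std::make_shared<SharedState<T>>();  // one allocation per promise/future pair

        template <typename F>
        void then(F&& f) { state->continuation = std::forward<F>(f); }

        void set_value(T value) { state->set_value(std::move(value)); }
    };

    int main() {
        MiniPromise<int> p;
        char big[256] = {};  // large capture -> std::function falls back to the heap
        p.then([big](int v) { std::printf("%d %d\n", v, big[0]); });
        p.set_value(42);
    }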

~20 tokens/second is actually pretty good. I see he's using the q5 version of the model. I wonder how it scales with larger contexts. And the same guy published a video today with the new 3.2 version: https://www.youtube.com/watch?v=b6RgBIROK5o

He's talking about a completely different type of risk and regulation: job displacement risks, security and misuse concerns, and ethical and societal impact.

https://www.youtube.com/watch?v=aAPpQC-3EyE

https://www.youtube.com/watch?v=RhOB3g0yZ5k

