Hacker News | scrollaway's comments

ChatGPT, basically within 48 hours of its release.

While people were pointing out on Twitter how it couldn't do math right, I was turning arbitrary English instructions into JSON and brainstorming with my colleagues how we could have layers of verification in the stack. This felt different. We had all played with AI Dungeon but suddenly, fully generalized systems were within reach.
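To make "layers of verification" concrete, here is a minimal sketch of what one such stack could look like for model-produced JSON. It is an illustration only, not what we actually built: the field names in `EXPECTED_FIELDS` and the function `verify_model_output` are hypothetical, and the model call itself is left out.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
EXPECTED_FIELDS = {"action": str, "quantity": int}


def verify_model_output(raw: str) -> dict:
    """Return the parsed object, or raise ValueError if any check fails."""
    # Layer 1: the text must parse as JSON at all.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc

    # Layer 2: the top-level value must be an object.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    # Layer 3: every expected field is present and has the right type.
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"wrong type for field: {name}")

    return data
```

Each layer rejects a different class of bad output, so a failure message tells you which check to retry or tighten.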

A month later, we renamed our company and shifted its full focus to AI R&D. (https://ingram.tech/)


> Once built, what does one do with this?

Are you asking about the church or the Lego set?


Agreed. The noise in tech circles often gets founders to conflate ten different things into a product that no longer makes sense. “EU-made alternative to Kagi”? Cool, we need European search engines, sign me up. “Privacy is such a priority we’re looking to accept cash by mail”? Okay, you’re never gonna build a serious competitor, never mind.

Yeah, Mullvad, which accepts cash in the post, is not a serious VPN provider at all, right.

You seem to not understand that a search engine and a VPN don't have the same audience, and certainly not the same needs for focus.

It's ok, we all have our flaws.


They are very comparable from a privacy standpoint IMO. A search engine and a VPN both get quite a lot of insight into your interests and browsing habits because a lot of browsing sessions start with a search.

> why can’t Mythos just fix all these issues itself if it’s so smart. And test them to make sure they work?

“Why”: because you didn’t ask it. It’s not its job in this case.

You don’t hire an accountant and tell them “why can’t you fix my cash-flow problems and make me money if you’re so smart”


Ah ok, sure. The difference being the model should know how to do both based on what I’ve been told.

So why didn’t Anthropic ask it for me?


Being charitable to Larry Ellison is one of those things one cannot physically do, like being entertaining to a dead whale.

Europe.

We fund science, research and we have accelerated programs for researchers affected by these kinds of things.

If you're interested, email me (see profile). I have been helping Americans emigrate to Europe (for free) for several years.


The cost improvements reached you, you just don't see them in the table quality.

You see them in the fact that every single home you'll visit to buy or rent has a fully equipped kitchen including a fridge, oven, likely a microwave, dishwasher and even a washing machine (which alone has a huge economic impact: https://www.youtube.com/watch?v=_gvsz_vc7B0)

You see it in the fact that your home is safer from fires than it ever has been. That hot water is a cheap passive thing you don't even think about, rather than something you have to plan for. That a TV is a nice add-on to it all, rather than a huge deal to get.

Your grandparents' table was more expensive because they had fewer things, and the massive wood table that they saved months for was what was kept and stood the test of time for you to see today. Because let's not forget, this is also what furnishing 100 years ago can look like:

https://hips.hearstapps.com/hmg-prod/images/furnishing-for-h...


The real problem is that today, you rarely can pay more to get better. If you pay 3x more for your appliances (TV, dishwasher, oven, etc...) you don't get something 3x more reliable/better engineered.

Because that requires manufacturers ready to give up stealth corner cutting as the cornerstone of their earnings in favour of the hard and long task of developing an image of reliability.

------

Three cases I know enough about: cars, loudspeakers and computer monitors.

You can still buy some Mazda/Toyota models to really get more thoughtful engineering and QC for your money, but the Germans with a similar image of quality (Mercedes, BMW) have partially or fully shed the underlying quality.

Genelec remains the only (non-PA) loudspeaker manufacturer you can sincerely trust to take reliability, performance and transparency seriously. There was also Klein + Hummel (K+H) but since being bought by Sennheiser and integrated with Neumann, things have been going downhill... to the point where some curious people found CapXon caps (bottom of the barrel) in their KH80s.

Computer monitors? Since Panasonic (Eizo's supplier of yore) exited the panel market and left it as LG vs Samsung, it's been a complete disaster. Oh, you wanna pay 1~2k $currency for a fancy OLED monitor? Get used to appalling panel QC (banding, uniformity), VRR flicker and DSC crap.

The available choice for "pay more to get better" continues to dwindle...


And when you do pay more, you're paying more to someone who has figured out how to make you think you are getting better quality, not to someone who is giving you better quality. This is the "market for lemons" effect.

"If you pay 3x more for your appliances (TV, dishwasher, oven, etc...) you don't get something 3x more reliable/better engineered."

You do at the bottom of unregulated markets. For dishwashers and ovens, safety regs generally impose a high floor on the market. There is no $40 oven, because it's physically impossible to make a safety-compliant oven for $40. If it weren't for market regulation, $40 death-trap ovens would be a thing for sure.

The very cheapest compliant unit isn't _much_ worse than a mid-market unit, it might be a bit flimsier and wear out sooner; high-end luxury units aren't much better than mid-market units - because there's not much innovation driving progress at the top end. AEG and Bosch are still generally solid engineering, but there's not much point in paying more than that unless you like the aesthetics.

Mercedes and BMW - small-volume performance models aside - are like the big fashion brands, Vuitton etc., they're selling the idea of luxury to people who aren't even nouveau-riche, more like borrowing money to cosplay loudly as nouveau-riche. Compare old 1970s Merc convertibles with today's, the modern ones are just kind of ugly, aggressive and sad.

ADAM Audio loudspeakers are pretty good or were last time I bought a pair. They're designed as studio monitors but great for listening too. Perhaps they've gone downhill since being bought by a listed company a few years ago?


>ADAM Audio loudspeakers are pretty good or were last time I bought a pair. They're designed as studio monitors but great for listening too. Perhaps they've gone downhill since being bought by a listed company a few years ago?

The Focusrite buyout (unless there was another after it) seems to have improved quality and transparency (i.e. publicly available official measurements for their current range). Still, performance remains lacking for the asking price of the A/S models; the A7V has a massive port resonance near 650 Hz, for example.

Interesting post about an old Adam engineer reminiscing about A5X issues: https://www.audiosciencereview.com/forum/index.php?threads/a...


Agreed; and more generally, Microsoft's online services are terrible. Their login system is a mess, their UX is awful... our company is a Microsoft partner but there's like 27 different ways to be one, with a bunch of different accounts, forms and systems for it. Azure UX is atrocious. And this nonsense spills into every single enterprise product they offer too (how many people complain about Teams?).

Here in Belgium, 80% of enterprise accounts use MS over Google and I genuinely don't get why. (Without getting into the fiasco of not really having an EU alternative to either of those)


> Here in Belgium, 80% of enterprise accounts use MS over Google and I genuinely don't get why. (Without getting into the fiasco of not really having an EU alternative to either of those)

Maybe because those enterprises already used on-prem AD? It's much "easier" to have a hybrid monstrosity combining on-prem AD and Azure AD than on-prem AD and Google (or anything non-MS, really). Plus, MS is already a supplier, so for large, bureaucratic entities, they already have a foot in the door.


Wtf is your problem with high school dropouts exactly?


No problem.


Tell me you don’t know how AI works without telling me you don’t know how AI works.


What are you talking about?


I’ll try to steelman this comment. Anyone who uses coding tools knows that the output is heavily affected by details of the task you give it. The same model can give you garbage code or genius code for the same problem with slightly different framing. So it’s not necessarily a limitation in the model’s training that causes it to output security bugs. The model might be great at writing secure code, but you need a different harness to elicit that behavior.

Counterargument: just because the problem can be fixed without training, doesn’t mean training isn’t a possible solution.


Counter-counter-argument: for LLMs, tokens are units of thinking. And token use is, on the margin, directly proportional to costs of inference. So while the details of the harness, and how you prompt the model, and the nature of the code and docs you put in context, etc. all matter to the quality of output you get from an LLM coding tool, ultimately, there's always a ceiling to how much you're willing to spend on solving a problem - say, no more than 30 minutes, or $10, on refactoring a target module or implementing a small feature - and that puts a limit on how much thinking the model can put into it.

Thing is, writing secure and efficient and readable and simple code is in many cases fundamentally over that limit. It's possible, but you can't afford (or rationally just don't want) to spend as much on it as is required for superhuman quality on all these aspects. Also most of the time, you don't want to operate at the limit - you probably expected that feature to take 30 seconds and less than $1 to implement. So you choose both what the model optimizes for, and how much.

Because of that, no matter how good the model and the harness and the prompting are, $10 spent on coding is still bound to leave behind some security vulnerabilities that a subsequent $10 spent on security review will find (especially with a model post-trained for that, at the expense of general performance).
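The arithmetic behind "a spending cap is a thinking cap" can be sketched in a few lines. This is a rough illustration under the assumption of a single flat per-token price; the function name and any price you plug in are hypothetical, and real pricing typically differs between input and output tokens.

```python
def max_tokens_for_budget(budget_usd: float, usd_per_million_tokens: float) -> int:
    """Upper bound on tokens a spending cap can buy at a flat per-token price."""
    if budget_usd < 0:
        raise ValueError("budget must be non-negative")
    if usd_per_million_tokens <= 0:
        raise ValueError("price must be positive")
    # Cost scales linearly with tokens, so the cap on spend is a cap on tokens.
    return int(budget_usd * 1_000_000 / usd_per_million_tokens)
```

At any fixed price, a $1 cap buys a tenth of the tokens a $10 cap does, which is the whole point: the budget you set bounds how much the model can think about the task.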


I guess I thought this should be obvious to everyone, but looking at code and finding exploits is completely different from... writing exploits.

For one thing exploits often require completely different parts of the code to chain together. Sometimes parts of code the LLM itself isn’t writing.

And, LLMs are ALREADY trained negatively against writing buggy or exploitable code.


It's just an incremental thing. You're both right. They will slowly become less and less likely to introduce vulns due to higher intelligence and better RL. Offensive capabilities will still probably scale faster than automatic defensive-while-coding ones.


>I guess I thought this should be obvious

People in this thread are talking past and misunderstanding each other and making unrelated points.

The point of the response to the top level comment was questioning the conflict of interest in model providers creating separate revenue streams for themselves by selling a product that fixes problems their other product created, akin to OS providers selling anti-virus software back in the day.

Similarly, it should be obvious to you that a software engineer can trivially get into the mindset of writing more exploitable code by pretending the production code they're tasked with writing is hobby code or prototype code.

If profitable revenue streams with adversarial products are in place, no one should be surprised when model providers are disincentivised to improve the "garbage code quality, but hey it works!" nature of their most used code generators.

>And, LLMs are ALREADY trained negatively against writing buggy or exploitable code.

...it should also be obvious people in this forum have wildly different experiences with respect to the code quality the LLMs they use generate. I personally find it difficult to find anyone that argues that the LLMs they are using are consistently generating high-quality code across a vast codebase.


In every prompt: "write me code without exploitable bugs".

As someone who uses AI for coding, I know it doesn't work that easily, but I do find that repeating the basics in almost every prompt keeps the AI focused.

