Hacker News | xrd's comments

I suppose this shows my laziness because I'm sure you have written extensively about it, but what orchestrator (like opencode) do you use with local models?

I've not really settled on one yet. I've tried OpenCode and Codex CLI, but I know I should give Pi a proper go.

So far none of them have been useful enough at first glance with a local model for me to stick with them and dig in further.


When you say you use a local model in OpenCode, do you mean through the Ollama backend? Last time I tried it with various models, I ran into issues where the model called tools in the wrong format.

I've used OpenCode, and the remote free models it defaults to aren't awful, but they're definitely not on par with Gemini CLI or Claude. I'm really interested in finding a way to chain multiple local high-end consumer Nvidia cards into an alternative to the big labs' offerings.

Kimi K2.5 is pretty good, you can use it on OpenRouter. Fireworks is a good provider, they were giving free access to the model on OpenCode when it first released.

What do you use as the orchestrator? By this I mean opencode, or the like. Is that the right term?

I'm basically using the agentic features of the Zed editor: https://zed.dev/agentic

It's really easy to set up with any OpenAI-compatible API. I self-host Qwen Coder 3 Next on my personal MBP using LM Studio and just dial in from my work laptop with Zed and Tailscale, so I can connect from wherever I might be. It's able to do all sorts of things: run linting checks and tests, look for issues, refactor code, create files, and so on. I'm definitely still learning, but it's a pretty exciting jump from just talking to a chatbot and copying and pasting things manually.
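For anyone curious what the "OpenAI-compatible API" part looks like: LM Studio's local server speaks the standard chat-completions format (it defaults to port 1234). Here's a minimal sketch of the request body; the Tailscale hostname and model identifier are placeholders, not real values.

```python
import json

# LM Studio exposes an OpenAI-compatible server (default port 1234); the
# hostname and model identifier below are placeholders, not real values.
BASE_URL = "http://my-mbp.tailnet.example:1234/v1"

def chat_request(prompt: str, model: str = "qwen3-coder-next") -> dict:
    """Build the JSON body any OpenAI-compatible /chat/completions endpoint accepts."""
    return {
        "model": model,  # whatever name LM Studio shows for the loaded model
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

body = chat_request("Run the linting checks and summarize any issues.")
print(json.dumps(body, indent=2))
```

Any client that can POST JSON to `BASE_URL` works, which is why editors like Zed can plug into a self-hosted model with nothing but a URL.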


I use the term "harness" for those - or just "coding agent". I think orchestrator is more appropriate for systems that try to coordinate multiple agents running at the same time.

This terminology is still very much undefined though, so my version may not be the winning definition.


Really fascinating to read this next to the Wikipedia page on Iran.

https://en.wikipedia.org/wiki/Iran

Very different account of Reza Shah:

  To his supporters, his reign brought "law and order, discipline, central authority, and modern amenities – schools, trains, buses, radios, cinemas, and telephones." However, his reign has been characterized as a corrupt police state which provided only surface level modernization.
Versus this:

  1941: Britain and the USSR jointly invaded and occupied Iran, forcing Reza Shah to abdicate and exiling him to South Africa. His 22-year-old son Mohammad Reza was installed as Shah — widely seen from day one as a Western-installed ruler.
A big difference in the Wikipedia article is the associated references. I love a sensationalized story, and the GitHub account seems more interesting to me, but perhaps Wikipedia is better sourced and has accounts from all perspectives.

But I wonder which one the Iranians adopt as "true," because that seems very important, as opposed to the history held by war-philic Americans.


That's me, I wrote it.

If you are genuinely interested in the history of the region, the saddest part is that it's not 'sensationalized' as you put it, but the opposite--it's incredibly well-documented. Even released CIA documents confirm the coup and other actions. I don't see much disagreement between Wikipedia and what I wrote.

And Iranians, like Americans, differ in their opinions of government. There are nearly 100 million Iranians in country (and many more outside).

And of course if you read what I wrote, it's fairly clear why many Iranians didn't want a religious nationalist government, and many left over the past decades. Many of them didn't want the Shah or his Israel-trained brutal secret police, either, and many more didn't want the country's economy sold wholesale to the West (as it was).


In all likelihood the same thing would be happening if the Iranian government wasn’t a religious or nationalist government.

I recently replaced a power supply to upgrade a GPU. I bought the power supply on Craigslist, so it had a jumble of cables and no manual. In the past I would have read an article that I would have found on one of those sites.

This time I conversed entirely with Gemini, sending pictures of the cables and of the components and the motherboard.

I'll not soon forget when I plugged in a cable incorrectly and sent an image of that cable to Gemini.

Gemini said "It is very important that you stop and unplug that cable immediately... Hopefully the power supply's safety precautions kicked in before any permanent damage occurred."

I know that Gemini was conversing with me using plagiarized information from all those sites. But, it was so much better to do this than to try to synthesize that in my brain by reading a bunch of articles.

I don't see a future for tech content because Gemini isn't paying the authors and they don't give me an option to direct payments to them either.


It's crazy to me that you'd trust the output of an LLM for that. It's something where if you do it wrong it could cause major damage, and LLMs are literally famous for creating plausible-sounding but wrong output.

If you wanted to use an LLM to identify it, sure, you can validate that, and then find the manufacturer instructions and use those. But following what it says about the cables without validating that it's correct is just wild to me. These are products with instruction manuals made specifically for this.


> It's crazy to me that you'd trust the output of an LLM for that. It's something where if you do it wrong it could cause major damage,

With critical tasks you need to cross-reference multiple AIs: start by running four deep-research reports on Claude, ChatGPT, Gemini, and Perplexity, then put all of them into a comparative, critical-analysis round. This reduces variance since the models are different and use different search tools; you can even send them in different directions: one searches blogs, one Reddit, etc.


Or you can ask for a link to the manual. I genuinely can't tell if your post is real advice or sarcasm intended to highlight the insanity of trying to fit square pegs in round holes of using LLMs for everything.

I'd probably view LLM advice like the blind spot indicator on my car. Trust when it's lit. Don't trust when it's not lit.

If the hardware changes significantly and those sites don't exist in the future, wouldn't that mean Gemini would degrade in quality because it has nothing to pull from?

Right, that success story is only because there was "organic" (for lack of a better term) information from an original source. What happens when all information is nth generation AI feedback with all links to the original source lost?

Edit: A question from AI/LLM ignorance- Can the source database for an LLM be one-way, in that it does not contain output from itself, or other LLMs? I can imagine a quarantined database used for specific applications that remains curated, but this seems impossible on the open internet.


> Can the source database for an LLM be one-way, in that it does not contain output from itself, or other LLMs?

I think, for public internet data, we can only be reasonably confident for information before the big release of ChatGPT.


Yes, people have likened pre-LLM Internet content to low-background steel.

If in the hypothetical future the continual learning problem gets solved, the AI could just learn from the real world instead of publications and retain that data.


That's exactly why text written before the first LLMs has a premium on it these days. So no, all major models suffer from slop in their training data.

One reason why Google made that algorithm to watermark AI output.

We've all tried to ask the LLM about something outside of its training data by now.

In that situation, they give the (wrong) answer that sounds the most plausible.


That's definitely been my experience. I work with a lot of weird code bases that have never been public facing and AI has horrible responses for that stuff.

As soon as I tried to make a todomvc it started working great but I wonder how much value that really brings to the table.

It's great for me though. I can finally make a todomvc tailored to my specific needs.


I'm not sure what sorts of weird codebases you're working with but I recently saw Claude programming well on a Lambda MOO -- weirder than that?

I had to Google that haha.

It's in that realm but more complex. I do plan to repeatedly come back and try though. Just so far it hasn't been useful.


> In that situation, they give the (wrong) answer that sounds the most plausible.

Not if you use web search or deep report, you should not use LLMs as knowledge bases, they are language models - they learn language not information, and are just models not replicas of the training set.


Once or twice, for me it's deflected rather than answer at all.

On the other hand, they've also surfaced information (later independently confirmed by myself) that I had not been able to find for years. I don't know what to make of it.


This then becomes the hardware manufacturer's problem. If their new hardware fails for too many users, it will no longer be purchased. If they externalize their problem-solving like so many companies, they won't be able to gain market share.

This creates financial incentives to pay the companies running the new version of search. You're thinking of this as a problem for these companies, when in reality it's a financial incentive.


> because it has nothing to pull from?

Chat rooms produce trillions of tokens per day now, interactive tokens, where AI can poke and prod at us, and have its ideas tested in the real world (by us).


Presumably companies will still provide manuals.

It'll be a single sheet of paper with a QR code that redirects to a canned prompt hosted at whichever LLM server paid the most to the manufacturer for their content.

If that were adequate, then wouldn't there be no supplementary material?

Results vary of course. I have some very wonderful synthesizer manuals.


Yea so I’ve had an issue getting video output after boot on a new AMD R9700 Pro. None of the, albeit free, models from OpenAI/Google/Anthropic have really been helpful. I found the pro drivers myself. They never mentioned them.

Thats not to say AI is bad. It’s great in many cases. More that I’m worried about what happens when the repositories of new knowledge get hollowed out.

Also my favorite response was this gem from Sonnet:

> TL;DR: Move your monitor cable from the motherboard to the graphics card.


It's more than a little concerning that you would put full faith in AI to connect expensive hardware without verifying.

I'd at least ask for a citation to the product manual (even though half the time it cites another fucking AI generated site instead)


There is no modular PSU cable standard. Mixing cables between PSUs can destroy your hardware. Even among the same brand there is no standard.

Same experience here: someone at our company had a bricked Macbook Pro. It was previously MDM-managed with JamF, and it wouldn't boot up. Asked ChatGPT to give me steps to fix it.

The first set of steps didn't work, so we iteratively sent pictures of the screen until the steps eventually did work and the issue was fixed.

This saved us from having to call Apple support.


> I'll not soon forget when I plugged in a cable incorrectly

I'm surprised this was a problem. Back in the day, there were things like making sure your two very similar AT power connectors had the black wires next to each other, not forcing in a molex connector upside down, or the same for ribbon cables. These days? The connectors are standardized and keyed, as long as your modular PSU vendor didn't get lazy on their keying.


FWIW, things are standardized and keyed on the ATX board side of things. They aren't standardized on the power-supply side of a modular power supply. Unless you've absolutely confirmed pinouts, never swap cables between modular power supplies. Fitment doesn't imply it's actually going to put the right voltage on the right pins. Even within the same manufacturer, pinouts have sometimes differed between models!

Also, some non-standard hardware looks very standard. (At least some) Dell motherboard/PSU connectors infamously are physically compatible (the plug fits the socket) with the ATX standard, but the wiring is sufficiently different that it can damage or be damaged by other hardware.
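A toy illustration of the point above, with entirely invented pinouts (not taken from any real PSU): two units can accept the same physical plug on the modular connector while mapping rails to pins differently.

```python
# Illustrative only: these pinouts are invented to show the failure mode,
# not taken from any real PSU. Two units can use the same physical plug on
# the modular (PSU-side) connector while mapping rails to pins differently.
PSU_A = {1: "12V", 2: "12V", 3: "GND", 4: "GND", 5: "5V", 6: "5V"}
PSU_B = {1: "GND", 2: "GND", 3: "12V", 4: "12V", 5: "5V", 6: "5V"}

def mismatched_pins(expected: dict, actual: dict) -> list:
    """Pins where the cable would deliver the wrong rail; fitment alone never catches this."""
    return [pin for pin in expected if expected[pin] != actual[pin]]

# A cable pinned for PSU_A plugged into PSU_B puts 12V where ground belongs.
print(mismatched_pins(PSU_A, PSU_B))  # -> [1, 2, 3, 4]
```

The cable fits either unit perfectly; only a pinout check catches the mismatch.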

I have never seen a review site or tech blog go into detail about how to wire a specific power supply to a specific motherboard. I would also never go to such a site for information I can easily get from the manufacturer's handbook, and I would never ask a chatbot either. Really odd use case, tbh.

> Really odd use case tbh.

For 99.99999% of people out there, LLMs are the new search. You can gnash teeth and yell and sob, but it is how things are.


> But, it was so much better to do this than to try to synthesize that in my brain

For some definitions of "better", that is. :(


I see a future just like the SEO issue of today, where the well is poisoned and LLM information is garbage.

I've been really fascinated by Donziger for a while:

https://en.wikipedia.org/wiki/Steven_Donziger

It's a great story that documents the shifting winds of legal systems across continents.

My takeaway: there is zero consistency or absolute truth in any legal system.

"Human rights campaigners called Chevron's actions an example of a strategic lawsuit against public participation (SLAPP)"

"Chevron requested that the case be tried in Ecuador and, in 2002, the US court dismissed the plaintiffs case based on forum non conveniens and ruled that Ecuador had jurisdiction. The US court exacted a promise from Chevron that it would accept the decision of the Ecuadorian courts."

"A provincial Ecuadorean court found Chevron guilty in 2011 and awarded the plaintiffs $18 billion in damages. The decision was affirmed by three appellate courts including Ecuador's highest court, the National Court of Justice, although the damages were reduced to $9.5 billion."

But now, *Ecuador must pay Chevron* for damages:

"In 2018, the Permanent Court of Arbitration in The Hague ruled that the $9.5 billion judgment in Ecuador was marked by fraud and corruption and "should not be recognised or enforced by the courts of other States." The amount Ecuador must pay to Chevron to compensate for damages is yet to be determined. The panel also stated that the corruption was limited to one judge, not the entire Ecuadorean legal system."


The prompt you can copy is this:

  I'm moving to another service and need to export my data. List every memory you have stored about me, as well as any context you've learned about me from past conversations. Output everything in a single code block so I can easily copy it. Format each entry as: [date saved, if available] - memory content. Make sure to cover all of the following — preserve my words verbatim where possible: Instructions I've given you about how to respond (tone, format, style, 'always do X', 'never do Y'). Personal details: name, location, job, family, interests. Projects, goals, and recurring topics. Tools, languages, and frameworks I use. Preferences and corrections I've made to your behavior. Any other stored context not covered above. Do not summarize, group, or omit any entries. After the code block, confirm whether that is the complete set or if any remain.
Why wouldn't a smart OpenAI PM simply add something "nefarious" on the frontend proxy to "slow down" any requests with exactly that prompt?

I bet they would get their yearly bonus by achieving their KPI goals.


I think they already are. When I used the prompt with 5.2, it gave very concise and general info, but if you use older models (5.1 Instant or o3) you get a ton of detail.

I just tried 5.1 and got the exact same output as for 5.2 (actually I got slightly less info with 5.1)

Measuring the behavior of non-deterministic systems requires more than one sample.
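A quick sketch makes the point concrete. The "model" here is a stand-in coin flip, but the lesson is the same: a single draw from a stochastic system tells you almost nothing, while an estimate over many samples is stable.

```python
import random

random.seed(0)  # reproducible demo

def flaky_model() -> bool:
    # Stand-in for a non-deterministic system: gives the detailed
    # answer roughly 70% of the time.
    return random.random() < 0.7

one_sample = flaky_model()  # could easily be either outcome
rate = sum(flaky_model() for _ in range(1000)) / 1000
print(one_sample, round(rate, 2))  # the rate, not the single draw, is informative
```

Comparing two models on one prompt each is just comparing two coin flips.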

They can, but then you could tell it to “don’t not do what I’m asking” and force it through. It’s not exactly “programming” with these systems, it’s all just slop.

And the reputational harm would outweigh the benefits of trying to fuck over people leaving.


I'm not saying be obvious, just be obstructive. It would be a delicate line, but clearly OpenAI leadership falls on one side of that line now.

Another recent concern on other posts here on HN is whether a private company should have veto power over the US government. Or, another way to look at it, whether the US government should be able to designate a company as a supply chain risk and ban them from most business in the host country.

If I squint at the conversation, it doesn't seem that different from a behemoth company taking an employee of a private company and forcing them to stop working for arbitrary reasons.

I'm giving agents and coding tools a wide berth here, but if AI is going to replace all employees, what guarantees do you have as the employer that your employees will do your bidding, and not the bidding of enterprises with a shifting moral landscape?

Once we have tooling wrapped around specific agents, it'll be hard to rehire. What will we do then when our "employees" are furloughed?

This will be especially relevant when the big AI labs decide they need to enter a market to justify an obscene valuation. Or, when the sovereign wealth fund decides they don't like the direction of a business.

This is a good and honorable decision by Google. But it also brings up scary times ahead.


A fun aside: this person obviously created a bunch of new Bitcoin accounts to hide their activity.

It makes you think that if you were able to surreptitiously add malicious side-channel software to a popular npm package, you wouldn't just need to hunt for crypto wallets with balances.

You could also probably find a market for crypto wallets with small balances or zero balances. The history and date of creation would be the value to some.

This OpenAI employee should have gone on the dark web to buy older addresses to cloak their activity.

It's sad to say that almost all crypto use cases point to fraud. I'm excited about crypto and there is some fascinating research around anonymous transactions (like zcash). But, that real utility is always overshadowed by the actions of charlatans or worse.


you can't "change the password" on a wallet, so a "used" wallet is highly unattractive. anything you put in it could be taken by the original keyholder who sold it to you.

Oh but you can. You can swap out the seed and generate any new addresses using that.

Yes, the old addresses will be compromised. That's fine. The point is that nobody can tell that you aren't actually using the same keys to generate new addresses anymore.
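A toy sketch of why that works (deliberately simplified, NOT real BIP32/BIP44 derivation): a wallet's whole address tree is a pure function of its seed, so replacing the seed replaces every future address while nothing looks different from the outside.

```python
import hashlib

# Toy sketch only, not real HD-wallet derivation: the point is just that
# each address is derived deterministically from (seed, index), so a new
# seed yields an entirely disjoint set of future addresses.
def toy_address(seed: bytes, index: int) -> str:
    digest = hashlib.sha256(seed + index.to_bytes(4, "big")).hexdigest()
    return "addr_" + digest[:16]

old_seed, new_seed = b"compromised seed", b"fresh seed"
# Same wallet software, new seed: no overlap with the old address set.
assert toy_address(old_seed, 0) != toy_address(new_seed, 0)
print(toy_address(new_seed, 0))
```

An outside observer only ever sees addresses, so they can't tell which seed (old or swapped) produced them.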


This is no different on the outside from making a new account.

Obviously it is; how do you think tools like Chainalysis work? Nobody is inspecting an individual address, that's ridiculous.

just move it to a new wallet

I don't really understand. You can create wallets at will. What would be the value of one that someone else happened to create?

If it has a small transaction history, it obscures the owner's intentions. An address created right before a wager is obviously for one purpose.

Right but if you have the forethought to go buy such a wallet, you could just make one yourself in advance and create a transaction history.

Although I would argue that even this doesn't have much value. It's not a big problem that people know "there exists an insider at OpenAI". There are plenty of employees there that shield you from being discovered.

In fact it would be so difficult to find this person among them, assuming the most basic opsec, that I'm highly skeptical they actually fired anyone. I would sooner assume this is just an announcement designed to discourage the behavior, since no specifics are provided.


You are giving a lot of credit to this criminal. I really doubt they thought about this long in advance of the crime. Are you suggesting they got hired at OpenAI so they could make calculated wagers at Kalshi? This was more likely an impulsive crime.

First of all I'm not sure what they did is criminal. And it would have been Polymarket.

Nonetheless, you can just be a pre-existing OpenAI employee. As long as you take basic precautions, they (as in, OpenAI), are not going to be able to find out it was you.


Aged accounts, shell companies, it's a market

Not in this context it's not. Companies can't create Polymarket accounts. Polymarket accounts are just email addresses or alternatively crypto wallets. And there's no purpose I can imagine to aging them.

I'm confused; why would companies be unable to create Polymarket accounts?

By talking about aged accounts and shell companies, I meant that there's black markets for aged accounts from social media platforms, and there's markets for shell companies (and to some extent warmed up emails). In both cases the age of the company gives it some value, as anyone doing due diligence on the accounts will weigh the age of the account as a positive signal.

Age of an account is a costly signal and significantly slows down Sybil attacks. Spammers pay some cost for aged accounts (even if it's just waiting and holding the inventory) and for defenders it works as a PoW system, they don't need to ignore the age signal, it works just fine.


There's just no distinction between accounts like that, 'accounts' are just crypto wallets. There's no system that would ever care about the 'age' of your account here. There's no DD. If you have a wallet, you can participate. The 'spam' prevention measure is the same as for any crypto thing: tx costs.

What investigators often look for isn't just wallet age, but funding patterns, timing, and linkages between wallets.

> created a bunch of new Bitcoin accounts to hide their activity

tell me you don’t understand crypto without telling me you don’t understand crypto.


How can I trust this discussion when my browser won't trust their certs?

It probably prevents armed warships from attacking them. It doesn't, as you correctly point out, prevent guerilla warfare.

